---
license: apache-2.0
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers.js
library_name: transformers.js
pipeline_tag: feature-extraction
---

# Qwen3-Embedding-4B-ONNX

This is an ONNX conversion of [Qwen/Qwen3-Embedding-4B](https://huggingface.co/Qwen/Qwen3-Embedding-4B) for use with [Transformers.js](https://github.com/xenova/transformers.js) in the browser.

## Model Details

- **Model Type:** Text Embedding
- **Base Model:** Qwen3-Embedding-4B
- **Parameters:** 4B
- **Embedding Dimensions:** 2560
- **Context Length:** 32K
- **MTEB v2 Score:** 74.60
- **Languages:** 100+

## Usage (Transformers.js v3)

```javascript
import { pipeline } from "@huggingface/transformers";

// Create a feature extraction pipeline
const extractor = await pipeline(
  "feature-extraction",
  "dssjon/Qwen3-Embedding-4B-ONNX",
  {
    dtype: "fp32",
    device: "webgpu", // Use WebGPU for acceleration
  }
);

// Format query with instruction
const taskDescription =
  "Given a web search query, retrieve relevant passages that answer the query";
const query = `Instruct: ${taskDescription}\nQuery:What is the capital of China?`;

// Generate embedding
const output = await extractor(query, { pooling: "last_token", normalize: true });
console.log(output.data); // 2560-dimensional embedding
```

## Usage (Python - Original Model)

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-4B")

# For queries
query = "What is the capital of China?"
query_embedding = model.encode(query, prompt_name="query")

# For documents (no prompt needed)
document = "The capital of China is Beijing."
doc_embedding = model.encode(document)
```

## Conversion Details

- **ONNX Opset:** 14
- **Precision:** FP32
- **Optimization:** None (Qwen3 is not yet supported by the ONNX Runtime optimizer)
- **File Size:** ~15.3 GB

## Performance

Benchmark scores from MTEB v2:

| Task | Score |
|------|-------|
| Classification | 89.84 |
| Clustering | 57.51 |
| Pair Classification | 87.01 |
| Reranking | 50.76 |
| Retrieval | 68.46 |
| STS | 88.72 |
| Summarization | 34.39 |
| **Mean** | **74.60** |

## License

Apache 2.0 (same as the base model)

## Citation

```bibtex
@article{qwen3embedding2025,
  title={Qwen3 Embedding},
  author={Qwen Team},
  year={2025},
  url={https://huggingface.co/Qwen/Qwen3-Embedding-4B}
}
```

## Acknowledgments

- Base model: [Qwen/Qwen3-Embedding-4B](https://huggingface.co/Qwen/Qwen3-Embedding-4B)
- Conversion tool: [Optimum](https://github.com/huggingface/optimum)
- Browser runtime: [Transformers.js](https://github.com/xenova/transformers.js)
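
## Computing Similarity

The usage examples above request `normalize: true`, so the returned embeddings are unit-length and cosine similarity reduces to a plain dot product. A minimal sketch in plain JavaScript with no library dependencies (the toy 2-dimensional vectors below stand in for the real 2560-dimensional embeddings):

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// General cosine similarity; for normalized embeddings the
// denominator is 1, so dot(a, b) alone gives the same score.
function cosineSimilarity(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

console.log(cosineSimilarity([3, 4], [3, 4])); // 1 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
```

To rank documents for a query, compute `cosineSimilarity(queryEmbedding, docEmbedding)` for each document embedding and sort by score in descending order.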