---
license: apache-2.0
tags:
- sentence-transformers
- feature-extraction
- sentence-similarity
- transformers.js
library_name: transformers.js
pipeline_tag: feature-extraction
---

# Qwen3-Embedding-4B-ONNX

This is an ONNX conversion of [Qwen/Qwen3-Embedding-4B](https://huggingface.co/Qwen/Qwen3-Embedding-4B) for use with [Transformers.js](https://github.com/xenova/transformers.js) in the browser.

## Model Details

- **Model Type:** Text Embedding
- **Base Model:** Qwen3-Embedding-4B
- **Parameters:** 4B
- **Embedding Dimensions:** 2560
- **Context Length:** 32K
- **MTEB v2 Score:** 74.60
- **Languages:** 100+

## Usage (Transformers.js v3)

```javascript
import { pipeline } from "@huggingface/transformers";

// Create a feature extraction pipeline
const extractor = await pipeline(
  "feature-extraction",
  "dssjon/Qwen3-Embedding-4B-ONNX",
  {
    dtype: "fp32",
    device: "webgpu", // Use WebGPU for acceleration
  }
);

// Format query with instruction
const taskDescription =
  "Given a web search query, retrieve relevant passages that answer the query";
const query = `Instruct: ${taskDescription}\nQuery:What is the capital of China?`;

// Generate embedding
const output = await extractor(query, { pooling: "last_token", normalize: true });
console.log(output.data); // 2560-dimensional embedding
```

## Usage (Python - Original Model)

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-4B")

# For queries
query = "What is the capital of China?"
query_embedding = model.encode(query, prompt_name="query")

# For documents (no prompt needed)
document = "The capital of China is Beijing."
doc_embedding = model.encode(document)
```

## Conversion Details

- **ONNX Opset:** 14
- **Precision:** FP32
- **Optimization:** None (Qwen3 is not yet supported by the ONNX Runtime optimizer)
- **File Size:** ~15.3 GB

## Performance

Benchmark scores from MTEB v2:

| Task | Score |
|------|-------|
| Classification | 89.84 |
| Clustering | 57.51 |
| Pair Classification | 87.01 |
| Reranking | 50.76 |
| Retrieval | 68.46 |
| STS | 88.72 |
| Summarization | 34.39 |
| **Mean** | **74.60** |

## License

Apache 2.0 (same as the base model)

## Citation

```bibtex
@article{qwen3embedding2025,
  title={Qwen3 Embedding},
  author={Qwen Team},
  year={2025},
  url={https://huggingface.co/Qwen/Qwen3-Embedding-4B}
}
```

## Acknowledgments

- Base model: [Qwen/Qwen3-Embedding-4B](https://huggingface.co/Qwen/Qwen3-Embedding-4B)
- Conversion tool: [Optimum](https://github.com/huggingface/optimum)
- Browser runtime: [Transformers.js](https://github.com/xenova/transformers.js)
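
## Computing Similarity

The usage examples above request `normalize: true`, so the returned embeddings are unit-length and cosine similarity reduces to a plain dot product. A minimal sketch in plain JavaScript with no library dependencies (the toy 2-dimensional vectors below stand in for the real 2560-dimensional embeddings):

```javascript
// Dot product of two equal-length vectors.
function dot(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// General cosine similarity; for normalized embeddings the
// denominator is 1, so dot(a, b) alone gives the same score.
function cosineSimilarity(a, b) {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

console.log(cosineSimilarity([3, 4], [3, 4])); // 1 (same direction)
console.log(cosineSimilarity([1, 0], [0, 1])); // 0 (orthogonal)
```

To rank documents for a query, compute `cosineSimilarity(queryEmbedding, docEmbedding)` for each document embedding and sort by score in descending order.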