---
base_model:
- Qwen/Qwen3-VL-2B-Thinking
---

# Qwen3-VL-2B-Thinking

Currently, only [NexaSDK](https://github.com/NexaAI/nexa-sdk) supports this GGUF.

## Quickstart

- Download [NexaSDK](https://github.com/NexaAI/nexa-sdk) with one click.
- Run one line of code in your terminal:

```
nexa infer NexaAI/Qwen3-VL-2B-Thinking-GGUF
```
## Model Description

**Qwen3-VL-2B-Thinking** is a 2-billion-parameter multimodal model from the Qwen3-VL family, optimized for **explicit reasoning and step-by-step visual understanding**.
It builds upon Qwen3-VL-2B with additional "thinking" supervision, allowing the model to **explain its reasoning process** across both text and images, which makes it well suited for research, education, and agentic applications that require transparent decision traces.
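If you want to run the original (non-GGUF) checkpoint and inspect the reasoning behavior directly, a minimal sketch using the Hugging Face `transformers` API is shown below. It assumes a `transformers` release with Qwen3-VL support; the image URL and prompt are placeholders.

```python
# Minimal sketch, assuming your installed transformers version supports Qwen3-VL.
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "Qwen/Qwen3-VL-2B-Thinking"  # upstream (non-GGUF) checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

# One user turn containing an image and a question (the URL is a placeholder).
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},
            {"type": "text", "text": "What trend does this chart show? Explain step by step."},
        ],
    }
]

# Render the chat template, tokenize text and image together, and generate.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens; the reply includes the model's
# thinking trace before the final answer.
print(processor.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```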
## Features

- **Visual reasoning**: Performs detailed, interpretable reasoning across images, diagrams, and UI elements.
- **Step-by-step thought traces**: Generates intermediate reasoning steps for transparency and debugging (see the sketch after this list).
- **Multimodal understanding**: Supports text, images, and video inputs with consistent logical grounding.
- **Compact yet capable**: 2B parameters, optimized for low-latency inference and on-device deployment.
- **Instruction-tuned**: Enhanced alignment for "think-aloud" question answering and visual problem solving.
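Qwen3-style thinking models usually wrap their intermediate reasoning in `<think>...</think>` markers ahead of the final answer; whether this GGUF build surfaces the same markers depends on its chat template, so treat the delimiter below as an assumption. A small helper can then separate the trace from the answer for logging or debugging:

```python
# Sketch: split a "thinking" reply into (trace, answer).
# Assumption: reasoning is wrapped in <think>...</think>, as in other Qwen3 thinking models.
def split_thinking(reply: str) -> tuple[str, str]:
    start, end = "<think>", "</think>"
    if start in reply and end in reply:
        trace = reply.split(start, 1)[1].split(end, 1)[0].strip()
        answer = reply.split(end, 1)[1].strip()
        return trace, answer
    return "", reply.strip()  # no markers found: treat the whole reply as the answer

trace, answer = split_thinking(
    "<think>The x-axis is years and the bars grow left to right.</think>Sales trend upward."
)
print("TRACE:", trace)
print("ANSWER:", answer)
```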
## Use Cases

- Visual question answering with reasoning chains
- Step-by-step image or chart analysis for education and tutoring
- Debuggable AI agents and reasoning assistants
- Research on interpretable multimodal reasoning
- On-device transparent AI inference for visual domains
## Inputs and Outputs

**Inputs**
- Text prompts or questions
- Images, diagrams, or UI screenshots
- Optional multi-turn reasoning chains (see the sketch below)

**Outputs**
- Natural language answers with explicit thought steps
- Detailed reasoning traces combining visual and textual logic
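For multi-turn use, a reasoning chain is passed as an ordered list of chat messages. The sketch below uses the common OpenAI-style message layout as an illustration; the exact schema is an assumption and depends on the runtime serving the model.

```python
# Sketch of a multi-turn, multimodal conversation. The message keys follow the
# common chat-message convention and are an assumption, not a NexaSDK-specific schema.
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/ui-screenshot.png"},  # placeholder
            {"type": "text", "text": "Which button submits the form?"},
        ],
    },
    {"role": "assistant", "content": "The blue 'Save' button in the footer submits the form."},
    # Follow-up turns can ask the model to expose its reasoning about earlier answers.
    {"role": "user", "content": "Walk me through how you ruled out the other buttons."},
]
```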
## License

This model is released under the **Apache 2.0 License**.
Refer to the official Hugging Face page for license details and usage terms.
## References

- [Qwen3-VL-2B-Thinking on Hugging Face](https://huggingface.co/Qwen/Qwen3-VL-2B-Thinking)
- [Qwen3 Technical Report (arXiv)](https://arxiv.org/abs/2505.09388)
- [Qwen GitHub Repository](https://github.com/QwenLM)