Qwen/Qwen3-VL-235B-A22B-Thinking Image-Text-to-Text • 236B • Updated Nov 26, 2025 • 43.3k • 357
DINOv3 Web 🦖 Space • Running • Featured • 152 • Visualize rich, dense image features locally in your browser
The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding Paper • 2512.19693 • Published 17 days ago • 62
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published 21 days ago • 83
Fast and Accurate Causal Parallel Decoding using Jacobi Forcing Paper • 2512.14681 • Published 23 days ago • 39
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published 24 days ago • 73
sensenova/SenseNova-SI-1.1-Qwen2.5-VL-7B Image-Text-to-Text • 8B • Updated about 1 month ago • 1.31k • 3
sensenova/SenseNova-SI-1.1-Qwen2.5-VL-3B Image-Text-to-Text • 4B • Updated about 1 month ago • 1.28k • 2
sensenova/SenseNova-SI-1.2-InternVL3-8B Image-Text-to-Text • 8B • Updated 29 days ago • 3.49k • 10
sensenova/SenseNova-SI-1.1-Qwen3-VL-8B Image-Text-to-Text • 9B • Updated about 1 month ago • 1.41k • 5