Holo1.5 Collection Holo1.5 - Open Foundation Models for Computer Use Agents • 5 items • Updated Sep 15 • 33
Holo1 Collection Vision-Language Action Model for use in Surfer-H web navigation agent • 6 items • Updated Jun 10 • 48
Article Gotchas in Tokenizer Behavior Every Developer Should Know By qgallouedec • Apr 18 • 44
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 297
Article ViDoRe Benchmark V2: Raising the Bar for Visual Retrieval By manu and 2 others • Mar 18 • 12
Article Fine-tune Llama 3.1 Ultra-Efficiently with Unsloth By mlabonne • Jul 29, 2024 • 364
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 152
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 243
Article π0 and π0-FAST: Vision-Language-Action Models for General Robot Control Feb 4 • 179