Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data Jun 3 • 284
Running on Zero Featured Dia 1.6B • 1.71k Generate realistic dialogue from a script, using Dia!
microsoft/Phi-3-vision-128k-instruct Text Generation • 4B • Updated Aug 20, 2024 • 24.8k • 969