TheMCPCompany: Creating General-purpose Agents with Task-specific Tools Paper • 2510.19286 • Published 6 days ago • 5
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Paper • 2510.15870 • Published 11 days ago • 79
RDT 2 Collection RDT 2, the sequel to RDT-1B, is the first foundation model that achieves zero-shot deployment on unseen embodiments for simple open-vocabulary tasks. • 4 items • Updated Sep 26 • 15
V-JEPA 2 Collection A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 166
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics Paper • 2506.01844 • Published Jun 2 • 140
Article SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data Jun 3 • 268