VST Collection A comprehensive framework designed to cultivate VLMs with human-like visuospatial abilities. • 5 items • Updated Nov 12, 2025 • 6
Cosmos-Predict2 Collection ⚠️ This collection is archived. https://huggingface.co/collections/nvidia/cosmos-predict25 • 13 items • Updated 8 days ago • 33
Cosmos World Foundation Model Platform for Physical AI Paper • 2501.03575 • Published Jan 7, 2025 • 82
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 25 items • Updated 8 days ago • 112
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 123
PixMo Collection A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog • 10 items • Updated Dec 23, 2025 • 85
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 129
Theia Collection Distilling Diverse Vision Foundation Models for Robot Learning • 6 items • Updated Sep 30, 2024 • 9
Article Metric and Relative Monocular Depth Estimation: An Overview. Fine-Tuning Depth Anything V2 • Jul 10, 2024 • 92
3D-VLA: A 3D Vision-Language-Action Generative World Model Paper • 2403.09631 • Published Mar 14, 2024 • 12
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 12 items • Updated 8 days ago • 62
OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13, 2024 • 32
Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks Paper • 2408.03615 • Published Aug 7, 2024 • 31
Achieving Human Level Competitive Robot Table Tennis Paper • 2408.03906 • Published Aug 7, 2024 • 28
Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model Paper • 2312.13252 • Published Dec 20, 2023 • 27