---
license: cc-by-nc-sa-4.0
base_model:
- Qwen/Qwen2.5-VL-3B-Instruct
tags:
- robotics
- vision-language-action-model
- vision-language-model
library_name: transformers
---

# Model Card for InternVLA-M1

## Description

**InternVLA-M1** is an open-source, end-to-end **vision–language–action (VLA) framework** for building and researching generalist robot policies. The checkpoints in this repository were pretrained on the system2 dataset.

- 🌐 Homepage: [InternVLA-M1 Project Page](https://internrobotics.github.io/internvla-m1.github.io/)
- 💻 Codebase: [InternVLA-M1 GitHub Repo](https://github.com/InternRobotics/InternVLA-M1)

![image/png](https://github.com/InternRobotics/InternVLA-M1/raw/InternVLA-M1/assets/teaser.png)

## Citation

```
@misc{internvla2024,
  title  = {InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy},
  author = {InternVLA-M1 Contributors},
  year = {2025},
  booktitle={arXiv},
}
```