Ban
#3
by
YARGI
- opened
- .gitattributes +0 -1
- Finetune_SmolVLA_notebook.ipynb +0 -214
- README.md +17 -31
- collage_small.gif +0 -3
- config.json +11 -21
- model.safetensors +2 -2
- policy_postprocessor.json +0 -32
- policy_postprocessor_step_0_unnormalizer_processor.safetensors +0 -3
- policy_preprocessor.json +0 -87
- policy_preprocessor_step_5_normalizer_processor.safetensors +0 -3
- train_config.json +196 -0
.gitattributes
CHANGED
@@ -33,4 +33,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
-collage_small.gif filter=lfs diff=lfs merge=lfs -text
Finetune_SmolVLA_notebook.ipynb
DELETED
@@ -1,214 +0,0 @@
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NQUk3Y0WwYZ4"
      },
      "source": [
        "# 🤗 x 🦾: Training SmolVLA with LeRobot Notebook\n",
        "\n",
        "Welcome to the **LeRobot SmolVLA training notebook**! This notebook provides a ready-to-run setup for training imitation learning policies using the [🤗 LeRobot](https://github.com/huggingface/lerobot) library.\n",
        "\n",
        "In this example, we train an `SmolVLA` policy using a dataset hosted on the [Hugging Face Hub](https://huggingface.co/), and optionally track training metrics with [Weights & Biases (wandb)](https://wandb.ai/).\n",
        "\n",
        "## ⚙️ Requirements\n",
        "- A Hugging Face dataset repo ID containing your training data (`--dataset.repo_id=YOUR_USERNAME/YOUR_DATASET`)\n",
        "- Optional: A [wandb](https://wandb.ai/) account if you want to enable training visualization\n",
        "- Recommended: GPU runtime (e.g., NVIDIA A100) for faster training\n",
        "\n",
        "## ⏱️ Expected Training Time\n",
        "Training with the `SmolVLA` policy for 20,000 steps typically takes **about 5 hours on an NVIDIA A100** GPU. On less powerful GPUs or CPUs, training may take significantly longer!\n",
        "\n",
        "## Example Output\n",
        "Model checkpoints, logs, and training plots will be saved to the specified `--output_dir`. If `wandb` is enabled, progress will also be visualized in your wandb project dashboard.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MOJyX0CnwA5m"
      },
      "source": [
        "## Install conda\n",
        "This cell uses `condacolab` to bootstrap a full Conda environment inside Google Colab.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "QlKjL1X5t_zM"
      },
      "outputs": [],
      "source": [
        "!pip install -q condacolab\n",
        "import condacolab\n",
        "condacolab.install()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DxCc3CARwUjN"
      },
      "source": [
        "## Install LeRobot\n",
        "This cell clones the `lerobot` repository from Hugging Face, installs FFmpeg (version 7.1.1), and installs the package in editable mode.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "dgLu7QT5tUik"
      },
      "outputs": [],
      "source": [
        "!git clone https://github.com/huggingface/lerobot.git\n",
        "!conda install ffmpeg=7.1.1 -c conda-forge\n",
        "!cd lerobot && pip install -e ."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Q8Sn2wG4wldo"
      },
      "source": [
        "## Weights & Biases login\n",
        "This cell logs you into Weights & Biases (wandb) to enable experiment tracking and logging."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "PolVM_movEvp"
      },
      "outputs": [],
      "source": [
        "!wandb login"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zTWQAgX9xseE"
      },
      "source": [
        "## Install SmolVLA dependencies"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "DiHs0BKwxseE"
      },
      "outputs": [],
      "source": [
        "!cd lerobot && pip install -e \".[smolvla]\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IkzTo4mNwxaC"
      },
      "source": [
        "## Start training SmolVLA with LeRobot\n",
        "\n",
        "This cell runs the `train.py` script from the `lerobot` library to train a robot control policy. \n",
        "\n",
        "Make sure to adjust the following arguments to your setup:\n",
        "\n",
        "1. `--dataset.repo_id=YOUR_HF_USERNAME/YOUR_DATASET`: \n",
        "   Replace this with the Hugging Face Hub repo ID where your dataset is stored, e.g., `pepijn223/il_gym0`.\n",
        "\n",
        "2. `--batch_size=64`: means the model processes 64 training samples in parallel before doing one gradient update. Reduce this number if you have a GPU with low memory.\n",
        "\n",
        "3. `--output_dir=outputs/train/...`: \n",
        "   Directory where training logs and model checkpoints will be saved.\n",
        "\n",
        "4. `--job_name=...`: \n",
        "   A name for this training job, used for logging and Weights & Biases.\n",
        "\n",
        "5. `--policy.device=cuda`: \n",
        "   Use `cuda` if training on an NVIDIA GPU. Use `mps` for Apple Silicon, or `cpu` if no GPU is available.\n",
        "\n",
        "6. `--wandb.enable=true`: \n",
        "   Enables Weights & Biases for visualizing training progress. You must be logged in via `wandb login` before running this."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ZO52lcQtxseE"
      },
      "outputs": [],
      "source": [
        "!cd lerobot && python lerobot/scripts/train.py \\\n",
        "  --policy.path=lerobot/smolvla_base \\\n",
        "  --dataset.repo_id=${HF_USER}/mydataset \\\n",
        "  --batch_size=64 \\\n",
        "  --steps=20000 \\\n",
        "  --output_dir=outputs/train/my_smolvla \\\n",
        "  --job_name=my_smolvla_training \\\n",
        "  --policy.device=cuda \\\n",
        "  --wandb.enable=true"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2PBu7izpxseF"
      },
      "source": [
        "## Login into Hugging Face Hub\n",
        "Now after training is done login into the Hugging Face hub and upload the last checkpoint"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "8yu5khQGIHi6"
      },
      "outputs": [],
      "source": [
        "!huggingface-cli login"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zFMLGuVkH7UN"
      },
      "outputs": [],
      "source": [
        "!huggingface-cli upload ${HF_USER}/my_smolvla \\\n",
        "  /content/lerobot/outputs/train/my_smolvla/checkpoints/last/pretrained_model"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "gpuType": "A100",
      "machine_shape": "hm",
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
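The deleted notebook's train and upload cells reference a `${HF_USER}` shell variable that no cell defines. A minimal sketch of how it could be resolved from the logged-in account (an illustration, not part of the notebook; assumes `huggingface_hub` is installed and `huggingface-cli login` has already been run):

```python
# Hypothetical helper cell: resolve the `${HF_USER}` placeholder used by the
# train and upload commands from the currently logged-in Hugging Face account.
from huggingface_hub import whoami

hf_user = whoami()["name"]  # requires a prior `huggingface-cli login`
print(f"--dataset.repo_id={hf_user}/mydataset")
print(f"huggingface-cli upload {hf_user}/my_smolvla ...")
```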
README.md
CHANGED
@@ -1,33 +1,21 @@
 ---
 pipeline_tag: robotics
 tags:
--
+- lerobot
 library_name: lerobot
-datasets:
-- lerobot/svla_so101_pickplace
 ---
 
-
+SmolVLA: A vision-language-action model for affordable and efficient robotics
 
-
+[Paper](https://huggingface.co/papers/2506.01844)
 
-[
-
-[SmolVLA Blogpost](https://huggingface.co/blog/smolvla)
-
-[Code](https://github.com/huggingface/lerobot/blob/main/lerobot/common/policies/smolvla/modeling_smolvla.py)
-
-[Train using Google Colab Notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/lerobot/training-smolvla.ipynb#scrollTo=ZO52lcQtxseE)
-
-[SmolVLA HF Documentation](https://huggingface.co/docs/lerobot/smolvla)
+[Code](https://github.com/huggingface/lerobot)
 
 Designed by Hugging Face.
 
 This model has 450M parameters in total.
 You can use inside the [LeRobot library](https://github.com/huggingface/lerobot).
 
-Before proceeding to the next steps, you need to properly install the environment by following [Installation Guide](https://huggingface.co/docs/lerobot/installation) on the docs.
-
 Install smolvla extra dependencies:
 ```bash
 pip install -e ".[smolvla]"
@@ -36,25 +24,23 @@ pip install -e ".[smolvla]"
 Example of finetuning the smolvla pretrained model (`smolvla_base`):
 ```bash
 python lerobot/scripts/train.py \
-
-
-
-
-  --output_dir=outputs/train/my_smolvla \
-  --job_name=my_smolvla_training \
-  --policy.device=cuda \
-  --wandb.enable=true
+  --policy.path=lerobot/smolvla_base \
+  --dataset.repo_id=danaaubakirova/svla_so100_task1_v3 \
+  --batch_size=64 \
+  --steps=200000
 ```
 
 Example of finetuning the smolvla neural network with pretrained VLM and action expert
 intialized from scratch:
 ```bash
 python lerobot/scripts/train.py \
-
-
-
-
-
-
-
+  --policy.type=smolvla \
+  --dataset.repo_id=danaaubakirova/svla_so100_task1_v3 \
+  --batch_size=64 \
+  --steps=200000
+```
+
+Example of using the smolvla pretrained model outside LeRobot training framework:
+```python
+policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")
 ```
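The new README ends with `SmolVLAPolicy.from_pretrained(...)` but does not show a full call. A minimal inference sketch, assuming the input keys and shapes from this repo's config.json (`observation.state` of dim 6, three 3x256x256 cameras), that the policy exposes `select_action` like other LeRobot policies, and the import path implied by the repository layout linked above:

```python
# Sketch only: dummy observations shaped like the checkpoint's input_features.
import torch
from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")
policy.eval()

batch = {
    "observation.state": torch.zeros(1, 6),
    "observation.image": torch.zeros(1, 3, 256, 256),
    "observation.image2": torch.zeros(1, 3, 256, 256),
    "observation.image3": torch.zeros(1, 3, 256, 256),
    "task": ["pick the object and place it in the box"],
}
with torch.no_grad():
    action = policy.select_action(batch)  # expected shape: (1, 6)
print(action.shape)
```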
collage_small.gif
DELETED
Git LFS Details
config.json
CHANGED
@@ -1,6 +1,11 @@
 {
   "type": "smolvla",
   "n_obs_steps": 1,
+  "normalization_mapping": {
+    "VISUAL": "IDENTITY",
+    "STATE": "MEAN_STD",
+    "ACTION": "MEAN_STD"
+  },
   "input_features": {
     "observation.state": {
       "type": "STATE",
@@ -8,7 +13,7 @@
         6
       ]
     },
-    "observation.
+    "observation.image2": {
       "type": "VISUAL",
       "shape": [
         3,
@@ -16,7 +21,7 @@
         256
       ]
     },
-    "observation.
+    "observation.image": {
       "type": "VISUAL",
       "shape": [
         3,
@@ -24,7 +29,7 @@
         256
       ]
     },
-    "observation.
+    "observation.image3": {
       "type": "VISUAL",
       "shape": [
         3,
@@ -41,20 +46,8 @@
       ]
     }
   },
-  "device": "cuda",
-  "use_amp": false,
-  "push_to_hub": true,
-  "repo_id": null,
-  "private": null,
-  "tags": null,
-  "license": null,
   "chunk_size": 50,
-  "n_action_steps":
-  "normalization_mapping": {
-    "VISUAL": "IDENTITY",
-    "STATE": "MEAN_STD",
-    "ACTION": "MEAN_STD"
-  },
+  "n_action_steps": 1,
   "max_state_dim": 32,
   "max_action_dim": 32,
   "resize_imgs_with_padding": [
@@ -83,14 +76,11 @@
   "scheduler_decay_lr": 2.5e-06,
   "vlm_model_name": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct",
   "load_vlm_weights": true,
-  "add_image_special_tokens": false,
   "attention_mode": "cross_attn",
   "prefix_length": 0,
   "pad_language_to": "max_length",
   "num_expert_layers": 0,
   "num_vlm_layers": 16,
   "self_attn_every_n_layers": 2,
-  "expert_width_multiplier": 0.75
-
-  "max_period": 4.0
-}
+  "expert_width_multiplier": 0.75
+}
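To see how the updated config describes the policy's inputs and outputs, a small sketch that reads the file with the standard library (keys mirror the diff above):

```python
# Sketch: list the expected inputs, their shapes, and the normalization modes.
import json

with open("config.json") as f:
    cfg = json.load(f)

for name, feat in cfg["input_features"].items():
    print(name, feat["type"], feat["shape"])
print("normalization_mapping:", cfg["normalization_mapping"])
print("chunk_size:", cfg["chunk_size"], "n_action_steps:", cfg["n_action_steps"])
```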
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:8f8dc071d5b933e79edd2b73b8d6b5cca482ef0437c099ea3ec13ab978a38fc8
+size 906720008
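The LFS pointer's `oid` is the SHA-256 of the weight file, so the updated blob can be checked locally after `git lfs pull`; a small verification sketch:

```python
# Sketch: hash model.safetensors and compare against the pointer's oid above.
import hashlib

h = hashlib.sha256()
with open("model.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)
print(h.hexdigest() == "8f8dc071d5b933e79edd2b73b8d6b5cca482ef0437c099ea3ec13ab978a38fc8")
```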
policy_postprocessor.json
DELETED
@@ -1,32 +0,0 @@
{
  "name": "policy_postprocessor",
  "steps": [
    {
      "registry_name": "unnormalizer_processor",
      "config": {
        "eps": 1e-08,
        "features": {
          "action": {
            "type": "ACTION",
            "shape": [
              6
            ]
          }
        },
        "norm_map": {
          "VISUAL": "IDENTITY",
          "STATE": "MEAN_STD",
          "ACTION": "MEAN_STD"
        }
      },
      "state_file": "policy_postprocessor_step_0_unnormalizer_processor.safetensors"
    },
    {
      "registry_name": "device_processor",
      "config": {
        "device": "cpu",
        "float_dtype": null
      }
    }
  ]
}
policy_postprocessor_step_0_unnormalizer_processor.safetensors
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:490ab239d96e263687c0b2e386a0afbc235a2eceb9857c36ed32f2f162a3e7c8
size 640
policy_preprocessor.json
DELETED
@@ -1,87 +0,0 @@
{
  "name": "policy_preprocessor",
  "steps": [
    {
      "registry_name": "rename_observations_processor",
      "config": {
        "rename_map": {}
      }
    },
    {
      "registry_name": "to_batch_processor",
      "config": {}
    },
    {
      "registry_name": "smolvla_new_line_processor",
      "config": {}
    },
    {
      "registry_name": "tokenizer_processor",
      "config": {
        "max_length": 48,
        "task_key": "task",
        "padding_side": "right",
        "padding": "max_length",
        "truncation": true,
        "tokenizer_name": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct"
      }
    },
    {
      "registry_name": "device_processor",
      "config": {
        "device": "cuda",
        "float_dtype": null
      }
    },
    {
      "registry_name": "normalizer_processor",
      "config": {
        "eps": 1e-08,
        "features": {
          "observation.state": {
            "type": "STATE",
            "shape": [
              6
            ]
          },
          "observation.image2": {
            "type": "VISUAL",
            "shape": [
              3,
              256,
              256
            ]
          },
          "observation.image": {
            "type": "VISUAL",
            "shape": [
              3,
              256,
              256
            ]
          },
          "observation.image3": {
            "type": "VISUAL",
            "shape": [
              3,
              256,
              256
            ]
          },
          "action": {
            "type": "ACTION",
            "shape": [
              6
            ]
          }
        },
        "norm_map": {
          "VISUAL": "IDENTITY",
          "STATE": "MEAN_STD",
          "ACTION": "MEAN_STD"
        }
      },
      "state_file": "policy_preprocessor_step_5_normalizer_processor.safetensors"
    }
  ]
}
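The final `normalizer_processor` step applies MEAN_STD normalization to state and action features and leaves VISUAL features untouched (IDENTITY), with the statistics stored in the companion `.safetensors` state file. A sketch of that mapping and its inverse (what the postprocessor's `unnormalizer_processor` describes), using placeholder statistics:

```python
# Sketch of MEAN_STD normalization as described by the norm_map above.
import torch

eps = 1e-08
state_mean, state_std = torch.zeros(6), torch.ones(6)    # placeholders; real values live in the state file
action_mean, action_std = torch.zeros(6), torch.ones(6)  # placeholders

def normalize_state(x: torch.Tensor) -> torch.Tensor:
    return (x - state_mean) / (state_std + eps)

def unnormalize_action(a: torch.Tensor) -> torch.Tensor:
    # inverse of the mapping above, applied to the policy's raw action output
    return a * (action_std + eps) + action_mean
```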
policy_preprocessor_step_5_normalizer_processor.safetensors
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:490ab239d96e263687c0b2e386a0afbc235a2eceb9857c36ed32f2f162a3e7c8
size 640
train_config.json
ADDED
@@ -0,0 +1,196 @@
{
  "dataset": {
"repo_id": "satvikahuja/mixer_on_off_new_1,aergogo/so100_pick_place,andy309/so100_0314_fold_cloths,jchun/so100_pickplace_small_20250323_120056,astroyat/cube,Ofiroz91/so_100_cube2bowl,HappyPablo/dec3_data2,ZCM5115/so100_1210,francescocrivelli/orange_feeding,francescocrivelli/carrot_eating,0x00raghu/toffee_red,0x00raghu/toffee_red_2,0x00raghu/toffee_red_3__,0x00raghu/toffee_blue,0x00raghu/toffee_blue_2,0x00raghu/toffee_to_hand_1,0x00raghu/toffee_to_hand_2,liyitenga/so100_bi_hello,liyitenga/so100_bi_giveme5,ZCM5115/so100_2Arm3cameras_movebox,pranavsaroha/so100_carrot_1,pranavsaroha/so100_carrot_3,pranavsaroha/so100_carrot_4,maximilienroberti/so100_lego_red_box,pranavsaroha/so100_squishy,rabhishek100/so100_train_dataset,pranavsaroha/so100_squishy100,swarajgosavi/kikobot_pusht_real_v2,pandaRQ/pickmed,swarajgosavi/act_kikobot_pusht_real,pranavsaroha/so100_squishy2colors,pranavsaroha/so100_squishy2colors_1,Chojins/chess_game_001_white,jmrog/so100_sweet_pick,Chojins/chess_game_002_white,pranavsaroha/so100_squishy2colors_2_new,Chojins/chess_game_003_white,aractingi/pick_place_lego_cube,Chojins/chess_game_004_white,Chojins/chess_game_005_white,Chojins/chess_game_006_white,Chojins/chess_game_007_white,koenvanwijk/blue2,jlitch/so100multicam3,koenvanwijk/blue52,jlitch/so100multicam6,aractingi/pick_place_lego_cube_1,jlitch/so100multicam7,vladfatu/so100_ds,Chojins/chess_game_000_white,HITHY/so100-kiwi,HITHY/so100_peach1,HITHY/so100_redstrawberry,satvikahuja/orange_mixer_1,satvikahuja/mixer_on_off,satvikahuja/orange_pick_place_new1,satvikahuja/mixer_on_off_new,danmac1/real_real332,FeiYjf/Makalu_push,liyitenga/so100_pick_taffy1,chmadran/so100_dataset04,FeiYjf/Maklu_dataset,FeiYjf/new_Dataset,liyitenga/so100_pick_taffy2,satvikahuja/mixer_on_off_new_4,CSCSXX/pick_place_cube_1.17,liyitenga/so100_pick_taffy3,liyitenga/so100_pick_taffy4,yuz1wan/so100_pick_pink,yuz1wan/so100_pick_wahaha,yuz1wan/so100_pp_pink,yuz1wan/so100_pour_cup,liyitenga/so100_pick_taffy5,liyitenga/so100_pick_taffy6,yuz1wan/so100_button,yuz1wan/so100_pickplace,liyitenga/so100_pick_taffy7,FeiYjf/push_gg,FeiYjf/push_0094,swarajgosavi/act_kikobot_block_real,liyitenga/so100_pick_taffy8,phospho-ai/OrangeBrick3Cameras,vaishanthr/toy_pick_place,SeanLMH/so100_picknplace_v2,pepijn223/yellow_lego_in_box1,DimiSch/so100_50ep_2,DimiSch/so100_50ep_3,SeanLMH/so100_picknplace,nbaron99/so100_pick_and_place2,chmadran/so100_dataset08,vaishanthr/toy_pickplace_50ep,Beegbrain/pick_place_green_block_lr,Ityl/so100_recording1,vaishanthr/toy_pickplace,ad330/so100_box_pickPlace,Beegbrain/so100_put_cube_cup,aractingi/push_green_cube_hf,aractingi/push_green_cube_hf_cropped_resized,carpit680/giraffe_task,carpit680/giraffe_sock_demo_1,DimiSch/so100_terra_50_2,carpit680/giraffe_sock_demo_2,aractingi/push_cube_to_face_reward,aractingi/push_cube_to_face_reward_cropped_resized,aractingi/push_cube_reward_data,aractingi/push_cube_reward_data_cropped_resized,aractingi/push_cube_offline_data_cropped_resized,aractingi/push_cube_front_side_reward,aractingi/push_cube_front_side_reward_cropped_resized,aractingi/push_cube_front_side_reward_long,aractingi/push_cube_front_side_reward_long_cropped_resized,aractingi/push_cube_reward,aractingi/push_cube_reward_cropped_resized,aractingi/push_cube_square_reward_cropped_resized,aractingi/push_cube_square_reward_1,aractingi/push_cube_square_reward_1_cropped_resized,aractingi/push_cube_square_light_reward,aractingi/push_cube_square_light_offline_demo,aractingi/push_cube_square_light_offline_demo_cropped_resized,denghj/dataset_red_tape01,aractingi
/push_cube_square_offline_demo,aractingi/push_cube_square_offline_demo_cropped_resized,Beegbrain/stack_two_cubes,FeiYjf/Test_NNNN,LegrandFrederic/Orange-brick-lower-resolution,aractingi/pick_place_lego_cube_cropped_resized,aractingi/push_cube_overfit,aractingi/push_cube_overfit_cropped_resized,HITHY/so100_peach,zaringleb/so100_cube_2,andreasBihlmaier/dual_arm_transfer_2025_02_16,zaringleb/so100_cube_4_binary,1g0rrr/reward_pickplace1,1g0rrr/reward_pickplace1_cropped_resized,FeiYjf/Hold_Pieces,FeiYjf/Grab_Pieces,hegdearyandev/so100_eraser_cup_v1,jbraumann/so100_1902,liyitenga/so100_pick_taffy10,mikechambers/block_cup_5,zaringleb/so100_cube_5_linear,yuz1wan/so100_pickplace_0223_2,yuz1wan/so100_pickplace_0223_3,samsam0510/mj_data_temp,samsam0510/tape_insert_1,samsam0510/tape_insert_2,pengjunkun/so100_push_to_hole,Deason11/Random_Kitchen,1g0rrr/reward_dataset_name2,1g0rrr/reward_dataset_name2_cropped_resized,1g0rrr/offline_dataset_name2,1g0rrr/offline_dataset_name2_cropped_resized,aractingi/push_cube_simp_cropped_resized,danielkr452/so100_work6,Loki0929/so100_100,yuz1wan/so100_fold_0227_1,yuz1wan/so100_fold_0227_2,speedyyoshi/so100_grasp_pink_block,lirislab/stack_two_red_cubes,lirislab/red_cube_into_mug,lirislab/green_lego_block_into_mug,lirislab/green_lego_block_into_mug_easy,kevin510/lerobot-cat-toy-placement,NONHUMAN-RESEARCH/SOARM100_TASK_VENDA_BOX,wangjl1512/pour_water,airthebear/so100_GL,zijian2022/noticehuman1,zijian2022/noticehuman2,kantine/so100_kapla_tower6,zijian2022/noticehuman5,zijian2022/llm40,Ashton3/lerobot-aloha,zijian2022/noticehuman50,AaronNewman/screwdriver_task_batch1,AaronNewman/screwdriver_task_batch2,AaronNewman/screwdriver_task_batch3,zijian2022/noticehuman60,zijian2022/noticehuman70,Bartm3/tape_to_bin,liuhuanjim013/so100_th_1,Pi-robot/barbecue_flip,Pi-robot/barbecue_put,wangjl1512/doll,sshh11/so100_orange_50ep_1,sshh11/so100_orange_50ep_2,DorayakiLin/so100_pick_cube_in_box,Bartm3/tape_to_bin2,luke250305/play_dice_250311.1,andy309/so100_0311_1152,sihyun77/suho_so100,sihyun77/si_so100,shreyasgite/so100_base_left,sihyun77/suho_red,liuhuanjim013/so100_block,andy309/so100_0313_no_wrist_camera,zijian2022/l9,zijian2022/n1_2,DorayakiLin/so100_stack_cube,andy309/so100_0313_no_wrist_camera_with_two_arms_cloths,joaoocruz00/so100_makeitD1,zijian2022/l10_1,zijian2022/l10_5,sihyun77/suho_red2,sihyun77/suho_angel,sihyun77/sihyun_king,acrampette/third_arm_01,Winster/so100_cube,1g0rrr/sam_openpi03,thedevansh/mar16_1336,hkphoooey/throw_stuffie,doujiangwang/task1_10epi_100000step,sihyun77/sihyun_3_17_1,acrampette/third_arm_02,imsyed00/so100_yellowbowl_pickplace_1,kumarhans/so100_tape_task,sihyun77/sihyun_main,doujiangwang/task2_10epi_100000step,kantine/industrial_robothon_buttons_expert,kantine/industrial_robothon_buttons_anomaly,kantine/industrial_robothon_hatchAndProbe_expert,kantine/industrial_robothon_hatchAndProbe_anomaly,Odog16/so100_tea_towel_folding_v1,zijian2022/so100_318,zijian2022/so100_318_1,Congying1112/so100_place_blue_bottle_with_two_cameras,Congying1112/so100_place_blue_bottle_with_two_cameras2,Congying1112/so100_place_blue_bottle_with_single_camera,pietroom/first_task_short,kantine/industrial_screws_sorting_expert,kantine/industrial_screws_sorting_anomaly,pietroom/second_task,zijian2022/c0,doujiangwang/task4_10epi_100000step,Congying1112/so100_switch_with_onhand_camera,HYAIYN/so100_get_orange_10epi,doujiangwang/task5_10epi_100000step,1g0rrr/sam_openpi_cube_low10,1g0rrr/sam_openpi_cube_top10,1g0rrr/sam_openpi_wire10,1g0rrr/sam_openpi_solder1,1g0rrr/sam_openpi_solder2,wcode
/so100_put_pen_50,jchun/so100_pickplace_small_20250322_193929,bnarin/so100_tic_tac_toe_we_do_it_live,dc2ac/so100-t5,chmadran/so100_home_dataset,baladhurgesh97/so100_final_picking_3,bnarin/so100_tic_tac_toe_move_0_0,bnarin/so100_tic_tac_toe_move_1_0,bnarin/so100_tic_tac_toe_move_2_1,bnarin/so100_tic_tac_toe_move_4_0,zaringleb/so100_cube_6_2d,andlyu/so100_indoor_0,andlyu/so100_indoor_2,Winster/so100_sim,badwolf256/so100_twin_cam_duck,Congying1112/so100_simplepick_with_2_cameras_from_top,andlyu/so100_indoor_4,Zak-Y/so100_grap_dataset,kantine/domotic_pouringCoffee_expert,kantine/domotic_pouringCoffee_anomaly,lucasngoo/so100_strawberry_grape,kantine/domotic_makingCoffee_expert,kantine/domotic_makingCoffee_anomaly,ZGGZZG/so100_drop1,kantine/industrial_soldering_expert,kantine/industrial_soldering_anomaly,Yotofu/so100_sweeper_shoes,kantine/domotic_dishTidyUp_expert,kantine/domotic_dishTidyUp_anomaly,kantine/domotic_groceriesSorting_expert,kantine/domotic_groceriesSorting_anomaly,badwolf256/so100_twin_cam_duck_v2,kantine/domotic_vegetagblesAndFruitsSorting_expert,kantine/domotic_vegetagblesAndFruitsSorting_anomaly,kantine/domotic_setTheTable_expert,kantine/domotic_setTheTable_anomaly,therarelab/so100_pick_place,abhisb/so100_51_ep,andlyu/so100_indoor_val_0,allenchienxxx/so100Test,lizi178119985/so100_jia,badwolf256/so100_twin_cam_duck_v3,andrewcole712/so100_tape_bin_place,Gano007/so100_lolo,Zak-Y/so100_three_cameras_dataset,Gano007/so100_doliprane,XXRRSSRR/so100_v3_num_episodes_50,zijian2022/assemblyarm2,ganker5/so100_action_20250403,andlyu/so100_indoor_val2,Gano007/so100_gano,paszea/so100_whale_grab,paszea/so100_whale,Clementppr/lerobot_pick_and_place_dataset_world_model,andlyu/so100_indoor_10,RasmusP/so100_dataset50ep_a,RasmusP/so100_dataset50ep,Gano007/so100_second,zaringleb/so100_cude_linear_and_2d_comb,dsfsg/grasp_pens,zijian2022/digitalfix,zijian2022/digitalfix2,zijian2022/digitalfix3,T1g3rGE/so100_pickplace_small_20250407_171912,sihyun77/mond_13,abokinala/sputnik_100_11_pick_place_container,dsfsg/bring_bottle,abokinala/sputnik_100_12_pick_place_container,Mwuqiu/so100_0408,AK51/4090_01,356c/so100_rope_reposition_1,paszea/so100_lego_mix,abokinala/sputnik_100_14_pick_place_container,abokinala/sputnik_100_23_pick_place_surface,jiajun001/eraser00_2,jlesein/TestBoulon2,duthvik/sputnik_100_31_pour_liquid,duthvik/sputnik_100_24_pick_place_surface,duthvik/sputnik_100_25_pick_place_surface,duthvik/sputnik_100_17_pick_place_container,duthvik/sputnik_100_26_pick_place_surface,VoicAndrei/so100_banana_to_plate_rebel_full,isadev/bougies1,danaaubakirova/so100_task_1,danaaubakirova/so100_task_2,danaaubakirova/so100_task_3,danaaubakirova/so100_task_4,sixpigs1/so100_pick_cube_in_box_error,sixpigs1/so100_push_cube_error,sixpigs1/so100_pull_cube_error,isadev/bougies2,therarelab/med_dis_rare_6,duthvik/sputnik_100_27_pick_place_surface,zijian2022/closer3,duthvik/sputnik_100_41_custom_tasks,duthvik/sputnik_100_42_custom_tasks,duthvik/sputnik_100_43_custom_tasks,duthvik/sputnik_100_44_custom_tasks,duthvik/sputnik_100_51_kitchen_tasks,duthvik/sputnik_100_52_kitchen_tasks,duthvik/sputnik_100_53_kitchen_tasks,duthvik/sputnik_100_45_custom_tasks,duthvik/sputnik_100_32_pour_liquid,duthvik/sputnik_100_29_pick_place_surface,duthvik/sputnik_100_18_pick_place_container,sixpigs1/so100_pull_cube_by_tool_error,sixpigs1/so100_insert_cylinder_error,abokinala/sputnik_100_54_kitchen_tasks,abokinala/sputnik_100_55_kitchen_tasks,m1b/so100_bluelego,abokinala/sputnik_100_46_custom_tasks,m1b/so100_bluelego_updt,kantine/flip_A0,kantine/fli
p_A1,kantine/flip_A2,kantine/flip_A3,lirislab/guess_who_no_cond,kantine/flip_A4,kantine/flip_A5,lirislab/guess_who_lighting,nguyen-v/so100_press_red_button,nguyen-v/so100_bimanual_grab_lemon_put_in_box2,pierfabre/cow,nguyen-v/press_red_button_new,nguyen-v/so100_rotate_red_button,Cidoyi/so100_all_notes,roboticshack/team10-red-block,Cidoyi/so100_all_notes_1,roboticshack/team_5-QuiEstCe_everyBox,roboticshack/team11_pianobot,roboticshack/team2-guess_who_so100,roboticshack/team2-guess_who_so100_light,roboticshack/team2-guess_who_so100_edge_case,roboticshack/team2-guess_who_less_ligth,Cidoyi/so100_all_notes_3,dsfsg/grasp_pen_and_bottle,abokinala/sputnik_100_60_kitchen_tasks,abokinala/sputnik_100_58_kitchen_tasks,danaaubakirova/so100_v2_task_1,danaaubakirova/so100_v2_task_2,danaaubakirova/so100_v2_task_3,danaaubakirova/so100_v2_task_4,zijian2022/force1,zijian2022/force2,zijian2022/force3,jiajun001/eraser00_3,zijian2022/bi2,zijian2022/bi1,zijian2022/hand1,Setchii/so100_grab_ball,MossProphet/so100_square-1-2-3.2,pierfabre/rabbit,bensprenger/right_arm_p_brick_in_box_with_y_noise_v0,pierfabre/horse,pierfabre/pig2,pierfabre/pig3,pierfabre/cow2,pierfabre/sheep,Chojins/chess_game_009_white,sihyun77/suho_3_17_1,sihyun77/sihyun_3_17_2,sihyun77/suho_3_17_3,sihyun77/sihyun_3_17_5,Odog16/so100_cube_drop_pick_v1,sihyun77/sihyun_main_2,sihyun77/suho_main_2,Bartm3/dice2,sihyun77/sihyun_main_3,Loki0929/so100_duck,pietroom/holdthis,pietroom/actualeasytask,Beegbrain/pick_lemon_and_drop_in_bowl,Beegbrain/sweep_tissue_cube,zijian2022/321,gxy1111/so100_pick_place,Odog16/so100_cube_stacking_v1,sihyun77/mond_1,andlyu/so100_indoor_1,andlyu/so100_indoor_3,frk2/so100large,lirislab/sweep_tissue_cube,lirislab/lemon_into_bowl,lirislab/red_cube_into_green_lego_block,lirislab/red_cube_into_blue_cube,00ri/so100_battery,frk2/so100largediffcam,FsqZ/so100_1,ZGGZZG/so100_drop0,Chojins/chess_game_000_white_red,smanni/train_so100_fluffy_box,ganker5/so100_push_20250328,ganker5/so100_dataline_0328,ganker5/so100_color_0328,CrazyYhang/A1234-B-C_mvA2B,RasmusP/so100_Orange2Green,sixpigs1/so100_pick_cube_in_box,ganker5/so100_push_20250331,ganker5/so100_dataline_20250331,lirislab/put_caps_into_teabox,lirislab/close_top_drawer_teabox,lirislab/open_top_drawer_teabox,lirislab/unfold_bottom_right,lirislab/push_cup_target,lirislab/put_banana_bowl,Chojins/chess_game_001_blue_stereo,Chojins/chess_game_001_red_stereo,ganker5/so100_toy_20250402,Gano007/so100_medic,00ri/so100_battery_bin_center,paszea/so100_whale_2,lirislab/fold_bottom_right,lirislab/put_coffee_cap_teabox,therarelab/so100_pick_place_2,paszea/so100_whale_3,paszea/so100_whale_4,paszea/so100_lego,LemonadeDai/so100_coca,zijian2022/backgrounda,zijian2022/backgroundb,356c/so100_nut_sort_1,Mwuqiu/so100_0408_muti,aimihat/so100_tape,lirislab/so100_demo,356c/so100_duck_reposition_1,zijian2022/sort1,weiye11/so100_410_zwy,VoicAndrei/so100_banana_to_plate_only,sixpigs1/so100_stack_cube_error,isadev/bougies3,zijian2022/close3,bensprenger/left_arm_yellow_brick_in_box_v0,lirislab/guess_who_so100,bensprenger/left_arm_yellow_brick_in_box_with_purple_noise_v0,roboticshack/team16-can-stacking,zijian2022/insert2,roboticshack/team-7-right-arm-grasp-tape,Jiangeng/so100_413,roboticshack/team9-pick_cube_place_static_plate,AndrejOrsula/lerobot_double_ball_stacking_random,roboticshack/left-arm-grasp-lego-brick,roboticshack/team-7-left-arm-grasp-motor,roboticshack/team9-pick_chicken_place_plate,roboticshack/team13-two-balls-stacking,tkc79/so100_lego_box_1,roboticshack/team13-three-balls-stacking,pierfabre/chicken
,roboticshack/team16-water-pouring,ad330/cubePlace,Jiafei1224/so100_pa222per,paszea/so100_lego_2cam,bensprenger/chess_game_001_blue_stereo,Mohamedal/put_banana,tkc79/so100_lego_box_2,samanthalhy/so100_herding_1,jlesein/TestBoulon7,pranavsaroha/so100_onelego2,pranavsaroha/so100_onelego3,pranavsaroha/so100_carrot_2,vladfatu/so100_above,koenvanwijk/orange50-1,CSCSXX/pick_place_cube_1.18,dragon-95/so100_sorting,dragon-95/so100_sorting_1,nbaron99/so100_pick_and_place4,Beegbrain/pick_place_green_block,dragon-95/so100_sorting_3,HITHY/so100_peach3,shreyasgite/so100_legocube_50,triton7777/so100_dataset_mix,NONHUMAN-RESEARCH/SOARM100_TASK_VENDA,mikechambers/block_cup_14,samsam0510/tooth_extraction_3,samsam0510/tooth_extraction_4,samsam0510/cube_reorientation_2,samsam0510/cube_reorientation_4,samsam0510/glove_reorientation_1,vladfatu/so100_office,pranavsaroha/so100_legos4,Ityl/so100_recording2,FeiYjf/new_GtoR,dragon-95/so100_sorting_2,HITHY/so100_peach4,jpata/so100_pick_place_tangerine,HITHY/so100_strawberry,shreyasgite/so100_base_env,koenvanwijk/orange50-variation-2,pranavsaroha/so100_carrot_5,pandaRQ/pick_med_1,aractingi/push_cube_offline_data,DorayakiLin/so100_pick_charger_on_tissue,zijian2022/noticehuman3,liuhuanjim013/so100_th",
    "episodes": null,
    "image_transforms": {
      "enable": true,
      "max_num_transforms": 10,
      "random_order": false,
      "transform_version": 0,
      "image_size": 256,
      "tfs": {
        "resize_with_pad": {
          "weight": 1.0,
          "type": "ResizeWithPad",
          "kwargs": {
            "size": [
              256,
              256
            ]
          }
        }
      }
    },
    "local_files_only": true,
    "use_imagenet_stats": false,
    "video_backend": "pyav",
    "sampling_weights": "",
    "max_action_dim": 6,
    "max_state_dim": 6,
    "max_num_images": 3,
    "max_image_dim": 256,
    "train_on_all_features": true,
    "features_version": 2,
    "discard_first_n_frames": 0,
    "min_fps": 30,
    "max_fps": 30,
    "discard_first_idle_frames": false,
    "motion_threshold": 0.05,
    "motion_window_size": 10,
    "motion_buffer": 3
  },
  "env": null,
  "policy": {
    "type": "smolvla",
    "n_obs_steps": 1,
    "normalization_mapping": {
      "VISUAL": "IDENTITY",
      "STATE": "MEAN_STD",
      "ACTION": "MEAN_STD"
    },
    "input_features": {
      "observation.state": {
        "type": "STATE",
        "shape": [
          6
        ]
      },
      "observation.image2": {
        "type": "VISUAL",
        "shape": [
          3,
          256,
          256
        ]
      },
      "observation.image": {
        "type": "VISUAL",
        "shape": [
          3,
          256,
          256
        ]
      },
      "observation.image3": {
        "type": "VISUAL",
        "shape": [
          3,
          256,
          256
        ]
      }
    },
    "output_features": {
      "action": {
        "type": "ACTION",
        "shape": [
          6
        ]
      }
    },
    "chunk_size": 50,
    "n_action_steps": 1,
    "max_state_dim": 32,
    "max_action_dim": 32,
    "resize_imgs_with_padding": [
      512,
      512
    ],
    "empty_cameras": 0,
    "adapt_to_pi_aloha": false,
    "use_delta_joint_actions_aloha": false,
    "tokenizer_max_length": 48,
    "num_steps": 10,
    "use_cache": true,
    "freeze_vision_encoder": true,
    "train_expert_only": true,
    "train_state_proj": true,
    "optimizer_lr": 0.0001,
    "optimizer_betas": [
      0.9,
      0.95
    ],
    "optimizer_eps": 1e-08,
    "optimizer_weight_decay": 1e-10,
    "optimizer_grad_clip_norm": 10,
    "scheduler_warmup_steps": 1000,
    "scheduler_decay_steps": 30000,
    "scheduler_decay_lr": 2.5e-06,
    "vlm_model_name": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct",
    "load_vlm_weights": true,
    "attention_mode": "cross_attn",
    "prefix_length": 0,
    "past_obs_keys": "image",
    "pad_language_to": "max_length",
    "num_expert_layers": 0,
    "num_vlm_layers": 16,
    "causal_action_attention_mask": true,
    "self_attn_every_n_layers": 2,
    "expert_width_multiplier": 0.75
  },
  "output_dir": "/lustre/fswork/projects/rech/dyf/ugz83ue/logs/lerobot/lerobot_so100_community_v1_v2_v3clean2_smolpi0_lr1e-4bs64steps400000gpus4freeze32_imgtoktrue_cross_attn_gap1_vlml16_causalacttrue_sa2_smolvlm2500_nobs1_expw0.75_feat2_lrvlm1e-4_trans0true_decaylr2.5e-630000_camfalse_fps3030_idlefalse",
  "job_name": "smolvla",
  "resume": false,
  "overwrite": false,
  "device": "cuda",
  "use_amp": true,
  "seed": 1000,
  "num_workers": 4,
  "batch_size": 64,
  "eval_freq": 5000,
  "log_freq": 200,
  "save_checkpoint": true,
  "save_freq": 20000,
  "offline": {
    "steps": 400000
  },
  "online": {
    "steps": 0,
    "rollout_n_episodes": 1,
    "rollout_batch_size": 1,
    "steps_between_rollouts": null,
    "sampling_ratio": 0.5,
    "env_seed": null,
    "buffer_capacity": null,
    "buffer_seed_size": 0,
    "do_rollout_async": false
  },
  "use_policy_training_preset": true,
  "optimizer": {
    "type": "adamw",
    "lr": 0.0001,
    "weight_decay": 1e-10,
    "grad_clip_norm": 10,
    "betas": [
      0.9,
      0.95
    ],
    "eps": 1e-08
  },
  "scheduler": {
    "type": "cosine_decay_with_warmup",
    "num_warmup_steps": 1000,
    "num_decay_steps": 30000,
    "peak_lr": 0.0001,
    "decay_lr": 2.5e-06
  },
  "eval": {
    "n_episodes": 50,
    "batch_size": 50,
    "use_async_envs": false
  },
  "wandb": {
    "enable": false,
    "disable_artifact": false,
    "project": "lerobot",
    "entity": null,
    "notes": null
  },
  "nccl_timeout": 9000,
  "gradient_accumulation_steps": 1,
  "torch_compile": true,
  "save_on_eval": false,
  "dataloader_drop_last": true,
  "eval_mse": true,
  "eval_mse_steps": 1000
}
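For reference, the scheduler block names `cosine_decay_with_warmup` with `num_warmup_steps=1000`, `num_decay_steps=30000`, `peak_lr=1e-4`, and `decay_lr=2.5e-6`. A sketch of such a schedule (LeRobot's exact implementation may differ in detail):

```python
# Sketch: linear warmup to peak_lr, then cosine decay toward decay_lr.
import math

warmup_steps, decay_steps = 1000, 30000
peak_lr, decay_lr = 1e-4, 2.5e-6

def lr_at(step: int) -> float:
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    t = min((step - warmup_steps) / decay_steps, 1.0)
    return decay_lr + 0.5 * (peak_lr - decay_lr) * (1.0 + math.cos(math.pi * t))

print(lr_at(500), lr_at(1000), lr_at(31000))
```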