Ban
#3
by
YARGI
- opened
- .gitattributes +0 -1
- Finetune_SmolVLA_notebook.ipynb +0 -214
- README.md +17 -31
- collage_small.gif +0 -3
- config.json +11 -21
- model.safetensors +2 -2
- policy_postprocessor.json +0 -32
- policy_postprocessor_step_0_unnormalizer_processor.safetensors +0 -3
- policy_preprocessor.json +0 -87
- policy_preprocessor_step_5_normalizer_processor.safetensors +0 -3
- train_config.json +196 -0
.gitattributes
CHANGED
@@ -33,4 +33,3 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
-collage_small.gif filter=lfs diff=lfs merge=lfs -text
Finetune_SmolVLA_notebook.ipynb
DELETED
@@ -1,214 +0,0 @@
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "NQUk3Y0WwYZ4"
      },
      "source": [
        "# 🤗 x 🦾: Training SmolVLA with LeRobot Notebook\n",
        "\n",
        "Welcome to the **LeRobot SmolVLA training notebook**! This notebook provides a ready-to-run setup for training imitation learning policies using the [🤗 LeRobot](https://github.com/huggingface/lerobot) library.\n",
        "\n",
        "In this example, we train an `SmolVLA` policy using a dataset hosted on the [Hugging Face Hub](https://huggingface.co/), and optionally track training metrics with [Weights & Biases (wandb)](https://wandb.ai/).\n",
        "\n",
        "## ⚙️ Requirements\n",
        "- A Hugging Face dataset repo ID containing your training data (`--dataset.repo_id=YOUR_USERNAME/YOUR_DATASET`)\n",
        "- Optional: A [wandb](https://wandb.ai/) account if you want to enable training visualization\n",
        "- Recommended: GPU runtime (e.g., NVIDIA A100) for faster training\n",
        "\n",
        "## ⏱️ Expected Training Time\n",
        "Training with the `SmolVLA` policy for 20,000 steps typically takes **about 5 hours on an NVIDIA A100** GPU. On less powerful GPUs or CPUs, training may take significantly longer!\n",
        "\n",
        "## Example Output\n",
        "Model checkpoints, logs, and training plots will be saved to the specified `--output_dir`. If `wandb` is enabled, progress will also be visualized in your wandb project dashboard.\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "MOJyX0CnwA5m"
      },
      "source": [
        "## Install conda\n",
        "This cell uses `condacolab` to bootstrap a full Conda environment inside Google Colab.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "QlKjL1X5t_zM"
      },
      "outputs": [],
      "source": [
        "!pip install -q condacolab\n",
        "import condacolab\n",
        "condacolab.install()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DxCc3CARwUjN"
      },
      "source": [
        "## Install LeRobot\n",
        "This cell clones the `lerobot` repository from Hugging Face, installs FFmpeg (version 7.1.1), and installs the package in editable mode.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "dgLu7QT5tUik"
      },
      "outputs": [],
      "source": [
        "!git clone https://github.com/huggingface/lerobot.git\n",
        "!conda install ffmpeg=7.1.1 -c conda-forge\n",
        "!cd lerobot && pip install -e ."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Q8Sn2wG4wldo"
      },
      "source": [
        "## Weights & Biases login\n",
        "This cell logs you into Weights & Biases (wandb) to enable experiment tracking and logging."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "PolVM_movEvp"
      },
      "outputs": [],
      "source": [
        "!wandb login"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zTWQAgX9xseE"
      },
      "source": [
        "## Install SmolVLA dependencies"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "DiHs0BKwxseE"
      },
      "outputs": [],
      "source": [
        "!cd lerobot && pip install -e \".[smolvla]\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "IkzTo4mNwxaC"
      },
      "source": [
        "## Start training SmolVLA with LeRobot\n",
        "\n",
        "This cell runs the `train.py` script from the `lerobot` library to train a robot control policy. \n",
        "\n",
        "Make sure to adjust the following arguments to your setup:\n",
        "\n",
        "1. `--dataset.repo_id=YOUR_HF_USERNAME/YOUR_DATASET`: \n",
        "   Replace this with the Hugging Face Hub repo ID where your dataset is stored, e.g., `pepijn223/il_gym0`.\n",
        "\n",
        "2. `--batch_size=64`: means the model processes 64 training samples in parallel before doing one gradient update. Reduce this number if you have a GPU with low memory.\n",
        "\n",
        "3. `--output_dir=outputs/train/...`: \n",
        "   Directory where training logs and model checkpoints will be saved.\n",
        "\n",
        "4. `--job_name=...`: \n",
        "   A name for this training job, used for logging and Weights & Biases.\n",
        "\n",
        "5. `--policy.device=cuda`: \n",
        "   Use `cuda` if training on an NVIDIA GPU. Use `mps` for Apple Silicon, or `cpu` if no GPU is available.\n",
        "\n",
        "6. `--wandb.enable=true`: \n",
        "   Enables Weights & Biases for visualizing training progress. You must be logged in via `wandb login` before running this."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "ZO52lcQtxseE"
      },
      "outputs": [],
      "source": [
        "!cd lerobot && python lerobot/scripts/train.py \\\n",
        "  --policy.path=lerobot/smolvla_base \\\n",
        "  --dataset.repo_id=${HF_USER}/mydataset \\\n",
        "  --batch_size=64 \\\n",
        "  --steps=20000 \\\n",
        "  --output_dir=outputs/train/my_smolvla \\\n",
        "  --job_name=my_smolvla_training \\\n",
        "  --policy.device=cuda \\\n",
        "  --wandb.enable=true"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "2PBu7izpxseF"
      },
      "source": [
        "## Login into Hugging Face Hub\n",
        "Now after training is done login into the Hugging Face hub and upload the last checkpoint"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "8yu5khQGIHi6"
      },
      "outputs": [],
      "source": [
        "!huggingface-cli login"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zFMLGuVkH7UN"
      },
      "outputs": [],
      "source": [
        "!huggingface-cli upload ${HF_USER}/my_smolvla \\\n",
        "  /content/lerobot/outputs/train/my_smolvla/checkpoints/last/pretrained_model"
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "gpuType": "A100",
      "machine_shape": "hm",
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
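The deleted notebook's train and upload cells reference a `${HF_USER}` shell variable that no cell defines. A minimal sketch of how it could be resolved from the logged-in account (an illustration, not part of the notebook; assumes `huggingface_hub` is installed and `huggingface-cli login` has already been run):

```python
# Hypothetical helper cell: resolve the `${HF_USER}` placeholder used by the
# train and upload commands from the currently logged-in Hugging Face account.
from huggingface_hub import whoami

hf_user = whoami()["name"]  # requires a prior `huggingface-cli login`
print(f"--dataset.repo_id={hf_user}/mydataset")
print(f"huggingface-cli upload {hf_user}/my_smolvla ...")
```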
README.md
CHANGED
@@ -1,33 +1,21 @@
 ---
 pipeline_tag: robotics
 tags:
--
+- lerobot
 library_name: lerobot
-datasets:
-- lerobot/svla_so101_pickplace
 ---
 
-
+SmolVLA: A vision-language-action model for affordable and efficient robotics
 
-
+[Paper](https://huggingface.co/papers/2506.01844)
 
-[
-
-[SmolVLA Blogpost](https://huggingface.co/blog/smolvla)
-
-[Code](https://github.com/huggingface/lerobot/blob/main/lerobot/common/policies/smolvla/modeling_smolvla.py)
-
-[Train using Google Colab Notebook](https://colab.research.google.com/github/huggingface/notebooks/blob/main/lerobot/training-smolvla.ipynb#scrollTo=ZO52lcQtxseE)
-
-[SmolVLA HF Documentation](https://huggingface.co/docs/lerobot/smolvla)
+[Code](https://github.com/huggingface/lerobot)
 
 Designed by Hugging Face.
 
 This model has 450M parameters in total.
 You can use inside the [LeRobot library](https://github.com/huggingface/lerobot).
 
-Before proceeding to the next steps, you need to properly install the environment by following [Installation Guide](https://huggingface.co/docs/lerobot/installation) on the docs.
-
 Install smolvla extra dependencies:
 ```bash
 pip install -e ".[smolvla]"
@@ -36,25 +24,23 @@ pip install -e ".[smolvla]"
 Example of finetuning the smolvla pretrained model (`smolvla_base`):
 ```bash
 python lerobot/scripts/train.py \
-
-
-
-
-  --output_dir=outputs/train/my_smolvla \
-  --job_name=my_smolvla_training \
-  --policy.device=cuda \
-  --wandb.enable=true
+  --policy.path=lerobot/smolvla_base \
+  --dataset.repo_id=danaaubakirova/svla_so100_task1_v3 \
+  --batch_size=64 \
+  --steps=200000
 ```
 
 Example of finetuning the smolvla neural network with pretrained VLM and action expert
 intialized from scratch:
 ```bash
 python lerobot/scripts/train.py \
-
-
-
-
-
-
-
+  --policy.type=smolvla \
+  --dataset.repo_id=danaaubakirova/svla_so100_task1_v3 \
+  --batch_size=64 \
+  --steps=200000
+```
+
+Example of using the smolvla pretrained model outside LeRobot training framework:
+```python
+policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")
 ```
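The new README ends with `SmolVLAPolicy.from_pretrained(...)` but does not show a full call. A minimal inference sketch, assuming the input keys and shapes from this repo's config.json (`observation.state` of dim 6, three 3x256x256 cameras), that the policy exposes `select_action` like other LeRobot policies, and the import path implied by the repository layout linked above:

```python
# Sketch only: dummy observations shaped like the checkpoint's input_features.
import torch
from lerobot.common.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("lerobot/smolvla_base")
policy.eval()

batch = {
    "observation.state": torch.zeros(1, 6),
    "observation.image": torch.zeros(1, 3, 256, 256),
    "observation.image2": torch.zeros(1, 3, 256, 256),
    "observation.image3": torch.zeros(1, 3, 256, 256),
    "task": ["pick the object and place it in the box"],
}
with torch.no_grad():
    action = policy.select_action(batch)  # expected shape: (1, 6)
print(action.shape)
```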
collage_small.gif
DELETED
Git LFS Details
config.json
CHANGED
@@ -1,6 +1,11 @@
 {
   "type": "smolvla",
   "n_obs_steps": 1,
+  "normalization_mapping": {
+    "VISUAL": "IDENTITY",
+    "STATE": "MEAN_STD",
+    "ACTION": "MEAN_STD"
+  },
   "input_features": {
     "observation.state": {
       "type": "STATE",
@@ -8,7 +13,7 @@
         6
       ]
     },
-    "observation.
+    "observation.image2": {
       "type": "VISUAL",
       "shape": [
         3,
@@ -16,7 +21,7 @@
         256
       ]
     },
-    "observation.
+    "observation.image": {
       "type": "VISUAL",
       "shape": [
         3,
@@ -24,7 +29,7 @@
         256
       ]
     },
-    "observation.
+    "observation.image3": {
       "type": "VISUAL",
       "shape": [
         3,
@@ -41,20 +46,8 @@
       ]
     }
   },
-  "device": "cuda",
-  "use_amp": false,
-  "push_to_hub": true,
-  "repo_id": null,
-  "private": null,
-  "tags": null,
-  "license": null,
   "chunk_size": 50,
-  "n_action_steps":
-  "normalization_mapping": {
-    "VISUAL": "IDENTITY",
-    "STATE": "MEAN_STD",
-    "ACTION": "MEAN_STD"
-  },
+  "n_action_steps": 1,
   "max_state_dim": 32,
   "max_action_dim": 32,
   "resize_imgs_with_padding": [
@@ -83,14 +76,11 @@
   "scheduler_decay_lr": 2.5e-06,
   "vlm_model_name": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct",
   "load_vlm_weights": true,
-  "add_image_special_tokens": false,
   "attention_mode": "cross_attn",
   "prefix_length": 0,
   "pad_language_to": "max_length",
   "num_expert_layers": 0,
   "num_vlm_layers": 16,
   "self_attn_every_n_layers": 2,
-  "expert_width_multiplier": 0.75
-
-  "max_period": 4.0
-}
+  "expert_width_multiplier": 0.75
+}
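To see how the updated config describes the policy's inputs and outputs, a small sketch that reads the file with the standard library (keys mirror the diff above):

```python
# Sketch: list the expected inputs, their shapes, and the normalization modes.
import json

with open("config.json") as f:
    cfg = json.load(f)

for name, feat in cfg["input_features"].items():
    print(name, feat["type"], feat["shape"])
print("normalization_mapping:", cfg["normalization_mapping"])
print("chunk_size:", cfg["chunk_size"], "n_action_steps:", cfg["n_action_steps"])
```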
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:8f8dc071d5b933e79edd2b73b8d6b5cca482ef0437c099ea3ec13ab978a38fc8
+size 906720008
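The LFS pointer's `oid` is the SHA-256 of the weight file, so the updated blob can be checked locally after `git lfs pull`; a small verification sketch:

```python
# Sketch: hash model.safetensors and compare against the pointer's oid above.
import hashlib

h = hashlib.sha256()
with open("model.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)
print(h.hexdigest() == "8f8dc071d5b933e79edd2b73b8d6b5cca482ef0437c099ea3ec13ab978a38fc8")
```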
policy_postprocessor.json
DELETED
@@ -1,32 +0,0 @@
{
  "name": "policy_postprocessor",
  "steps": [
    {
      "registry_name": "unnormalizer_processor",
      "config": {
        "eps": 1e-08,
        "features": {
          "action": {
            "type": "ACTION",
            "shape": [
              6
            ]
          }
        },
        "norm_map": {
          "VISUAL": "IDENTITY",
          "STATE": "MEAN_STD",
          "ACTION": "MEAN_STD"
        }
      },
      "state_file": "policy_postprocessor_step_0_unnormalizer_processor.safetensors"
    },
    {
      "registry_name": "device_processor",
      "config": {
        "device": "cpu",
        "float_dtype": null
      }
    }
  ]
}
policy_postprocessor_step_0_unnormalizer_processor.safetensors
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:490ab239d96e263687c0b2e386a0afbc235a2eceb9857c36ed32f2f162a3e7c8
size 640
policy_preprocessor.json
DELETED
@@ -1,87 +0,0 @@
{
  "name": "policy_preprocessor",
  "steps": [
    {
      "registry_name": "rename_observations_processor",
      "config": {
        "rename_map": {}
      }
    },
    {
      "registry_name": "to_batch_processor",
      "config": {}
    },
    {
      "registry_name": "smolvla_new_line_processor",
      "config": {}
    },
    {
      "registry_name": "tokenizer_processor",
      "config": {
        "max_length": 48,
        "task_key": "task",
        "padding_side": "right",
        "padding": "max_length",
        "truncation": true,
        "tokenizer_name": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct"
      }
    },
    {
      "registry_name": "device_processor",
      "config": {
        "device": "cuda",
        "float_dtype": null
      }
    },
    {
      "registry_name": "normalizer_processor",
      "config": {
        "eps": 1e-08,
        "features": {
          "observation.state": {
            "type": "STATE",
            "shape": [
              6
            ]
          },
          "observation.image2": {
            "type": "VISUAL",
            "shape": [
              3,
              256,
              256
            ]
          },
          "observation.image": {
            "type": "VISUAL",
            "shape": [
              3,
              256,
              256
            ]
          },
          "observation.image3": {
            "type": "VISUAL",
            "shape": [
              3,
              256,
              256
            ]
          },
          "action": {
            "type": "ACTION",
            "shape": [
              6
            ]
          }
        },
        "norm_map": {
          "VISUAL": "IDENTITY",
          "STATE": "MEAN_STD",
          "ACTION": "MEAN_STD"
        }
      },
      "state_file": "policy_preprocessor_step_5_normalizer_processor.safetensors"
    }
  ]
}
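The final `normalizer_processor` step applies MEAN_STD normalization to state and action features and leaves VISUAL features untouched (IDENTITY), with the statistics stored in the companion `.safetensors` state file. A sketch of that mapping and its inverse (what the postprocessor's `unnormalizer_processor` describes), using placeholder statistics:

```python
# Sketch of MEAN_STD normalization as described by the norm_map above.
import torch

eps = 1e-08
state_mean, state_std = torch.zeros(6), torch.ones(6)    # placeholders; real values live in the state file
action_mean, action_std = torch.zeros(6), torch.ones(6)  # placeholders

def normalize_state(x: torch.Tensor) -> torch.Tensor:
    return (x - state_mean) / (state_std + eps)

def unnormalize_action(a: torch.Tensor) -> torch.Tensor:
    # inverse of the mapping above, applied to the policy's raw action output
    return a * (action_std + eps) + action_mean
```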
policy_preprocessor_step_5_normalizer_processor.safetensors
DELETED
@@ -1,3 +0,0 @@
version https://git-lfs.github.com/spec/v1
oid sha256:490ab239d96e263687c0b2e386a0afbc235a2eceb9857c36ed32f2f162a3e7c8
size 640
train_config.json
ADDED
@@ -0,0 +1,196 @@
{
  "dataset": {
"repo_id": "satvikahuja/mixer_on_off_new_1,aergogo/so100_pick_place,andy309/so100_0314_fold_cloths,jchun/so100_pickplace_small_20250323_120056,astroyat/cube,Ofiroz91/so_100_cube2bowl,HappyPablo/dec3_data2,ZCM5115/so100_1210,francescocrivelli/orange_feeding,francescocrivelli/carrot_eating,0x00raghu/toffee_red,0x00raghu/toffee_red_2,0x00raghu/toffee_red_3__,0x00raghu/toffee_blue,0x00raghu/toffee_blue_2,0x00raghu/toffee_to_hand_1,0x00raghu/toffee_to_hand_2,liyitenga/so100_bi_hello,liyitenga/so100_bi_giveme5,ZCM5115/so100_2Arm3cameras_movebox,pranavsaroha/so100_carrot_1,pranavsaroha/so100_carrot_3,pranavsaroha/so100_carrot_4,maximilienroberti/so100_lego_red_box,pranavsaroha/so100_squishy,rabhishek100/so100_train_dataset,pranavsaroha/so100_squishy100,swarajgosavi/kikobot_pusht_real_v2,pandaRQ/pickmed,swarajgosavi/act_kikobot_pusht_real,pranavsaroha/so100_squishy2colors,pranavsaroha/so100_squishy2colors_1,Chojins/chess_game_001_white,jmrog/so100_sweet_pick,Chojins/chess_game_002_white,pranavsaroha/so100_squishy2colors_2_new,Chojins/chess_game_003_white,aractingi/pick_place_lego_cube,Chojins/chess_game_004_white,Chojins/chess_game_005_white,Chojins/chess_game_006_white,Chojins/chess_game_007_white,koenvanwijk/blue2,jlitch/so100multicam3,koenvanwijk/blue52,jlitch/so100multicam6,aractingi/pick_place_lego_cube_1,jlitch/so100multicam7,vladfatu/so100_ds,Chojins/chess_game_000_white,HITHY/so100-kiwi,HITHY/so100_peach1,HITHY/so100_redstrawberry,satvikahuja/orange_mixer_1,satvikahuja/mixer_on_off,satvikahuja/orange_pick_place_new1,satvikahuja/mixer_on_off_new,danmac1/real_real332,FeiYjf/Makalu_push,liyitenga/so100_pick_taffy1,chmadran/so100_dataset04,FeiYjf/Maklu_dataset,FeiYjf/new_Dataset,liyitenga/so100_pick_taffy2,satvikahuja/mixer_on_off_new_4,CSCSXX/pick_place_cube_1.17,liyitenga/so100_pick_taffy3,liyitenga/so100_pick_taffy4,yuz1wan/so100_pick_pink,yuz1wan/so100_pick_wahaha,yuz1wan/so100_pp_pink,yuz1wan/so100_pour_cup,liyitenga/so100_pick_taffy5,liyitenga/so100_pick_taffy6,yuz1wan/so100_button,yuz1wan/so100_pickplace,liyitenga/so100_pick_taffy7,FeiYjf/push_gg,FeiYjf/push_0094,swarajgosavi/act_kikobot_block_real,liyitenga/so100_pick_taffy8,phospho-ai/OrangeBrick3Cameras,vaishanthr/toy_pick_place,SeanLMH/so100_picknplace_v2,pepijn223/yellow_lego_in_box1,DimiSch/so100_50ep_2,DimiSch/so100_50ep_3,SeanLMH/so100_picknplace,nbaron99/so100_pick_and_place2,chmadran/so100_dataset08,vaishanthr/toy_pickplace_50ep,Beegbrain/pick_place_green_block_lr,Ityl/so100_recording1,vaishanthr/toy_pickplace,ad330/so100_box_pickPlace,Beegbrain/so100_put_cube_cup,aractingi/push_green_cube_hf,aractingi/push_green_cube_hf_cropped_resized,carpit680/giraffe_task,carpit680/giraffe_sock_demo_1,DimiSch/so100_terra_50_2,carpit680/giraffe_sock_demo_2,aractingi/push_cube_to_face_reward,aractingi/push_cube_to_face_reward_cropped_resized,aractingi/push_cube_reward_data,aractingi/push_cube_reward_data_cropped_resized,aractingi/push_cube_offline_data_cropped_resized,aractingi/push_cube_front_side_reward,aractingi/push_cube_front_side_reward_cropped_resized,aractingi/push_cube_front_side_reward_long,aractingi/push_cube_front_side_reward_long_cropped_resized,aractingi/push_cube_reward,aractingi/push_cube_reward_cropped_resized,aractingi/push_cube_square_reward_cropped_resized,aractingi/push_cube_square_reward_1,aractingi/push_cube_square_reward_1_cropped_resized,aractingi/push_cube_square_light_reward,aractingi/push_cube_square_light_offline_demo,aractingi/push_cube_square_light_offline_demo_cropped_resized,denghj/dataset_red_tape01,aractingi
/push_cube_square_offline_demo,aractingi/push_cube_square_offline_demo_cropped_resized,Beegbrain/stack_two_cubes,FeiYjf/Test_NNNN,LegrandFrederic/Orange-brick-lower-resolution,aractingi/pick_place_lego_cube_cropped_resized,aractingi/push_cube_overfit,aractingi/push_cube_overfit_cropped_resized,HITHY/so100_peach,zaringleb/so100_cube_2,andreasBihlmaier/dual_arm_transfer_2025_02_16,zaringleb/so100_cube_4_binary,1g0rrr/reward_pickplace1,1g0rrr/reward_pickplace1_cropped_resized,FeiYjf/Hold_Pieces,FeiYjf/Grab_Pieces,hegdearyandev/so100_eraser_cup_v1,jbraumann/so100_1902,liyitenga/so100_pick_taffy10,mikechambers/block_cup_5,zaringleb/so100_cube_5_linear,yuz1wan/so100_pickplace_0223_2,yuz1wan/so100_pickplace_0223_3,samsam0510/mj_data_temp,samsam0510/tape_insert_1,samsam0510/tape_insert_2,pengjunkun/so100_push_to_hole,Deason11/Random_Kitchen,1g0rrr/reward_dataset_name2,1g0rrr/reward_dataset_name2_cropped_resized,1g0rrr/offline_dataset_name2,1g0rrr/offline_dataset_name2_cropped_resized,aractingi/push_cube_simp_cropped_resized,danielkr452/so100_work6,Loki0929/so100_100,yuz1wan/so100_fold_0227_1,yuz1wan/so100_fold_0227_2,speedyyoshi/so100_grasp_pink_block,lirislab/stack_two_red_cubes,lirislab/red_cube_into_mug,lirislab/green_lego_block_into_mug,lirislab/green_lego_block_into_mug_easy,kevin510/lerobot-cat-toy-placement,NONHUMAN-RESEARCH/SOARM100_TASK_VENDA_BOX,wangjl1512/pour_water,airthebear/so100_GL,zijian2022/noticehuman1,zijian2022/noticehuman2,kantine/so100_kapla_tower6,zijian2022/noticehuman5,zijian2022/llm40,Ashton3/lerobot-aloha,zijian2022/noticehuman50,AaronNewman/screwdriver_task_batch1,AaronNewman/screwdriver_task_batch2,AaronNewman/screwdriver_task_batch3,zijian2022/noticehuman60,zijian2022/noticehuman70,Bartm3/tape_to_bin,liuhuanjim013/so100_th_1,Pi-robot/barbecue_flip,Pi-robot/barbecue_put,wangjl1512/doll,sshh11/so100_orange_50ep_1,sshh11/so100_orange_50ep_2,DorayakiLin/so100_pick_cube_in_box,Bartm3/tape_to_bin2,luke250305/play_dice_250311.1,andy309/so100_0311_1152,sihyun77/suho_so100,sihyun77/si_so100,shreyasgite/so100_base_left,sihyun77/suho_red,liuhuanjim013/so100_block,andy309/so100_0313_no_wrist_camera,zijian2022/l9,zijian2022/n1_2,DorayakiLin/so100_stack_cube,andy309/so100_0313_no_wrist_camera_with_two_arms_cloths,joaoocruz00/so100_makeitD1,zijian2022/l10_1,zijian2022/l10_5,sihyun77/suho_red2,sihyun77/suho_angel,sihyun77/sihyun_king,acrampette/third_arm_01,Winster/so100_cube,1g0rrr/sam_openpi03,thedevansh/mar16_1336,hkphoooey/throw_stuffie,doujiangwang/task1_10epi_100000step,sihyun77/sihyun_3_17_1,acrampette/third_arm_02,imsyed00/so100_yellowbowl_pickplace_1,kumarhans/so100_tape_task,sihyun77/sihyun_main,doujiangwang/task2_10epi_100000step,kantine/industrial_robothon_buttons_expert,kantine/industrial_robothon_buttons_anomaly,kantine/industrial_robothon_hatchAndProbe_expert,kantine/industrial_robothon_hatchAndProbe_anomaly,Odog16/so100_tea_towel_folding_v1,zijian2022/so100_318,zijian2022/so100_318_1,Congying1112/so100_place_blue_bottle_with_two_cameras,Congying1112/so100_place_blue_bottle_with_two_cameras2,Congying1112/so100_place_blue_bottle_with_single_camera,pietroom/first_task_short,kantine/industrial_screws_sorting_expert,kantine/industrial_screws_sorting_anomaly,pietroom/second_task,zijian2022/c0,doujiangwang/task4_10epi_100000step,Congying1112/so100_switch_with_onhand_camera,HYAIYN/so100_get_orange_10epi,doujiangwang/task5_10epi_100000step,1g0rrr/sam_openpi_cube_low10,1g0rrr/sam_openpi_cube_top10,1g0rrr/sam_openpi_wire10,1g0rrr/sam_openpi_solder1,1g0rrr/sam_openpi_solder2,wcode
/so100_put_pen_50,jchun/so100_pickplace_small_20250322_193929,bnarin/so100_tic_tac_toe_we_do_it_live,dc2ac/so100-t5,chmadran/so100_home_dataset,baladhurgesh97/so100_final_picking_3,bnarin/so100_tic_tac_toe_move_0_0,bnarin/so100_tic_tac_toe_move_1_0,bnarin/so100_tic_tac_toe_move_2_1,bnarin/so100_tic_tac_toe_move_4_0,zaringleb/so100_cube_6_2d,andlyu/so100_indoor_0,andlyu/so100_indoor_2,Winster/so100_sim,badwolf256/so100_twin_cam_duck,Congying1112/so100_simplepick_with_2_cameras_from_top,andlyu/so100_indoor_4,Zak-Y/so100_grap_dataset,kantine/domotic_pouringCoffee_expert,kantine/domotic_pouringCoffee_anomaly,lucasngoo/so100_strawberry_grape,kantine/domotic_makingCoffee_expert,kantine/domotic_makingCoffee_anomaly,ZGGZZG/so100_drop1,kantine/industrial_soldering_expert,kantine/industrial_soldering_anomaly,Yotofu/so100_sweeper_shoes,kantine/domotic_dishTidyUp_expert,kantine/domotic_dishTidyUp_anomaly,kantine/domotic_groceriesSorting_expert,kantine/domotic_groceriesSorting_anomaly,badwolf256/so100_twin_cam_duck_v2,kantine/domotic_vegetagblesAndFruitsSorting_expert,kantine/domotic_vegetagblesAndFruitsSorting_anomaly,kantine/domotic_setTheTable_expert,kantine/domotic_setTheTable_anomaly,therarelab/so100_pick_place,abhisb/so100_51_ep,andlyu/so100_indoor_val_0,allenchienxxx/so100Test,lizi178119985/so100_jia,badwolf256/so100_twin_cam_duck_v3,andrewcole712/so100_tape_bin_place,Gano007/so100_lolo,Zak-Y/so100_three_cameras_dataset,Gano007/so100_doliprane,XXRRSSRR/so100_v3_num_episodes_50,zijian2022/assemblyarm2,ganker5/so100_action_20250403,andlyu/so100_indoor_val2,Gano007/so100_gano,paszea/so100_whale_grab,paszea/so100_whale,Clementppr/lerobot_pick_and_place_dataset_world_model,andlyu/so100_indoor_10,RasmusP/so100_dataset50ep_a,RasmusP/so100_dataset50ep,Gano007/so100_second,zaringleb/so100_cude_linear_and_2d_comb,dsfsg/grasp_pens,zijian2022/digitalfix,zijian2022/digitalfix2,zijian2022/digitalfix3,T1g3rGE/so100_pickplace_small_20250407_171912,sihyun77/mond_13,abokinala/sputnik_100_11_pick_place_container,dsfsg/bring_bottle,abokinala/sputnik_100_12_pick_place_container,Mwuqiu/so100_0408,AK51/4090_01,356c/so100_rope_reposition_1,paszea/so100_lego_mix,abokinala/sputnik_100_14_pick_place_container,abokinala/sputnik_100_23_pick_place_surface,jiajun001/eraser00_2,jlesein/TestBoulon2,duthvik/sputnik_100_31_pour_liquid,duthvik/sputnik_100_24_pick_place_surface,duthvik/sputnik_100_25_pick_place_surface,duthvik/sputnik_100_17_pick_place_container,duthvik/sputnik_100_26_pick_place_surface,VoicAndrei/so100_banana_to_plate_rebel_full,isadev/bougies1,danaaubakirova/so100_task_1,danaaubakirova/so100_task_2,danaaubakirova/so100_task_3,danaaubakirova/so100_task_4,sixpigs1/so100_pick_cube_in_box_error,sixpigs1/so100_push_cube_error,sixpigs1/so100_pull_cube_error,isadev/bougies2,therarelab/med_dis_rare_6,duthvik/sputnik_100_27_pick_place_surface,zijian2022/closer3,duthvik/sputnik_100_41_custom_tasks,duthvik/sputnik_100_42_custom_tasks,duthvik/sputnik_100_43_custom_tasks,duthvik/sputnik_100_44_custom_tasks,duthvik/sputnik_100_51_kitchen_tasks,duthvik/sputnik_100_52_kitchen_tasks,duthvik/sputnik_100_53_kitchen_tasks,duthvik/sputnik_100_45_custom_tasks,duthvik/sputnik_100_32_pour_liquid,duthvik/sputnik_100_29_pick_place_surface,duthvik/sputnik_100_18_pick_place_container,sixpigs1/so100_pull_cube_by_tool_error,sixpigs1/so100_insert_cylinder_error,abokinala/sputnik_100_54_kitchen_tasks,abokinala/sputnik_100_55_kitchen_tasks,m1b/so100_bluelego,abokinala/sputnik_100_46_custom_tasks,m1b/so100_bluelego_updt,kantine/flip_A0,kantine/fli
p_A1,kantine/flip_A2,kantine/flip_A3,lirislab/guess_who_no_cond,kantine/flip_A4,kantine/flip_A5,lirislab/guess_who_lighting,nguyen-v/so100_press_red_button,nguyen-v/so100_bimanual_grab_lemon_put_in_box2,pierfabre/cow,nguyen-v/press_red_button_new,nguyen-v/so100_rotate_red_button,Cidoyi/so100_all_notes,roboticshack/team10-red-block,Cidoyi/so100_all_notes_1,roboticshack/team_5-QuiEstCe_everyBox,roboticshack/team11_pianobot,roboticshack/team2-guess_who_so100,roboticshack/team2-guess_who_so100_light,roboticshack/team2-guess_who_so100_edge_case,roboticshack/team2-guess_who_less_ligth,Cidoyi/so100_all_notes_3,dsfsg/grasp_pen_and_bottle,abokinala/sputnik_100_60_kitchen_tasks,abokinala/sputnik_100_58_kitchen_tasks,danaaubakirova/so100_v2_task_1,danaaubakirova/so100_v2_task_2,danaaubakirova/so100_v2_task_3,danaaubakirova/so100_v2_task_4,zijian2022/force1,zijian2022/force2,zijian2022/force3,jiajun001/eraser00_3,zijian2022/bi2,zijian2022/bi1,zijian2022/hand1,Setchii/so100_grab_ball,MossProphet/so100_square-1-2-3.2,pierfabre/rabbit,bensprenger/right_arm_p_brick_in_box_with_y_noise_v0,pierfabre/horse,pierfabre/pig2,pierfabre/pig3,pierfabre/cow2,pierfabre/sheep,Chojins/chess_game_009_white,sihyun77/suho_3_17_1,sihyun77/sihyun_3_17_2,sihyun77/suho_3_17_3,sihyun77/sihyun_3_17_5,Odog16/so100_cube_drop_pick_v1,sihyun77/sihyun_main_2,sihyun77/suho_main_2,Bartm3/dice2,sihyun77/sihyun_main_3,Loki0929/so100_duck,pietroom/holdthis,pietroom/actualeasytask,Beegbrain/pick_lemon_and_drop_in_bowl,Beegbrain/sweep_tissue_cube,zijian2022/321,gxy1111/so100_pick_place,Odog16/so100_cube_stacking_v1,sihyun77/mond_1,andlyu/so100_indoor_1,andlyu/so100_indoor_3,frk2/so100large,lirislab/sweep_tissue_cube,lirislab/lemon_into_bowl,lirislab/red_cube_into_green_lego_block,lirislab/red_cube_into_blue_cube,00ri/so100_battery,frk2/so100largediffcam,FsqZ/so100_1,ZGGZZG/so100_drop0,Chojins/chess_game_000_white_red,smanni/train_so100_fluffy_box,ganker5/so100_push_20250328,ganker5/so100_dataline_0328,ganker5/so100_color_0328,CrazyYhang/A1234-B-C_mvA2B,RasmusP/so100_Orange2Green,sixpigs1/so100_pick_cube_in_box,ganker5/so100_push_20250331,ganker5/so100_dataline_20250331,lirislab/put_caps_into_teabox,lirislab/close_top_drawer_teabox,lirislab/open_top_drawer_teabox,lirislab/unfold_bottom_right,lirislab/push_cup_target,lirislab/put_banana_bowl,Chojins/chess_game_001_blue_stereo,Chojins/chess_game_001_red_stereo,ganker5/so100_toy_20250402,Gano007/so100_medic,00ri/so100_battery_bin_center,paszea/so100_whale_2,lirislab/fold_bottom_right,lirislab/put_coffee_cap_teabox,therarelab/so100_pick_place_2,paszea/so100_whale_3,paszea/so100_whale_4,paszea/so100_lego,LemonadeDai/so100_coca,zijian2022/backgrounda,zijian2022/backgroundb,356c/so100_nut_sort_1,Mwuqiu/so100_0408_muti,aimihat/so100_tape,lirislab/so100_demo,356c/so100_duck_reposition_1,zijian2022/sort1,weiye11/so100_410_zwy,VoicAndrei/so100_banana_to_plate_only,sixpigs1/so100_stack_cube_error,isadev/bougies3,zijian2022/close3,bensprenger/left_arm_yellow_brick_in_box_v0,lirislab/guess_who_so100,bensprenger/left_arm_yellow_brick_in_box_with_purple_noise_v0,roboticshack/team16-can-stacking,zijian2022/insert2,roboticshack/team-7-right-arm-grasp-tape,Jiangeng/so100_413,roboticshack/team9-pick_cube_place_static_plate,AndrejOrsula/lerobot_double_ball_stacking_random,roboticshack/left-arm-grasp-lego-brick,roboticshack/team-7-left-arm-grasp-motor,roboticshack/team9-pick_chicken_place_plate,roboticshack/team13-two-balls-stacking,tkc79/so100_lego_box_1,roboticshack/team13-three-balls-stacking,pierfabre/chicken
,roboticshack/team16-water-pouring,ad330/cubePlace,Jiafei1224/so100_pa222per,paszea/so100_lego_2cam,bensprenger/chess_game_001_blue_stereo,Mohamedal/put_banana,tkc79/so100_lego_box_2,samanthalhy/so100_herding_1,jlesein/TestBoulon7,pranavsaroha/so100_onelego2,pranavsaroha/so100_onelego3,pranavsaroha/so100_carrot_2,vladfatu/so100_above,koenvanwijk/orange50-1,CSCSXX/pick_place_cube_1.18,dragon-95/so100_sorting,dragon-95/so100_sorting_1,nbaron99/so100_pick_and_place4,Beegbrain/pick_place_green_block,dragon-95/so100_sorting_3,HITHY/so100_peach3,shreyasgite/so100_legocube_50,triton7777/so100_dataset_mix,NONHUMAN-RESEARCH/SOARM100_TASK_VENDA,mikechambers/block_cup_14,samsam0510/tooth_extraction_3,samsam0510/tooth_extraction_4,samsam0510/cube_reorientation_2,samsam0510/cube_reorientation_4,samsam0510/glove_reorientation_1,vladfatu/so100_office,pranavsaroha/so100_legos4,Ityl/so100_recording2,FeiYjf/new_GtoR,dragon-95/so100_sorting_2,HITHY/so100_peach4,jpata/so100_pick_place_tangerine,HITHY/so100_strawberry,shreyasgite/so100_base_env,koenvanwijk/orange50-variation-2,pranavsaroha/so100_carrot_5,pandaRQ/pick_med_1,aractingi/push_cube_offline_data,DorayakiLin/so100_pick_charger_on_tissue,zijian2022/noticehuman3,liuhuanjim013/so100_th",
    "episodes": null,
    "image_transforms": {
      "enable": true,
      "max_num_transforms": 10,
      "random_order": false,
      "transform_version": 0,
      "image_size": 256,
      "tfs": {
        "resize_with_pad": {
          "weight": 1.0,
          "type": "ResizeWithPad",
          "kwargs": {
            "size": [
              256,
              256
            ]
          }
        }
      }
    },
    "local_files_only": true,
    "use_imagenet_stats": false,
    "video_backend": "pyav",
    "sampling_weights": "",
    "max_action_dim": 6,
    "max_state_dim": 6,
    "max_num_images": 3,
    "max_image_dim": 256,
    "train_on_all_features": true,
    "features_version": 2,
    "discard_first_n_frames": 0,
    "min_fps": 30,
    "max_fps": 30,
    "discard_first_idle_frames": false,
    "motion_threshold": 0.05,
    "motion_window_size": 10,
    "motion_buffer": 3
  },
  "env": null,
  "policy": {
    "type": "smolvla",
    "n_obs_steps": 1,
    "normalization_mapping": {
      "VISUAL": "IDENTITY",
      "STATE": "MEAN_STD",
      "ACTION": "MEAN_STD"
    },
    "input_features": {
      "observation.state": {
        "type": "STATE",
        "shape": [
          6
        ]
      },
      "observation.image2": {
        "type": "VISUAL",
        "shape": [
          3,
          256,
          256
        ]
      },
      "observation.image": {
        "type": "VISUAL",
        "shape": [
          3,
          256,
          256
        ]
      },
      "observation.image3": {
        "type": "VISUAL",
        "shape": [
          3,
          256,
          256
        ]
      }
    },
    "output_features": {
      "action": {
        "type": "ACTION",
        "shape": [
          6
        ]
      }
    },
    "chunk_size": 50,
    "n_action_steps": 1,
    "max_state_dim": 32,
    "max_action_dim": 32,
    "resize_imgs_with_padding": [
      512,
      512
    ],
    "empty_cameras": 0,
    "adapt_to_pi_aloha": false,
    "use_delta_joint_actions_aloha": false,
    "tokenizer_max_length": 48,
    "num_steps": 10,
    "use_cache": true,
    "freeze_vision_encoder": true,
    "train_expert_only": true,
    "train_state_proj": true,
    "optimizer_lr": 0.0001,
    "optimizer_betas": [
      0.9,
      0.95
    ],
    "optimizer_eps": 1e-08,
    "optimizer_weight_decay": 1e-10,
    "optimizer_grad_clip_norm": 10,
    "scheduler_warmup_steps": 1000,
    "scheduler_decay_steps": 30000,
    "scheduler_decay_lr": 2.5e-06,
    "vlm_model_name": "HuggingFaceTB/SmolVLM2-500M-Video-Instruct",
    "load_vlm_weights": true,
    "attention_mode": "cross_attn",
    "prefix_length": 0,
    "past_obs_keys": "image",
    "pad_language_to": "max_length",
    "num_expert_layers": 0,
    "num_vlm_layers": 16,
    "causal_action_attention_mask": true,
    "self_attn_every_n_layers": 2,
    "expert_width_multiplier": 0.75
  },
  "output_dir": "/lustre/fswork/projects/rech/dyf/ugz83ue/logs/lerobot/lerobot_so100_community_v1_v2_v3clean2_smolpi0_lr1e-4bs64steps400000gpus4freeze32_imgtoktrue_cross_attn_gap1_vlml16_causalacttrue_sa2_smolvlm2500_nobs1_expw0.75_feat2_lrvlm1e-4_trans0true_decaylr2.5e-630000_camfalse_fps3030_idlefalse",
  "job_name": "smolvla",
  "resume": false,
  "overwrite": false,
  "device": "cuda",
  "use_amp": true,
  "seed": 1000,
  "num_workers": 4,
  "batch_size": 64,
  "eval_freq": 5000,
  "log_freq": 200,
  "save_checkpoint": true,
  "save_freq": 20000,
  "offline": {
    "steps": 400000
  },
  "online": {
    "steps": 0,
    "rollout_n_episodes": 1,
    "rollout_batch_size": 1,
    "steps_between_rollouts": null,
    "sampling_ratio": 0.5,
    "env_seed": null,
    "buffer_capacity": null,
    "buffer_seed_size": 0,
    "do_rollout_async": false
  },
  "use_policy_training_preset": true,
  "optimizer": {
    "type": "adamw",
    "lr": 0.0001,
    "weight_decay": 1e-10,
    "grad_clip_norm": 10,
    "betas": [
      0.9,
      0.95
    ],
    "eps": 1e-08
  },
  "scheduler": {
    "type": "cosine_decay_with_warmup",
    "num_warmup_steps": 1000,
    "num_decay_steps": 30000,
    "peak_lr": 0.0001,
    "decay_lr": 2.5e-06
  },
  "eval": {
    "n_episodes": 50,
    "batch_size": 50,
    "use_async_envs": false
  },
  "wandb": {
    "enable": false,
    "disable_artifact": false,
    "project": "lerobot",
    "entity": null,
    "notes": null
  },
  "nccl_timeout": 9000,
  "gradient_accumulation_steps": 1,
  "torch_compile": true,
  "save_on_eval": false,
  "dataloader_drop_last": true,
  "eval_mse": true,
  "eval_mse_steps": 1000
}
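For reference, the scheduler block names `cosine_decay_with_warmup` with `num_warmup_steps=1000`, `num_decay_steps=30000`, `peak_lr=1e-4`, and `decay_lr=2.5e-6`. A sketch of such a schedule (LeRobot's exact implementation may differ in detail):

```python
# Sketch: linear warmup to peak_lr, then cosine decay toward decay_lr.
import math

warmup_steps, decay_steps = 1000, 30000
peak_lr, decay_lr = 1e-4, 2.5e-6

def lr_at(step: int) -> float:
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    t = min((step - warmup_steps) / decay_steps, 1.0)
    return decay_lr + 0.5 * (peak_lr - decay_lr) * (1.0 + math.cos(math.pi * t))

print(lr_at(500), lr_at(1000), lr_at(31000))
```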