
RICL: Re-training (a VLA) for In-Context Learning

A RICL version of the openpi repository focused on RICL-Pi0-FAST-DROID.

Website | arXiv

Installation

GIT_LFS_SKIP_SMUDGE=1 uv sync
source .venv/bin/activate
uv pip install tensorflow-datasets tensorflow-cpu autofaiss google-genai openai

Quickstart

Collect retrieval data for a new task as detailed under Collecting data > retrieval data for testing the VLA in a new task.

Then download our checkpoint and serve it with the above retrieval data, as detailed under Serving RICL-Pi0-FAST-DROID in a new task > Serve the downloaded checkpoint.

Collecting data

Collect demos on your Franka DROID robot. The demos must be arranged in the following directory structures for priming (at training time) and retrieval (at testing time). Please use the original DROID code to collect the demos.

Priming data for re-training a VLA for in-context learning

# The following example structure is expected for the collected demos:
# preprocessing/
# ├── collected_demos_training/
# │   ├── {YYYY-MM-DD}_{task_1_prompt}/
# │   │   ├── demo_0
# │   │   ├── demo_1
# │   │   ├── ...
# │   ├── {YYYY-MM-DD}_{task_2_prompt}/
# │   │   ├── demo_0
# │   │   ├── demo_1
# │   │   ├── ...

Please note that the folder containing the demos for a task must follow the above name format. Ensure that the task prompt uses underscores to separate words. The folders inside each task folder can be named anything, but each such folder, which holds all the information for one collected demo, must contain at least the following:

# │   │   ├── demo_0
# │   │   │   ├── traj.h5
# │   │   │   ├── recordings/
# │   │   │   │   ├── frames/
# │   │   │   │   │   ├── hand_camera/
# │   │   │   │   │   │   ├── 000.jpg
# │   │   │   │   │   │   ├── 001.jpg
# │   │   │   │   │   │   ├── ...
# │   │   │   │   │   ├── varied_camera_1/
# │   │   │   │   │   │   ├── 000.jpg
# │   │   │   │   │   │   ├── 001.jpg
# │   │   │   │   │   │   ├── ...
# │   │   │   │   │   ├── varied_camera_2/
# │   │   │   │   │   │   ├── 000.jpg
# │   │   │   │   │   │   ├── 001.jpg
# │   │   │   │   │   │   ├── ...

where the number of .jpg files in each camera folder equals the number of timesteps during which the controller is active. You need to extract these frames from the SVO files created by the DROID code. The traj.h5 file is the one saved by the original DROID code, containing the proprioceptive data and actions. These are the only files we use; everything else in the demo folder is ignored.
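Since only traj.h5 and the extracted frames are used, a quick sanity check of each demo folder can catch extraction mistakes early. The following is a minimal sketch (`validate_demo` and `REQUIRED_CAMERAS` are hypothetical helpers, not part of this repo) that flags missing files and mismatched frame counts:

```python
import os

# Camera folders assumed by the directory layout described above.
REQUIRED_CAMERAS = ("hand_camera", "varied_camera_1", "varied_camera_2")

def validate_demo(demo_dir):
    """Return a list of problems found in one demo folder (empty list = OK)."""
    problems = []
    if not os.path.isfile(os.path.join(demo_dir, "traj.h5")):
        problems.append("missing traj.h5")
    frames_dir = os.path.join(demo_dir, "recordings", "frames")
    counts = {}
    for cam in REQUIRED_CAMERAS:
        cam_dir = os.path.join(frames_dir, cam)
        if not os.path.isdir(cam_dir):
            problems.append(f"missing camera folder: {cam}")
            continue
        counts[cam] = len([f for f in os.listdir(cam_dir) if f.endswith(".jpg")])
    # Every camera should have one frame per active-controller timestep.
    if len(set(counts.values())) > 1:
        problems.append(f"unequal frame counts: {counts}")
    return problems
```

Running this over every demo folder before preprocessing avoids debugging failures mid-pipeline.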

Retrieval data for testing the VLA in a new task

Similarly, for retrieval data in a new task at test time, we expect the following structure:

# preprocessing/
# ├── collected_demos/
# │   ├── {YYYY-MM-DD}_{new_task_prompt}/
# │   │   ├── demo_0
# │   │   ├── demo_1
# │   │   ├── ...
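To avoid naming mistakes, the expected task folder name can be built programmatically. A minimal sketch, assuming only the naming convention described above (`task_folder_name` is a hypothetical helper, not part of this repo):

```python
import datetime

def task_folder_name(prompt, date=None):
    """Build the expected '{YYYY-MM-DD}_{task_prompt}' folder name,
    replacing spaces in the task prompt with underscores."""
    date = date or datetime.date.today().isoformat()
    return f"{date}_{prompt.strip().replace(' ', '_')}"
```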

Downloading our datasets

Priming (training) data: git clone https://huggingface.co/datasets/ricl-vla/collected_demos_training ./preprocessing/collected_demos_training

Retrieval (testing) data in many new tasks: git clone https://huggingface.co/datasets/ricl-vla/collected_demos ./preprocessing/collected_demos

Both of the above can also be found at this Hugging Face link.

Preprocessing [SKIP this step if you downloaded the above datasets from HF]

  • First cd into the folder
cd preprocessing
  • Process the priming demos for re-training a VLA for in-context learning
python process_collected_demos.py --dir_of_dirs=collected_demos_training
python retrieve_within_collected_demo_groups.py
  • Process the retrieval demos for testing the VLA in a new task
python process_collected_demos.py --dir_of_dirs=collected_demos
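The retrieval step pairs each query with its nearest neighbors among the demo timesteps, presumably via the autofaiss dependency installed above. As a conceptual illustration only, not the repo's actual retrieval code, top-k retrieval by cosine similarity can be sketched in plain Python:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query, index, k=2):
    """Indices of the k index vectors most similar to the query."""
    order = sorted(range(len(index)), key=lambda i: cosine(query, index[i]), reverse=True)
    return order[:k]
```

In practice the embeddings come from the demo observations and a faiss index replaces the brute-force sort, but the retrieved neighbors play the same role: they become the in-context examples fed to the policy.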

Re-training for in-context learning (RICL)

  • Compute norm stats after processing the priming demos as follows [SKIP this step if you downloaded the above datasets from HF]:
python scripts/setup_norm_states_for_ricl.py
  • Create RICL-Pi0-FAST-DROID
python scripts/train_pi0_fast_ricl.py pi0_fast_droid_ricl --exp-name={YOUR_EXPERIMENT_NAME_HERE} --overwrite

Serving RICL-Pi0-FAST-DROID in a new task

Serve RICL-Pi0-FAST-DROID's checkpoint after three epochs of training

For our dataset, three epochs of training correspond to step 5400.

uv run scripts/serve_policy_ricl.py policy:checkpoint --policy.config=pi0_fast_droid_ricl --policy.dir=checkpoints/pi0_fast_droid_ricl/{YOUR_EXPERIMENT_NAME_HERE}/5400 --policy.demos_dir=preprocessing/collected_demos/{YYYY-MM-DD}_{new_task_prompt}

Serve the downloaded checkpoint

Download our checkpoint: git clone https://huggingface.co/ricl-vla/pi0_fast_droid_ricl_checkpoint

You can serve it as follows:

uv run scripts/serve_policy_ricl.py policy:checkpoint --policy.config=pi0_fast_droid_ricl --policy.dir=pi0_fast_droid_ricl_checkpoint --policy.demos_dir=preprocessing/collected_demos/{YYYY-MM-DD}_{new_task_prompt}

Our checkpoint can also be found at this Hugging Face link.

On the laptop connected to the Franka DROID robot

You also have to run the client in examples/droid/main_ricl.py on the laptop connected to the Franka DROID robot, following the original repository's instructions.
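The client-side control loop queries the policy server over the network. The sketch below is illustrative only: the observation keys and client API are assumptions, and the authoritative format is defined in examples/droid/main_ricl.py and the openpi client code.

```python
def make_observation(wrist_image, exterior_image, joint_position, gripper_position, prompt):
    """Pack one control-loop step into a flat observation dict.
    Key names are illustrative assumptions, not the repo's actual schema."""
    return {
        "observation/wrist_image_left": wrist_image,
        "observation/exterior_image_1_left": exterior_image,
        "observation/joint_position": joint_position,
        "observation/gripper_position": gripper_position,
        "prompt": prompt,
    }

def run_client(host="localhost", port=8000):
    # Assumes the openpi_client package from the openpi repo is installed and
    # that scripts/serve_policy_ricl.py is serving on host:port.
    from openpi_client import websocket_client_policy
    policy = websocket_client_policy.WebsocketClientPolicy(host=host, port=port)
    obs = make_observation(..., ..., ..., ..., "put the cup on the saucer")
    return policy.infer(obs)
```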

Finetuning like you pretrain to create RICL-Pi0-FAST-DROID-Finetuned

  • This requires a bit more preprocessing of the retrieval demos in the new task
python retrieve_within_collected_demo_groups.py --folder_name=collected_demos
  • Create RICL-Pi0-FAST-DROID-Finetuned
python scripts/train_pi0_fast_ricl.py pi0_fast_droid_ricl___finetune_on_new_task --exp-name={YOUR_EXPERIMENT_NAME_HERE} --overwrite

Please make sure the config of pi0_fast_droid_ricl___finetune_on_new_task in src/training/config.py points to the correct retrieval demos folder that you would like to finetune on.

  • Serve RICL-Pi0-FAST-DROID-Finetuned
uv run scripts/serve_policy_ricl.py policy:checkpoint --policy.config=pi0_fast_droid_ricl___finetune_on_new_task --policy.dir=checkpoints/pi0_fast_droid_ricl___finetune_on_new_task/{YOUR_EXPERIMENT_NAME_HERE}/999 --policy.demos_dir=preprocessing/collected_demos/{YYYY-MM-DD}_{new_task_prompt}

Run the client on the laptop connected to the Franka DROID robot to test the finetuned policy.

Credits

This repository is based on the openpi repository, without which this work would not have been possible.
