π₀ (Pi0)
π₀ is a Vision-Language-Action model for general robot control, from Physical Intelligence. The LeRobot implementation is adapted from their open source OpenPI repository.
Model Overview
π₀ is the first general-purpose robot foundation model developed by Physical Intelligence. Unlike traditional robot programs, which are narrow specialists built for repetitive motions, π₀ is designed as a generalist policy that can understand visual inputs, interpret natural language instructions, and control a variety of robots across diverse tasks.
The Vision for Physical Intelligence
As described by Physical Intelligence, while AI has achieved remarkable success in digital domains, from chess-playing to drug discovery, human intelligence still dramatically outpaces AI in the physical world. To paraphrase Moravec’s paradox, winning a game of chess represents an “easy” problem for AI, but folding a shirt or cleaning up a table requires solving some of the most difficult engineering problems ever conceived. π₀ represents a first step toward developing artificial physical intelligence that enables users to simply ask robots to perform any task they want, just like they can with large language models.
Architecture and Approach
π₀ combines several key innovations:
- Flow Matching: Uses a novel method to augment pre-trained VLMs with continuous action outputs via flow matching (a variant of diffusion models)
- Cross-Embodiment Training: Trained on data from 8 distinct robot platforms including UR5e, Bimanual UR5e, Franka, Bimanual Trossen, Bimanual ARX, Mobile Trossen, and Mobile Fibocom
- Internet-Scale Pre-training: Inherits semantic knowledge from a pre-trained 3B parameter Vision-Language Model
- High-Frequency Control: Outputs motor commands at up to 50 Hz for real-time dexterous manipulation
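To make the flow matching idea above concrete, here is a minimal sketch of how sampling can work: start from random noise and integrate a velocity field with fixed Euler steps until an action is reached. This is an illustration under stated assumptions, not the LeRobot or OpenPI implementation. The names `euler_sample`, `toy_velocity`, and `TARGET` are hypothetical, the time convention (0 for noise, 1 for the action) is an assumption, and the toy velocity function stands in for the learned network, which in the real model would also be conditioned on images, the language instruction, and the robot state.

```python
import random


def euler_sample(velocity_fn, noise, num_steps=10):
    """Integrate da/dt = v(a, t) from t=0 (noise) to t=1 with fixed Euler steps."""
    action = list(noise)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        velocity = velocity_fn(action, t)
        # One Euler step along the predicted velocity.
        action = [a + dt * v for a, v in zip(action, velocity)]
    return action


# Hypothetical stand-in for the learned network: the exact velocity field of a
# straight-line path from the noise sample to one fixed target action.
TARGET = [0.5, -0.25, 1.0]


def toy_velocity(action, t):
    # On the path a_t = (1 - t) * noise + t * target, the velocity is
    # (target - a_t) / (1 - t), which is constant along the path.
    return [(target - a) / (1.0 - t) for target, a in zip(TARGET, action)]


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in TARGET]
sampled = euler_sample(toy_velocity, noise)
print(sampled)  # lands on TARGET up to floating-point error
```

Because the output is a continuous vector rather than a sequence of discrete tokens, this style of sampling suits smooth, high-rate motor commands.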
Installation Requirements
Install LeRobot by following our Installation Guide.
Install Pi0 dependencies by running:
pip install -e ".[pi]"
Training Data and Capabilities
According to Physical Intelligence, π₀ was trained on the largest robot interaction dataset available at the time of its release, combining three key data sources:
- Internet-Scale Pre-training: Vision-language data from the web for semantic understanding
- Open X-Embodiment Dataset: Open-source robot manipulation datasets
- Physical Intelligence Dataset: Large and diverse dataset of dexterous tasks across 8 distinct robots
Usage
To use π₀ in LeRobot, specify the policy type as:
policy.type=pi0
Training
For training π₀, you can use the standard LeRobot training script with the appropriate configuration:
lerobot-train \
--dataset.repo_id=your_dataset \
--policy.type=pi0 \
--output_dir=./outputs/pi0_training \
--job_name=pi0_training \
--policy.pretrained_path=lerobot/pi0_base \
--policy.repo_id=your_repo_id \
--policy.compile_model=true \
--policy.gradient_checkpointing=true \
--policy.dtype=bfloat16 \
--policy.freeze_vision_encoder=false \
--policy.train_expert_only=false \
--steps=3000 \
--policy.device=cuda \
--batch_size=32
Key Training Parameters
- --policy.compile_model=true: Enables model compilation for faster training
- --policy.gradient_checkpointing=true: Reduces memory usage significantly during training
- --policy.dtype=bfloat16: Use mixed precision training for efficiency
- --batch_size=32: Batch size for training, adapt this based on your GPU memory
- --policy.pretrained_path=lerobot/pi0_base: The base π₀ model you want to finetune, options are:
  - lerobot/pi0_base
  - lerobot/pi0_libero (specifically trained on the Libero dataset)
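If you launch training from a script rather than a shell, the same flags can be assembled in Python. The sketch below is one possible way to do this and is not part of LeRobot: `build_train_command` is a hypothetical helper, and it only uses flags that appear in the command above. Booleans are rendered as lowercase `true`/`false` to match the CLI examples.

```python
import shlex


def build_train_command(options):
    """Turn a dict of options into an argument list for lerobot-train."""
    args = ["lerobot-train"]
    for key, value in options.items():
        if isinstance(value, bool):
            # The CLI examples use lowercase true/false.
            value = "true" if value else "false"
        args.append(f"--{key}={value}")
    return args


options = {
    "dataset.repo_id": "your_dataset",
    "policy.type": "pi0",
    "policy.pretrained_path": "lerobot/pi0_base",
    "policy.compile_model": True,
    "policy.gradient_checkpointing": True,
    "policy.dtype": "bfloat16",
    "steps": 3000,
    "batch_size": 32,
}

command = build_train_command(options)
print(shlex.join(command))
# To actually launch training (requires LeRobot to be installed):
# import subprocess; subprocess.run(command, check=True)
```

Keeping the options in a dict makes it easy to vary a single value, such as the batch size, between runs.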
Training Parameters Explained
| Parameter | Default | Description |
|---|---|---|
| freeze_vision_encoder | false | Do not freeze the vision encoder |
| train_expert_only | false | Do not freeze the VLM, train all parameters |
💡 Tip: Setting train_expert_only=true freezes the VLM and trains only the action expert and projections, allowing finetuning with reduced memory usage.
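The interaction between the two flags can be summarized in a few lines of Python. This is a sketch of the behaviour described above, not the LeRobot implementation: the function `trainable_components` and the four component names are hypothetical, and it assumes the VLM consists of a vision encoder plus a language model, so that freezing the VLM freezes both.

```python
def trainable_components(freeze_vision_encoder=False, train_expert_only=False):
    """Return which (hypothetical) model components receive gradient updates."""
    trainable = {
        "vision_encoder": True,
        "language_model": True,
        "action_expert": True,
        "projections": True,
    }
    if train_expert_only:
        # Freeze the whole VLM; only the action expert and projections train.
        trainable["vision_encoder"] = False
        trainable["language_model"] = False
    if freeze_vision_encoder:
        trainable["vision_encoder"] = False
    return trainable


print(trainable_components())                        # defaults: everything trains
print(trainable_components(train_expert_only=True))  # VLM frozen
```

Frozen components need no gradients or optimizer state, which is where the memory savings mentioned in the tip come from.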
Relative Actions
By default, π₀ predicts absolute actions. You can enable relative actions so the model predicts offsets relative to the current robot state. This can improve training stability for certain setups.
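As an illustration of what "offsets relative to the current robot state" can mean, the sketch below subtracts the current state from every action in a chunk while leaving excluded joints in absolute space. It is a simplified example, not the LeRobot code: `to_relative`, `to_absolute`, and the joint names are hypothetical, and matching excluded joints by exact name is an assumption.

```python
def to_relative(action_chunk, state, joint_names, exclude=("gripper",)):
    """Express each action as an offset from the current state, except excluded joints."""
    return [
        [a if name in exclude else a - s
         for a, s, name in zip(action, state, joint_names)]
        for action in action_chunk
    ]


def to_absolute(relative_chunk, state, joint_names, exclude=("gripper",)):
    """Invert to_relative: add the current state back to non-excluded joints."""
    return [
        [r if name in exclude else r + s
         for r, s, name in zip(action, state, joint_names)]
        for action in relative_chunk
    ]


joint_names = ["shoulder", "elbow", "gripper"]  # hypothetical joint names
state = [0.5, 1.0, 0.0]                         # current robot state
chunk = [[0.75, 1.0, 1.0], [1.0, 1.5, 1.0]]     # two absolute actions

relative = to_relative(chunk, state, joint_names)
print(relative)
print(to_absolute(relative, state, joint_names) == chunk)
```

Relative values are distributed differently from absolute ones, so normalization statistics computed on absolute actions no longer apply once relative actions are enabled.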
To use relative actions, first recompute your dataset stats in relative space via the CLI:
lerobot-edit-dataset \
--repo_id your_dataset \
--operation.type recompute_stats \
--operation.relative_action true \
--operation.chunk_size 50 \
--operation.relative_exclude_joints "['gripper']" \
--push_to_hub true
Or equivalently in Python:
from lerobot.datasets.lerobot_dataset import LeRobotDataset
from lerobot.datasets.dataset_tools import recompute_stats
dataset = LeRobotDataset("your_dataset")
recompute_stats(dataset, relative_action=True, chunk_size=50, relative_exclude_joints=["gripper"])
dataset.push_to_hub()
The chunk_size should match your policy’s chunk_size (default 50 for π₀). relative_exclude_joints lists joint names that should remain in absolute space (e.g. gripper commands). Use --push_to_hub true to upload the updated stats to the Hub.
Then train with relative actions enabled:
lerobot-train \
--dataset.repo_id=your_dataset \
--policy.type=pi0 \
--policy.use_relative_actions=true \
--policy.relative_exclude_joints='["gripper"]' \
...
License
This model follows the Apache 2.0 License, consistent with the original OpenPI repository.