Model Details
Model Description
This repository hosts the Booster Soccer Controller Suite — a collection of reinforcement learning policies and controllers powering humanoid agents in the Booster Soccer Showdown.
It contains:
- Low-Level Controller (robot/):
A proprioceptive policy for the Lower T1 humanoid that converts high-level commands (forward, lateral, and yaw velocities) into joint angle targets.
- Competition Policies (model/):
High-level agents trained in SAI’s soccer environments that output those high-level commands for match-time play.
- Developed by: ArenaX Labs
- License: MIT
- Frameworks: PyTorch · MuJoCo · Stable-Baselines3
- Environments: Booster Gym / SAI Soccer tasks
Testing Instructions
- Clone the repo
git clone https://github.com/ArenaX-Labs/booster_soccer_showdown.git
cd booster_soccer_showdown
- Create & activate a Python 3.10+ environment
# any env manager is fine; here are a few options
# --- venv ---
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# --- conda ---
# conda create -n booster-ssl python=3.11 -y && conda activate booster-ssl
- Install dependencies
pip install -r requirements.txt
Teleoperation
Booster Soccer Showdown supports keyboard teleop out of the box.
python booster_control/teleoperate.py \
--env LowerT1GoaliePenaltyKick-v0
Default bindings (example):
- W/S: move forward/backward
- A/D: move left/right
- Q/E: rotate left/right
- L: reset commands
- P: reset environment
⚠️ Note for macOS and Windows users: Because different renderers are used on macOS and Windows, you may need to adjust the position and rotation sensitivity for smooth teleoperation. Run the following command with the sensitivity flags set explicitly:
python booster_control/teleoperate.py \
--env LowerT1GoaliePenaltyKick-v0 \
--pos_sensitivity 1.5 \
--rot_sensitivity 1.5
(Tune --pos_sensitivity and --rot_sensitivity as needed for your setup.)
Training
The training_scripts/ folder provides a minimal reinforcement learning pipeline for training agents in the Booster Soccer Showdown environments with Deep Deterministic Policy Gradient (DDPG). The training stack consists of three scripts:
1) ddpg.py
Defines the DDPG_FF model (see the sketch after this list), including:
- Actor and Critic neural networks with configurable hidden layers and activation functions.
- Target networks and soft-update mechanism for stability.
- Training step implementation (critic loss with MSE, actor loss with policy gradient).
- Utility functions for forward passes, action selection, and backpropagation.
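The exact architecture lives in ddpg.py; the snippet below is only a minimal sketch of the same ingredients: actor/critic MLPs, target copies with Polyak soft updates, an MSE critic loss, and a deterministic policy-gradient actor loss. All sizes, hyperparameters, and names (OBS_DIM, mlp, train_step, ...) are illustrative and not the repository's actual API.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes and hyperparameters, not the repo's actual values.
OBS_DIM, ACT_DIM, TAU, GAMMA = 32, 3, 0.005, 0.99

def mlp(in_dim, out_dim, hidden=(256, 256), act=nn.ReLU):
    # Small fully connected network with configurable hidden layers/activations.
    layers, last = [], in_dim
    for h in hidden:
        layers += [nn.Linear(last, h), act()]
        last = h
    layers.append(nn.Linear(last, out_dim))
    return nn.Sequential(*layers)

actor = mlp(OBS_DIM, ACT_DIM)          # state -> action (tanh-squashed below)
critic = mlp(OBS_DIM + ACT_DIM, 1)     # (state, action) -> Q-value
actor_tgt, critic_tgt = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def soft_update(net, tgt, tau=TAU):
    # Polyak averaging: target networks slowly track the online networks.
    with torch.no_grad():
        for p, p_tgt in zip(net.parameters(), tgt.parameters()):
            p_tgt.mul_(1.0 - tau).add_(tau * p)

def train_step(obs, act, rew, next_obs, done):
    # Expects float tensors with a batch dimension; rew/done shaped [B, 1].
    # Critic: regress Q(s, a) toward the bootstrapped TD target (MSE loss).
    with torch.no_grad():
        next_act = torch.tanh(actor_tgt(next_obs))
        target_q = rew + GAMMA * (1.0 - done) * critic_tgt(torch.cat([next_obs, next_act], dim=-1))
    critic_loss = F.mse_loss(critic(torch.cat([obs, act], dim=-1)), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: deterministic policy gradient, i.e. maximize Q(s, pi(s)).
    actor_loss = -critic(torch.cat([obs, torch.tanh(actor(obs))], dim=-1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    soft_update(actor, actor_tgt); soft_update(critic, critic_tgt)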
2) training.py
Provides the training loop and supporting components (a condensed sketch follows this list):
- ReplayBuffer for experience storage and sampling.
- Exploration noise injection to encourage policy exploration.
- An iterative training loop that:
  - Interacts with the environment.
  - Stores experiences.
  - Periodically samples minibatches to update the actor/critic networks.
  - Tracks and logs progress (episode rewards, critic/actor loss) with tqdm.
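Condensed, the loop amounts to something like the following. Here env, policy, and train_step are stand-ins passed in from elsewhere, the buffer size and noise scale are arbitrary, a Gymnasium-style env API is assumed, and the tqdm progress bar is replaced by a plain print; consult training.py for the real interfaces.

import random
import numpy as np

class ReplayBuffer:
    # Fixed-size store of (obs, act, rew, next_obs, done) with uniform sampling.
    def __init__(self, capacity=100_000):
        self.data, self.capacity, self.pos = [], capacity, 0

    def add(self, *transition):
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition   # overwrite the oldest entry once full
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size=256):
        batch = random.sample(self.data, batch_size)
        return map(np.asarray, zip(*batch))    # -> obs, act, rew, next_obs, done arrays

def run_training(env, policy, train_step, episodes=100, noise_std=0.1, warmup=1_000):
    buffer, step = ReplayBuffer(), 0
    for ep in range(episodes):
        obs, _ = env.reset()                   # assumes a Gymnasium-style reset()
        done, ep_reward = False, 0.0
        while not done:
            # Gaussian exploration noise keeps the deterministic policy exploring.
            act = policy(obs) + np.random.normal(0.0, noise_std, size=env.action_space.shape)
            act = np.clip(act, env.action_space.low, env.action_space.high)
            next_obs, rew, terminated, truncated, _ = env.step(act)
            done = terminated or truncated
            buffer.add(obs, act, rew, next_obs, done)
            obs, ep_reward, step = next_obs, ep_reward + rew, step + 1
            # After a warmup period, sample a minibatch and update the networks.
            if step > warmup:
                train_step(*buffer.sample())
        print(f"episode {ep}: reward {ep_reward:.1f}")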
3) main.py
Serves as the entry point for running training. It:
- Initializes the Booster Soccer Showdown environment via the SAI client.
- Defines a Preprocessor to normalize and concatenate robot state, ball state, and environment info into a training-ready observation vector (see the first sketch below).
- Instantiates a DDPG_FF model with a custom architecture.
- Defines an action function that rescales raw policy outputs to environment-specific action bounds (see the second sketch below).
- Calls the training loop and, after training, supports:
  - sai.watch(...) for visualizing learned behavior.
  - sai.benchmark(...) for local benchmarking.
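For intuition only, a preprocessor of this kind boils down to flattening, scaling, and concatenating its inputs. The argument names and the optional scale vector below are made up for the example; the real observation layout is defined in main.py.

import numpy as np

class Preprocessor:
    # Flattens robot state, ball state, and environment info into one normalized
    # observation vector. Argument names and scaling are illustrative placeholders.
    def __init__(self, scale=None):
        self.scale = scale                     # optional per-feature scale factors

    def __call__(self, robot_state, ball_state, env_info):
        obs = np.concatenate([
            np.asarray(robot_state, dtype=np.float32).ravel(),
            np.asarray(ball_state, dtype=np.float32).ravel(),
            np.asarray(env_info, dtype=np.float32).ravel(),
        ])
        if self.scale is not None:
            obs = obs / self.scale             # simple per-feature normalization
        return obs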
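The action function's job is similarly mechanical: an affine map from the policy's output range (assumed here to be [-1, 1], e.g. after a tanh head) onto the environment's action bounds. A generic version under that assumption:

import numpy as np

def rescale_action(raw_action, low, high):
    # Map a policy output in [-1, 1] onto [low, high]; low/high would come from
    # the environment's action space. The [-1, 1] input range is an assumption.
    raw_action = np.clip(np.asarray(raw_action), -1.0, 1.0)
    return low + 0.5 * (raw_action + 1.0) * (high - low)

# e.g. action = rescale_action(actor_output, env.action_space.low, env.action_space.high)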
Example: Run Training
python training_scripts/main.py
This will:
- Build the environment.
- Initialize the model.
- Run the training loop with replay buffer and DDPG updates.
- Launch visualization and benchmarking after training.
Example: Test a pretrained model
python training_scripts/test.py --env LowerT1KickToTarget-v0