---
title: RlveGym Environment Server
emoji: 📡
colorFrom: purple
colorTo: blue
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags:
  - openenv
---

# RlveGym Environment

This package contains a collection of 400 verifiable environments from RLVE-Gym, introduced by the paper [*RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments*](https://arxiv.org/abs/2511.07317) (the original GitHub repository is [here](https://github.com/Zhiyuan-Zeng/RLVE)).

## Quick Start

The simplest way to use the RlveGym environment is through the `RlveGymEnv` class:

```python
from RLVE_Gym import RlveGymAction, RlveGymEnv

try:
    # Create environment from Docker image
    RLVE_Gymenv = RlveGymEnv.from_docker_image("RLVE_Gym-env:latest")
    # If you prefer not to build the Docker image locally, you can try:
    # RLVE_Gymenv = RlveGymEnv.from_docker_image("registry.hf.space/zhiyuanzeng-rlve-gym:latest")

    # Reset
    result = RLVE_Gymenv.reset()
    print(f"Problem Prompt: {result.observation.problem_input}")
    # Or: print(f"Problem Prompt (from the environment's state): {RLVE_Gymenv.state().problem_input}")

    # Send multiple outputs
    outputs = [
        "Wrong Format",
        r"0",  # Wrong Answer
        r"4753",  # Please replace "4753" with the correct Answer
    ]

    for output in outputs:
        result = RLVE_Gymenv.step(RlveGymAction(output=output))
        print(f"Sent: '{output}'")
        print(f"Result: `{result}`")
        print(f"`verifier_result`: `{result.observation.verifier_result}`")
        print(f"`reward`: `{result.reward}`")
        print("`accuracy`: `{}`".format(result.observation.verifier_result["accuracy"]))
        print("(so far) sum_accuracy/num_samples = {}/{}".format(RLVE_Gymenv.state().sum_accuracy, RLVE_Gymenv.state().num_samples))
        print("\n")

finally:
    # Always clean up
    RLVE_Gymenv.close()
```

That's it!
The `RlveGymEnv.from_docker_image()` method handles:
- Starting the Docker container
- Waiting for the server to be ready
- Connecting to the environment
- Container cleanup when you call `close()`

## Building the Docker Image

Before using the environment, you need to build the Docker image:

```bash
# From project root
docker build -t RLVE_Gym-env:latest -f server/Dockerfile .
```

## Deploying to Hugging Face Spaces

You can deploy your OpenEnv environment to Hugging Face Spaces using the `openenv push` command:

```bash
# From the environment directory (where openenv.yaml is located)
openenv push

# Or specify options
openenv push --namespace my-org --private
```

The `openenv push` command will:
1. Validate that the directory is an OpenEnv environment (checks for `openenv.yaml`)
2. Prepare a custom build for a Hugging Face Docker space (enables the web interface)
3. Upload to Hugging Face (after making sure you are logged in)

### Prerequisites

- Authenticate with Hugging Face: the command will prompt for login if you are not already authenticated

### Options

- `--directory`, `-d`: Directory containing the OpenEnv environment (defaults to the current directory)
- `--repo-id`, `-r`: Repository ID in the format 'username/repo-name' (defaults to 'username/env-name' from openenv.yaml)
- `--base-image`, `-b`: Base Docker image to use (overrides the Dockerfile's FROM)
- `--private`: Deploy the space as private (default: public)

### Examples

```bash
# Push to your personal namespace (defaults to username/env-name from openenv.yaml)
openenv push

# Push to a specific repository
openenv push --repo-id my-org/my-env

# Push with a custom base image
openenv push --base-image ghcr.io/meta-pytorch/openenv-base:latest

# Push as a private space
openenv push --private

# Combine options
openenv push --repo-id my-org/my-env --base-image custom-base:latest --private
```

After deployment, your space will be available at:
`https://huggingface.co/spaces/`

The deployed space includes:
- **Web Interface** at `/web` - Interactive UI for exploring the environment
- **API Documentation** at `/docs` - Full OpenAPI/Swagger interface
- **Health Check** at `/health` - Container health monitoring

## Environment Details

### Environment Initialization

Please check [here](server/RLVE_Gym_environment.py) for detailed usage:
- `environment_identifier` (str) - The environment's identifier. Check [here](server/Gym/environments/__init__.py) for detailed usage.
- `difficulty` (int) - The difficulty of generated problems.
- `answer_markers` (Tuple[str] of length 2) - The start and end markers the environment uses to extract the final answer from a model output.
- `initial_seed` (int) - The seed used to generate the first problem. Each call to `reset()` increments the seed by 1.

Right now, you can set these arguments by passing them through environment variables:

```python
RLVE_Gymenv = RlveGymEnv.from_docker_image(
    "RLVE_Gym-env:latest",
    env_vars = {
        "RLVEGYM_ENVIRONMENT_IDENTIFIER": "Sorting",
        "RLVEGYM_DIFFICULTY": "2",
        "RLVEGYM_ANSWER_MARKER_START": r"\boxed{",
        "RLVEGYM_ANSWER_MARKER_END": r"}",
        "RLVEGYM_INITIAL_SEED": "10",
    },
)
```

### Action

**RlveGymAction**: Contains a single field
- `output` (str) - The model's output to be verified.

### State

**RlveGymState**:
- `seed` (int) - The seed to use when running `reset()`.
- `problem_input` (Optional[str]) - The input of the problem; `None` means that problem generation has not been run or has failed.
- `num_samples` (int) and `sum_accuracy` (int) - Statistics of the `step(action)` results so far for the current problem (the number of outputs sent to the verifier and the number of correct ones).

### Observation

**RlveGymObservation**:
- `problem_input` (Optional[str]) - The input of the problem; `None` means that problem generation has not been run or has failed.
- `verifier_result` (Optional[dict]) - Contains `reward` as the raw reward, `accuracy` as the 0/1 correctness, and `format_score` as the 0/1 format correctness; `None` means that the verification has failed.
- `success` (bool) - Whether the operation succeeded.
- `message` (str) - The explanation of `success`.
- `reward` (Optional[float]) - Equal to `verifier_result["reward"]` when `verifier_result` is not `None`; otherwise `reward` is also `None`.

## Advanced Usage

### Connecting to an Existing Server

If you already have an RlveGymEnv server running, you can connect directly:

```python
from RLVE_Gym import RlveGymAction, RlveGymEnv

# Connect to existing server
RLVE_Gymenv = RlveGymEnv(base_url="")

# Use as normal
result = RLVE_Gymenv.reset()
result = RLVE_Gymenv.step(RlveGymAction(output="Hello!"))
```

Note: When connecting to an existing server, `RLVE_Gymenv.close()` will NOT stop the server.

## Development & Testing

### Direct Environment Testing

Test the environment logic directly without starting the HTTP server:

```bash
# From the project root
python3 server/RLVE_Gym_environment.py
```

This verifies that:
- Environment resets correctly
- Step executes actions properly
- State tracking works
- Rewards are calculated correctly

### Running Locally

Run the server locally for development:

```bash
uvicorn server.app:app --reload
```
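Once the server is running, a quick way to confirm it is reachable is to query the `/health` endpoint listed above for the deployed space. The sketch below is a minimal example under a few assumptions: that the locally run server exposes `/health` the same way the deployed space does, that it answers healthy requests with HTTP 200, and that it listens on uvicorn's default address `http://127.0.0.1:8000`. The helper names `health_url` and `is_healthy` are hypothetical and not part of `RLVE_Gym`.

```python
import urllib.error
import urllib.request


def health_url(base_url: str) -> str:
    """Join a server base URL with the /health path, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/health"


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200.

    Assumption: the server signals health with a 200 status code.
    """
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

For example, `is_healthy("http://127.0.0.1:8000")` should return `True` while the server is running (if the assumptions above hold) and `False` once it has been stopped.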