# Benchmark Runner

This script benchmarks **forward/backward performance** of several operations (`rms`, `add_rms`, `poly`, `mul_poly`).
Results can be saved as **CSV files** or **plots**.

> **Note**
> To run the benchmarks, select the build in the `build` directory that matches your Torch version and CUDA/ROCm toolkit, and add it to `PYTHONPATH`.
>
> **Example:**
>
> ```bash
> export PYTHONPATH=$PYTHONPATH:/activation/build/torch27-cxx11-cu128-x86_64-linux
> ```

## Usage

```bash
python main.py --case <case> [--plot] [--save-path <dir>]
```

- `--case` (required): one of `rms`, `add_rms`, `poly`, `mul_poly`
- `--plot`: save plots instead of CSVs
- `--save-path`: output directory (default: `./configs/`)

## Examples

```bash
python main.py --case add_rms --save-path ./results/
python main.py --case poly --plot --save-path ./plots/
```

## Output

- CSV: `<case>-fwd-perf.csv`, `<case>-bwd-perf.csv`
- Plots: `plot_<case>-fwd-perf.png`, `plot_<case>-bwd-perf.png`
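
## CLI sketch

The flags and output names above can be illustrated with a small `argparse` sketch. This is not the actual contents of `main.py`: `build_parser` and `output_files` are hypothetical helpers, and the assumption that each output file name is prefixed with the case name is inferred from the naming pattern rather than confirmed by the script.

```python
import argparse
from pathlib import Path

# Cases listed in the Usage section.
CASES = ("rms", "add_rms", "poly", "mul_poly")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--case", required=True, choices=CASES,
                        help="operation to benchmark")
    parser.add_argument("--plot", action="store_true",
                        help="save plots instead of CSVs")
    parser.add_argument("--save-path", default="./configs/",
                        help="output directory")
    return parser


def output_files(case: str, plot: bool, save_path: str) -> list[Path]:
    """Expected output paths, assuming the case name prefixes each file."""
    prefix, ext = ("plot_", "png") if plot else ("", "csv")
    return [
        Path(save_path) / f"{prefix}{case}-{direction}-perf.{ext}"
        for direction in ("fwd", "bwd")
    ]
```

Under these assumptions, `output_files("poly", True, "./plots/")` would list one forward and one backward PNG path inside `./plots/`, matching the second command in the Examples section.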