Benchmark Runner
This script benchmarks forward/backward performance of several operations (rms, add_rms, poly, mul_poly).
Results can be saved as CSV files or plots.
Note
To run the benchmarks, pick the build in the build directory that matches your Torch version and CUDA/ROCm toolkit, and add it to PYTHONPATH. Example:
export PYTHONPATH=$PYTHONPATH:<YOUR_PATH>/activation/build/torch27-cxx11-cu128-x86_64-linux
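If exporting PYTHONPATH is inconvenient (for example in a notebook), the same effect can be approximated from inside Python. The snippet below is only a sketch: the helper name `add_build_to_path` is hypothetical and not part of this repository, and the directory you pass in is whichever build matches your Torch and CUDA/ROCm setup.

```python
import sys
from pathlib import Path


def add_build_to_path(build_dir: str) -> str:
    """Prepend a build directory to sys.path (hypothetical helper).

    This mirrors `export PYTHONPATH=...` for the current process only.
    """
    resolved = str(Path(build_dir).expanduser().resolve())
    if resolved not in sys.path:
        # Insert at the front so this build wins over other installed copies.
        sys.path.insert(0, resolved)
    return resolved
```

Call it before importing anything from the build, passing the same path you would otherwise append to PYTHONPATH.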
Usage
python main.py --case <CASE> [--plot] [--save-path <DIR>]
- --case (required): one of rms, add_rms, poly, mul_poly
- --plot: save plots instead of CSVs
- --save-path: output directory (default: ./configs/)
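For reference, a command-line interface with these options could be declared with argparse roughly as follows. This is a minimal sketch based only on the flags documented here; the actual main.py may define them differently, and `build_parser` and `CASES` are illustrative names.

```python
import argparse

# Benchmark cases listed in this README.
CASES = ("rms", "add_rms", "poly", "mul_poly")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(description="Benchmark runner (sketch)")
    parser.add_argument("--case", required=True, choices=CASES,
                        help="operation to benchmark")
    parser.add_argument("--plot", action="store_true",
                        help="save plots instead of CSVs")
    parser.add_argument("--save-path", default="./configs/",
                        help="output directory")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--case", "add_rms", "--save-path", "./results/"])
print(args.case, args.plot, args.save_path)
```

Using `choices` makes argparse reject an unknown case name with a usage error instead of failing later in the benchmark.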
Examples
python main.py --case add_rms --save-path ./results/
python main.py --case poly --plot --save-path ./plots/
Output
- CSV: <case>-fwd-perf.csv, <case>-bwd-perf.csv
- Plots: plot_<case>-fwd-perf.png, plot_<case>-bwd-perf.png
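The naming scheme above can be expressed as a small helper, which may be handy when a downstream script needs to locate the results. This is a sketch under the assumption that files are written directly into the save path with exactly these names; `expected_outputs` is a hypothetical function, not part of the benchmark script.

```python
from pathlib import Path
from typing import List


def expected_outputs(case: str, save_path: str = "./configs/",
                     plot: bool = False) -> List[Path]:
    """Return the forward and backward output paths for one benchmark case."""
    out_dir = Path(save_path)
    paths = []
    for direction in ("fwd", "bwd"):
        stem = f"{case}-{direction}-perf"
        # Plots get a "plot_" prefix and .png extension; CSVs use the bare stem.
        name = f"plot_{stem}.png" if plot else f"{stem}.csv"
        paths.append(out_dir / name)
    return paths


for path in expected_outputs("poly", "./plots/", plot=True):
    print(path)
```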