license: mit
This is the model checkpoint for ARPO:
arXiv: https://arxiv.org/abs/2507.19849
Hugging Face paper: https://huggingface.co/papers/2507.19849
GitHub: https://github.com/dongguanting/ARPO