Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR • Paper • 2509.02522 • Published Sep 2