EurekAgent: Agent Environment Engineering is All You Need For Autonomous Scientific Discovery Paper • 2606.13662 • Published 8 days ago • 27
Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning Paper • 2606.04923 • Published 16 days ago • 39
LongTraceRL: Learning Long-Context Reasoning from Search Agent Trajectories with Rubric Rewards Paper • 2605.31584 • Published 21 days ago • 41
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse Paper • 2603.12201 • Published Mar 12 • 60
WildReward Collection Learning Reward Models from In-the-Wild Interactions • 4 items • Updated Mar 2 • 2
WildReward: Learning Reward Models from In-the-Wild Human Interactions Paper • 2602.08829 • Published Feb 9 • 3