On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7, 2025 • 177
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains Paper • 2407.18961 • Published Jul 18, 2024 • 40
In-Context Editing: Learning Knowledge from Self-Induced Distributions Paper • 2406.11194 • Published Jun 17, 2024 • 18
Panacea: Pareto Alignment via Preference Adaptation for LLMs Paper • 2402.02030 • Published Feb 3, 2024 • 10
CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents Paper • 2401.10568 • Published Jan 19, 2024 • 15