What Does It Take to Be a Good AI Research Agent? Studying the Role of Ideation Diversity Paper • 2511.15593 • Published Nov 19, 2025 • 56
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published Nov 17, 2025 • 136
THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models Paper • 2504.13367 • Published Apr 17, 2025 • 26
Retrieval Head Mechanistically Explains Long-Context Factuality Paper • 2404.15574 • Published Apr 24, 2024 • 3
MAF: Multi-Aspect Feedback for Improving Reasoning in Large Language Models Paper • 2310.12426 • Published Oct 19, 2023 • 1
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies Paper • 2308.03188 • Published Aug 6, 2023 • 2