DeepAnalyze: Agentic Large Language Models for Autonomous Data Science Paper • 2510.16872 • Published 16 days ago • 90
Search Self-play: Pushing the Frontier of Agent Capability without Supervision Paper • 2510.18821 • Published 14 days ago • 15
From Pixels to Words -- Towards Native Vision-Language Primitives at Scale Paper • 2510.14979 • Published 19 days ago • 65
R-Horizon: How Far Can Your Large Reasoning Model Really Go in Breadth and Depth? Paper • 2510.08189 • Published 26 days ago • 25
Imperceptible Jailbreaking against Large Language Models Paper • 2510.05025 • Published 29 days ago • 33
SwiReasoning: Switch-Thinking in Latent and Explicit for Pareto-Superior Reasoning LLMs Paper • 2510.05069 • Published 29 days ago • 12
Random Policy Valuation is Enough for LLM Reasoning with Verifiable Rewards Paper • 2509.24981 • Published Sep 29 • 29
Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth Paper • 2509.03867 • Published Sep 4 • 209
Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions? Paper • 2509.04292 • Published Sep 4 • 57