Revisiting Generalization Across Difficulty Levels: It's Not So Easy • Paper • 2511.21692 • Published Nov 26
Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages • Paper • 2407.03321 • Published Jul 3, 2024