ReLook: Vision-Grounded RL with a Multimodal LLM Critic for Agentic Web Coding Paper • 2510.11498 • Published 21 days ago • 10
AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators Paper • 2508.09101 • Published Aug 12 • 8
ArtifactsBench: Bridging the Visual-Interactive Gap in LLM Code Generation Evaluation Paper • 2507.04952 • Published Jul 7 • 9
NextCoder Collection NextCoder family of code-editing LMs developed with Selective Knowledge Transfer and its training data. • 6 items • Updated Jul 9 • 71