Omni-Reward: Towards Generalist Omni-Modal Reward Modeling with Free-Form Preferences Paper • 2510.23451 • Published 3 days ago • 26
Glyph: Scaling Context Windows via Visual-Text Compression Paper • 2510.17800 • Published 10 days ago • 62
Boundary-Guided Policy Optimization for Memory-efficient RL of Diffusion Large Language Models Paper • 2510.11683 • Published 17 days ago • 12
LLaDA-8B-BGPO Collection Boundary-Guided Policy Optimization for Memory-Efficient RL of Diffusion Large Language Models • 4 items • Updated 20 days ago • 4
DeepPrune: Parallel Scaling without Inter-trace Redundancy Paper • 2510.08483 • Published 21 days ago • 23
StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets? Paper • 2510.02209 • Published 28 days ago • 51
SIRI: Scaling Iterative Reinforcement Learning with Interleaved Compression Paper • 2509.25176 • Published Sep 29 • 12
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 188
GLM-4.5 Collection GLM-4.5: An open-source large language model designed for intelligent agents by Z.ai • 11 items • Updated Aug 11 • 245
GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published Jul 1 • 236
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning Paper • 2506.18841 • Published Jun 23 • 56
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis Paper • 2506.04142 • Published Jun 4 • 27
MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos Paper • 2506.04141 • Published Jun 4 • 29
SuperWriter: Reflection-Driven Long-Form Generation with Large Language Models Paper • 2506.04180 • Published Jun 4 • 33
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles Paper • 2505.19914 • Published May 26 • 43
Hard Negative Contrastive Learning for Fine-Grained Geometric Understanding in Large Multimodal Models Paper • 2505.20152 • Published May 26 • 11
QwenLong-CPRS: Towards infty-LLMs with Dynamic Context Optimization Paper • 2505.18092 • Published May 23 • 43
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 135