COLLECTION
AgentConductor: Topology Evolution for Multi-Agent Competition-Level Code Generation
Paper
• 2602.17100
• Published • 3
GroupGPT: A Token-efficient and Privacy-preserving Agentic Framework for Multi-User Chat Assistant
Paper
• 2603.01059
• Published • 1
Multi-Domain Riemannian Graph Gluing for Building Graph Foundation Models
Paper
• 2603.00618
• Published
Heterogeneous Agent Collaborative Reinforcement Learning
Paper
• 2603.02604
• Published • 186
Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory
Paper
• 2603.04257
• Published • 19
InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions
Paper
• 2603.03646
• Published • 8
TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods
Paper
• 2407.21630
• Published • 8
SageBwd: A Trainable Low-bit Attention
Paper
• 2603.02170
• Published • 17
Experiential Reinforcement Learning
Paper
• 2602.13949
• Published • 71
On-Policy Self-Distillation for Reasoning Compression
Paper
• 2603.05433
• Published • 6
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger
Paper
• 2602.08222
• Published • 283
Less is Enough: Synthesizing Diverse Data in Feature Space of LLMs
Paper
• 2602.10388
• Published • 243
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
Paper
• 2511.16043
• Published • 110
LiteAttention: A Temporal Sparse Attention for Diffusion Transformers
Paper
• 2511.11062
• Published • 32
KLASS: KL-Guided Fast Inference in Masked Diffusion Models
Paper
• 2511.05664
• Published • 37
Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs
Paper
• 2511.12710
• Published • 39
Paper
• 2511.11238
• Published • 38
Distribution-Conditioned Transport
Paper
• 2603.04736
• Published • 3
Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning
Paper
• 2602.23440
• Published • 3
Latent Particle World Models: Self-supervised Object-centric Stochastic Dynamics Modeling
Paper
• 2603.04553
• Published • 3
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model
Paper
• 2603.05438
• Published • 39
Dynamic Chunking Diffusion Transformer
Paper
• 2603.06351
• Published • 14
Beyond the Grid: Layout-Informed Multi-Vector Retrieval with Parsed Visual Document Representations
Paper
• 2603.01666
• Published • 1
FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling
Paper
• 2603.06199
• Published • 9
π-StepNFT: Wider Space Needs Finer Steps in Online RL for Flow-based VLAs
Paper
• 2603.02083
• Published • 9
EmbodiedSplat: Online Feed-Forward Semantic 3DGS for Open-Vocabulary 3D Scene Understanding
Paper
• 2603.04254
• Published • 1
LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding
Paper
• 2602.20913
• Published • 11
Efficient Continual Learning in Language Models via Thalamically Routed Cortical Columns
Paper
• 2602.22479
• Published
VecGlypher: Unified Vector Glyph Generation with Language Models
Paper
• 2602.21461
• Published • 12
Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators
Paper
• 2602.22647
• Published • 4
Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs
Paper
• 2602.21198
• Published • 4
Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs
Paper
• 2603.09906
• Published • 71
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing
Paper
• 2603.09877
• Published • 47
Towards a Neural Debugger for Python
Paper
• 2603.09951
• Published • 5
TALON: Test-time Adaptive Learning for On-the-Fly Category Discovery
Paper
• 2603.08075
• Published
ReflexiCoder: Teaching Large Language Models to Self-Reflect on Generated Code and Self-Correct It via Reinforcement Learning
Paper
• 2603.05863
• Published • 5
Beyond Test-Time Training: Learning to Reason via Hardware-Efficient Optimal Control
Paper
• 2603.09221
• Published
Multi-Head Low-Rank Attention
Paper
• 2603.02188
• Published • 3
Compiler-First State Space Duality and Portable O(1) Autoregressive Caching for Inference
Paper
• 2603.09555
• Published • 1
Flash-KMeans: Fast and Memory-Efficient Exact K-Means
Paper
• 2603.09229
• Published • 79
ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning
Paper
• 2603.10160
• Published • 25
Prism-Δ: Differential Subspace Steering for Prompt Highlighting in Large Language Models
Paper
• 2603.10705
• Published • 11
OpenClaw-RL: Train Any Agent Simply by Talking
Paper
• 2603.10165
• Published • 132
Causal Concept Graphs in LLM Latent Space for Stepwise Reasoning
Paper
• 2603.10377
• Published • 3
IndexCache: Accelerating Sparse Attention via Cross-Layer Index Reuse
Paper
• 2603.12201
• Published • 51
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections
Paper
• 2603.12180
• Published • 62
Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training
Paper
• 2603.12255
• Published • 90
CREATE: Testing LLMs for Associative Creativity
Paper
• 2603.09970
• Published • 14
Geometric Autoencoder for Diffusion Models
Paper
• 2603.10365
• Published • 6
Training Language Models via Neural Cellular Automata
Paper
• 2603.10055
• Published • 7
Neural Thickets: Diverse Task Experts Are Dense Around Pretrained Weights
Paper
• 2603.12228
• Published • 11
LifeGPT: Topology-Agnostic Generative Pretrained Transformer Model for Cellular Automata
Paper
• 2409.12182
• Published
Attention-based Neural Cellular Automata
Paper
• 2211.01233
• Published
HybridStitch: Pixel and Timestep Level Model Stitching for Diffusion Acceleration
Paper
• 2603.07815
• Published • 10
Multimodal OCR: Parse Anything from Documents
Paper
• 2603.13032
• Published • 31
LookaheadKV: Fast and Accurate KV Cache Eviction by Glimpsing into the Future without Generation
Paper
• 2603.10899
• Published • 6
From Sparse to Dense: Multi-View GRPO for Flow Models via Augmented Condition Space
Paper
• 2603.12648
• Published • 12
VQQA: An Agentic Approach for Video Evaluation and Quality Improvement
Paper
• 2603.12310
• Published • 7
BitDance: Scaling Autoregressive Generative Models with Binary Tokens
Paper
• 2602.14041
• Published • 53
ThinkRouter: Efficient Reasoning via Routing Thinking between Latent and Discrete Spaces
Paper
• 2602.11683
• Published • 8
Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning
Paper
• 2602.11748
• Published • 31
The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies
Paper
• 2602.09877
• Published • 197
MetaphorStar: Image Metaphor Understanding and Reasoning with End-to-End Visual Reinforcement Learning
Paper
• 2602.10575
• Published • 4