Collections

Collections including paper arxiv:2601.19897

Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models

Paper • 2602.12036 • Published 22 days ago • 93
Reinforcement Learning for Self-Improving Agent with Skill Library

Paper • 2512.17102 • Published Dec 18, 2025 • 36
Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Paper • 2512.23705 • Published Dec 29, 2025 • 45
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models

Paper • 2512.19995 • Published Dec 23, 2025 • 16

2026-02-01 Papers

Scaling Embeddings Outperforms Scaling Experts in Language Models

Paper • 2601.21204 • Published Jan 29 • 100
Innovator-VL: A Multimodal Large Language Model for Scientific Discovery

Paper • 2601.19325 • Published Jan 27 • 79
TwinBrainVLA: Unleashing the Potential of Generalist VLMs for Embodied Tasks via Asymmetric Mixture-of-Transformers

Paper • 2601.14133 • Published Jan 20 • 61
MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

Paper • 2601.21821 • Published Jan 29 • 60

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 26

Continual Learning

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 26

[papers] Distillation

Which Reasoning Trajectories Teach Students to Reason Better? A Simple Metric of Informative Alignment

Paper • 2601.14249 • Published Jan 20 • 13
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models

Paper • 2402.07033 • Published Feb 10, 2024 • 19
MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences

Paper • 2601.07251 • Published Jan 12 • 11
GameTalk: Training LLMs for Strategic Conversation

Paper • 2601.16276 • Published Jan 22 • 13

THINKSAFE: Self-Generated Safety Alignment for Reasoning Models

Paper • 2601.23143 • Published Jan 30 • 38
PaperBanana: Automating Academic Illustration for AI Scientists

Paper • 2601.23265 • Published Jan 30 • 211
Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18 • 200
BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published Jan 10 • 197

about 1 month ago

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 26
MemOCR: Layout-Aware Visual Memory for Efficient Long-Horizon Reasoning

Paper • 2601.21468 • Published Jan 29 • 25
Look Back to Reason Forward: Revisitable Memory for Long-Context LLM Agents

Paper • 2509.23040 • Published Sep 27, 2025 • 12

Continual Learning

Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 26

Self-Distillation

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published Jan 28 • 40
Self-Distillation Enables Continual Learning

Paper • 2601.19897 • Published Jan 27 • 26

Agent-finetuning-RAM-METHOD

Behavior Knowledge Merge in Reinforced Agentic Models

Paper • 2601.13572 • Published Jan 20 • 25
Language of Thought Shapes Output Diversity in Large Language Models

Paper • 2601.11227 • Published Jan 16 • 9
Agentic-R: Learning to Retrieve for Agentic Search

Paper • 2601.11888 • Published Jan 17 • 19
RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

Paper • 2602.02488 • Published Feb 2 • 33
