Microsoft Research

company

https://www.microsoft.com/en-us/research/

Recent Activity

unilm submitted a paper about 19 hours ago

Online Experiential Learning for Language Models

beanie00 submitted a paper 1 day ago

Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty

ZongqianLi submitted a paper 8 days ago

Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems

Papers

submitted a paper to Daily Papers about 19 hours ago

Online Experiential Learning for Language Models

Paper • 2603.16856 • Published 1 day ago • 34

submitted a paper to Daily Papers 1 day ago

Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty

Paper • 2603.15500 • Published 2 days ago • 11

submitted 2 papers to Daily Papers 8 days ago

Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems

Paper • 2603.07779 • Published 10 days ago • 5

Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models

Paper • 2603.07777 • Published 10 days ago • 5

submitted a paper to Daily Papers 8 days ago

Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity

Paper • 2603.05168 • Published 13 days ago • 4

submitted a paper to Daily Papers 9 days ago

Scaling Agentic Capabilities, Not Context: Efficient Reinforcement Finetuning for Large Toolspaces

Paper • 2603.06713 • Published 13 days ago • 15

submitted a paper to Daily Papers 15 days ago

Learning When to Act or Refuse: Guarding Agentic Reasoning Models for Safe Multi-Step Tool Use

Paper • 2603.03205 • Published 15 days ago • 11

submitted a paper to Daily Papers about 1 month ago

Reinforcement World Model Learning for LLM-based Agents

Paper • 2602.05842 • Published Feb 5 • 27

submitted a paper to Daily Papers about 1 month ago

Efficient Autoregressive Video Diffusion with Dummy Head

Paper • 2601.20499 • Published Jan 28 • 8

submitted a paper to Daily Papers about 1 month ago

Self-Hinting Language Models Enhance Reinforcement Learning

Paper • 2602.03143 • Published Feb 3 • 30

submitted a paper to Daily Papers about 1 month ago

LIVE: Long-horizon Interactive Video World Modeling

Paper • 2602.03747 • Published Feb 3 • 12

submitted a paper to Daily Papers about 1 month ago

Training LLMs for Divide-and-Conquer Reasoning Elevates Test-Time Scalability

Paper • 2602.02477 • Published Feb 2 • 11

submitted a paper to Daily Papers about 2 months ago

Multiplex Thinking: Reasoning via Token-wise Branch-and-Merge

Paper • 2601.08808 • Published Jan 13 • 39