
Ning Ding

stingning

https://www.stingning.cn

ningding97

AI & ML interests

NLP

Recent Activity

upvoted a paper 4 days ago

P1: Mastering Physics Olympiads with Reinforcement Learning

updated a Space 28 days ago

PRIME-RL/README

liked a model 28 days ago

PRIME-RL/P1-235B-A22B


Organizations

upvoted a paper 4 days ago

P1: Mastering Physics Olympiads with Reinforcement Learning

Paper • 2511.13612 • Published 4 days ago • 124

upvoted a paper about 2 months ago

From f(x) and g(x) to f(g(x)): LLMs Learn New Skills in RL by Composing Old Ones

Paper • 2509.25123 • Published Sep 29 • 19

upvoted 4 papers 2 months ago

HiPhO: How Far Are (M)LLMs from Humans in the Latest High School Physics Olympiad Benchmark?

Paper • 2509.07894 • Published Sep 9 • 31

upvoted 3 papers 3 months ago

Towards a Unified View of Large Language Model Post-Training

Paper • 2509.04419 • Published Sep 4 • 73

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 256

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14 • 95

upvoted a paper 6 months ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28 • 131

upvoted a paper 7 months ago

TTRL: Test-Time Reinforcement Learning

Paper • 2504.16084 • Published Apr 22 • 120

upvoted a paper 8 months ago

Technologies on Effectiveness and Efficiency: A Survey of State Spaces Models

Paper • 2503.11224 • Published Mar 14 • 28

upvoted 3 papers 10 months ago

UltraIF: Advancing Instruction Following from the Wild

Paper • 2502.04153 • Published Feb 6 • 24

Process Reinforcement through Implicit Rewards

Paper • 2502.01456 • Published Feb 3 • 61

MedXpertQA: Benchmarking Expert-Level Medical Reasoning and Understanding

Paper • 2501.18362 • Published Jan 30 • 23

upvoted an article 11 months ago

Process Reinforcement through Implicit Rewards

Article • Jan 3

upvoted a paper 11 months ago

Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization

Paper • 2412.17739 • Published Dec 23, 2024 • 41

upvoted a collection 12 months ago

ImplicitPRM

Collection • 4 items • Updated Dec 5, 2024 • 5

upvoted a paper 12 months ago

Free Process Rewards without Process Labels

Paper • 2412.01981 • Published Dec 2, 2024 • 34

upvoted a collection over 1 year ago

Ultra Series

Collection • UltraLM, UltraRM and UltraCM • 8 items • Updated Aug 7 • 6
