Qihan Ren

jasonrqh

https://nebularaid2000.github.io/

AI & ML interests

XAI, LLM reasoning & safety, Coding agent

Recent Activity

upvoted a paper about 14 hours ago

The Verification Horizon: No Silver Bullet for Coding Agent Rewards

upvoted a paper 8 days ago

FreeStyle: Free Control of Style-Content Dual-Reference Generation from Community LoRA Mining

liked a model 15 days ago

MiniMaxAI/MiniMax-M3

Collections 1

Papers 13

arxiv:2605.29801

arxiv:2605.26494

arxiv:2604.06628

arxiv:2604.02022

Models 33

jasonrqh/InternLM2.5-20B_DeepSeek-R1-20k_lr5e-5_ep8_bs256

Text Generation • Updated Apr 11

jasonrqh/InternLM2.5-20B_Numina-Math-20k_lr5e-5_ep8_bs256

Text Generation • Updated Apr 11

jasonrqh/InternLM2.5-20B_Math-NoCoT-20k_lr5e-5_ep8_bs256

Text Generation • Updated Apr 11

jasonrqh/InternLM2.5-20B_Countdown-CoT-20k_lr5e-5_ep8_bs256

Text Generation • Updated Apr 11

jasonrqh/InternLM2.5-20B_Math-CoT-20k_lr5e-5_ep8_bs256

Text Generation • Updated Apr 11

jasonrqh/Qwen2.5-14B_Math-CoT-20k_lr5e-5_ep8_bs256

Text Generation • Updated Apr 11

jasonrqh/Qwen2.5-7B_Math-CoT-20k_lr5e-5_ep8_bs256

Text Generation • Updated Apr 11

jasonrqh/Qwen2.5-3B_Math-CoT-20k_lr5e-5_ep8_bs256

Text Generation • Updated Apr 11

jasonrqh/Qwen2.5-1.5B_Math-CoT-20k_lr5e-5_ep8_bs256

Text Generation • Updated Apr 11

jasonrqh/Qwen3-14B_Math-CoT-2.5k_lr5e-5_ep8_bs32

Text Generation • Updated Apr 11

Datasets 6

jasonrqh/Math-CoT-44k-Qwen3-32b-n32-16384-with-logprob-and-entropy

Viewer • Updated Apr 11 • 44.4k • 1.63k • 1

jasonrqh/DeepSeek-R1-20k

Viewer • Updated Apr 11 • 20.5k • 56

jasonrqh/NuminaMath-20k

Viewer • Updated Apr 11 • 20.5k • 58 • 1

jasonrqh/Countdown-CoT-20k

Viewer • Updated Apr 11 • 20.5k • 53

jasonrqh/Math-NoCoT-20k

Viewer • Updated Apr 11 • 20.5k • 31

jasonrqh/Math-CoT-20k

Viewer • Updated Apr 11 • 20.5k • 74 • 6