Chengsong Huang

ChengsongHuang

https://chengsong-huang.github.io/

hcscctv

AI & ML interests

None yet

Recent Activity

updated a dataset 3 days ago

HINT-lab/sft_Qwen_Qwen3-1.7B

Organizations

upvoted a paper 2 days ago

Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1

Paper • 2510.19600 • Published 5 days ago • 64

upvoted a paper 3 days ago

AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders

Paper • 2510.19779 • Published 5 days ago • 57

upvoted a paper 6 days ago

A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning

Paper • 2510.15444 • Published 10 days ago • 137

upvoted a paper 12 days ago

Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity

Paper • 2510.01171 • Published 26 days ago • 17

upvoted a paper 13 days ago

Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published 14 days ago • 31

upvoted 2 papers 14 days ago

Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels

Paper • 2510.06499 • Published 19 days ago • 31

ReviewerToo: Should AI Join The Program Committee? A Look At The Future of Peer Review

Paper • 2510.08867 • Published 17 days ago • 4

upvoted a paper 17 days ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published 18 days ago • 241

upvoted 5 papers 24 days ago

Self-Forcing++: Towards Minute-Scale High-Quality Video Generation

Paper • 2510.02283 • Published 25 days ago • 91

Group-Relative REINFORCE Is Secretly an Off-Policy Algorithm: Demystifying Some Myths About GRPO and Its Friends

Paper • 2509.24203 • Published 28 days ago • 7

RLP: Reinforcement as a Pretraining Objective

Paper • 2510.01265 • Published about 1 month ago • 39

VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning

Paper • 2510.01444 • Published 25 days ago • 19

CLUE: Non-parametric Verification from Experience via Hidden-State Clustering

Paper • 2510.01591 • Published 25 days ago • 26

upvoted a paper 26 days ago

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Paper • 2509.25760 • Published 27 days ago • 52

upvoted 4 papers 27 days ago

upvoted 2 papers 28 days ago

Language Models Can Learn from Verbal Feedback Without Scalar Rewards

Paper • 2509.22638 • Published about 1 month ago • 67

Variational Reasoning for Language Models

Paper • 2509.22637 • Published about 1 month ago • 68
