21 5

Muller

James177

AI & ML interests

None yet

Recent Activity

liked a model about 2 months ago

LiquidAI/LFM2-Audio-1.5B

upvoted a paper about 2 months ago

Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training

upvoted a paper about 2 months ago

PIPer: On-Device Environment Setup via Online Reinforcement Learning

Organizations

None yet

upvoted 4 papers about 2 months ago

Chasing the Tail: Effective Rubric-based Reward Modeling for Large Language Model Post-Training

Paper • 2509.21500 • Published Sep 25 • 18

PIPer: On-Device Environment Setup via Online Reinforcement Learning

Paper • 2509.25455 • Published Sep 29 • 37

Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation

Paper • 2509.19244 • Published Sep 23 • 11

ExGRPO: Learning to Reason from Experience

Paper • 2510.02245 • Published Oct 2 • 78

upvoted 4 papers 2 months ago

AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research

Paper • 2507.13300 • Published Jul 17 • 19

LLaSO: A Foundational Framework for Reproducible Research in Large Language and Speech Model

Paper • 2508.15418 • Published Aug 21 • 8

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

Paper • 2509.08519 • Published Sep 10 • 127

Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation

Paper • 2509.14760 • Published Sep 18 • 52

upvoted 12 papers 6 months ago

R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training

Paper • 2505.00358 • Published May 1 • 26

Hybrid 3D-4D Gaussian Splatting for Fast Dynamic Scene Representation

Paper • 2505.13215 • Published May 19 • 29

Softpick: No Attention Sink, No Massive Activations with Rectified Softmax

Paper • 2504.20966 • Published Apr 29 • 32

Through the Looking Glass: Common Sense Consistency Evaluation of Weird Images

Paper • 2505.07704 • Published May 12 • 29

RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale

Paper • 2505.03005 • Published May 5 • 36

Model Merging in Pre-training of Large Language Models

Paper • 2505.12082 • Published May 17 • 40

HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation

Paper • 2505.04512 • Published May 7 • 36

Agentic Reasoning and Tool Integration for LLMs via Reinforcement Learning

Paper • 2505.01441 • Published Apr 28 • 39

Unified Continuous Generative Models

Paper • 2505.07447 • Published May 12 • 43

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

Paper • 2505.08617 • Published May 13 • 41

Learning from Peers in Reasoning Models

Paper • 2505.07787 • Published May 12 • 45

Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models

Paper • 2505.14810 • Published May 20 • 62