In a Training Loop 🔄

aayush garg PRO

garg-aayush

https://aayushgarg.dev/

AI & ML interests

None yet

Recent Activity

liked a model 15 days ago

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled

Image-Text-to-Text • 28B • Updated 8 days ago • 585k • 2.62k

published an article 18 days ago

Article

FlashAttention: Making Attention I/O-Aware

18 days ago

liked a model 28 days ago

ggml-org/GLM-OCR-GGUF

0.9B • Updated Mar 10 • 24.1k • 50

published an article about 2 months ago

Article

GRPO: Building Intuition Through Ablation Studies

Feb 26

updated a model about 2 months ago

garg-aayush/cs336-grpo-exps

Updated Feb 25

published a model about 2 months ago

garg-aayush/cs336-grpo-exps

Updated Feb 25

updated a dataset 3 months ago

garg-aayush/sft-cs336-assign5-datasets

Preview • Updated Jan 26 • 310 • 6

published an article 3 months ago

Article

Expert Iteration for Math Reasoning

Jan 23

updated a model 3 months ago

garg-aayush/cs336_exp-iter_exps

Updated Jan 15

published a model 3 months ago

garg-aayush/cs336_exp-iter_exps

Updated Jan 15

published an article 3 months ago

Article

Understanding GRPO: PPO without the critic

Jan 1

upvoted an article 3 months ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7, 2025

•

286

published an article 3 months ago

Article

Deriving the DPO Loss from First Principles

Dec 30, 2025

updated a collection 3 months ago

RLHF Papers

Collection

7 items • Updated Dec 30, 2025 • 1

published an article 4 months ago

Article

Deriving the PPO Loss from First Principles

Dec 25, 2025

upvoted an article 4 months ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

Dec 1, 2025

•

307
