Apertus LLM Collection • Democratizing Open and Compliant LLMs for Global Language Environments: 8B and 70B open-data, open-weights models, multilingual in >1000 languages • 4 items • Updated Oct 1 • 293
Long-context post-training 🧶 Collection • Resources for post-training LLMs with long-context samples • 5 items • Updated Sep 14 • 5
BeyondWeb: Lessons from Scaling Synthetic Data for Trillion-scale Pretraining Paper • 2508.10975 • Published Aug 14 • 59
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach Paper • 2502.05171 • Published Feb 7 • 150
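A minimal sketch of the recurrent-depth idea as the title describes it: a weight-shared block is unrolled in latent space for a variable number of steps, so more test-time compute can be spent without adding parameters. This is an illustrative toy (the module and step counts below are invented for the example), not the paper's architecture.

```python
import torch

class RecurrentDepthBlock(torch.nn.Module):
    """A single weight-shared block that can be unrolled a variable number of times.

    More recurrence steps at inference means more test-time compute with the same parameters.
    """
    def __init__(self, d_model: int = 64):
        super().__init__()
        self.core = torch.nn.Sequential(
            torch.nn.Linear(2 * d_model, d_model), torch.nn.GELU(),
            torch.nn.Linear(d_model, d_model),
        )

    def forward(self, x: torch.Tensor, num_steps: int) -> torch.Tensor:
        state = torch.zeros_like(x)                  # latent state, refined each step
        for _ in range(num_steps):
            state = state + self.core(torch.cat([x, state], dim=-1))
        return state

block = RecurrentDepthBlock()
x = torch.randn(2, 10, 64)                           # (batch, seq, d_model)
shallow = block(x, num_steps=2)                      # cheap pass
deep = block(x, num_steps=16)                        # same weights, more test-time compute
```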
MaPPO: Maximum a Posteriori Preference Optimization with Prior Knowledge Paper • 2507.21183 • Published Jul 27 • 14
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published Jan 14 • 298
Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval Article • Mar 22, 2024 • 104
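The technique named in this article title can be illustrated with a generic sketch: binary quantization keeps only the sign of each embedding dimension (retrieval then ranks by Hamming distance), while scalar quantization maps each dimension onto int8 buckets. The helper names below are made up for the example and do not reflect the article's or any library's API.

```python
import numpy as np

def quantize_binary(embeddings: np.ndarray) -> np.ndarray:
    """Binary quantization: keep only the sign of each dimension, packed into bits."""
    return np.packbits((embeddings > 0).astype(np.uint8), axis=-1)

def quantize_int8(embeddings: np.ndarray):
    """Scalar (int8) quantization: map each dimension's observed range onto 256 buckets."""
    lo, hi = embeddings.min(axis=0), embeddings.max(axis=0)
    scale = np.maximum((hi - lo) / 255.0, 1e-12)
    q = np.round((embeddings - lo) / scale).astype(np.uint8)
    return q, lo, scale

def hamming_topk(query_bits: np.ndarray, corpus_bits: np.ndarray, k: int = 5) -> np.ndarray:
    """Rank corpus items by Hamming distance to the query (smaller = more similar)."""
    xor = np.bitwise_xor(corpus_bits, query_bits)        # differing bits, byte-packed
    dist = np.unpackbits(xor, axis=-1).sum(axis=-1)      # popcount per document
    return np.argsort(dist)[:k]

# Toy usage: 1000 documents with 256-dim float embeddings.
rng = np.random.default_rng(0)
corpus = rng.standard_normal((1000, 256)).astype(np.float32)
query = rng.standard_normal((1, 256)).astype(np.float32)

corpus_bits = quantize_binary(corpus)                    # 256 floats -> 32 bytes per document
query_bits = quantize_binary(query)
print(hamming_topk(query_bits, corpus_bits, k=5))

q, lo, scale = quantize_int8(corpus)                     # int8 variant: 4x smaller than float32
```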
AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning Paper • 2505.16400 • Published May 22 • 34
Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM Article • Mar 12 • 468
Distributed Training with JAX and Flax NNX: A Practical Guide to Sharding Article • By jiagaoxiang • Mar 26 • 9
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 285
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published Apr 25, 2024 • 80
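A toy sketch of the early-exit half of this title only (not LayerSkip's training recipe or its self-speculative decoding loop): intermediate layers share the final LM head, and the forward pass stops descending through layers once a simple confidence rule is satisfied. All names and the confidence threshold below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def early_exit_logits(hidden: torch.Tensor,
                      layers: torch.nn.ModuleList,
                      lm_head: torch.nn.Linear,
                      threshold: float = 0.9):
    """Run decoder layers one by one and exit as soon as the shared LM head
    is confident about the next token (toy confidence = max softmax probability)."""
    for i, layer in enumerate(layers):
        hidden = layer(hidden)
        logits = lm_head(hidden[:, -1, :])               # predict from the last position
        confidence = F.softmax(logits, dim=-1).max(dim=-1).values
        if bool((confidence > threshold).all()):
            return logits, i + 1                         # exited after i + 1 layers
    return logits, len(layers)                           # fell through to the full depth

# Toy model: 8 "layers" (plain MLP blocks standing in for transformer blocks).
d_model, vocab = 64, 1000
layers = torch.nn.ModuleList(
    torch.nn.Sequential(torch.nn.Linear(d_model, d_model), torch.nn.GELU())
    for _ in range(8)
)
lm_head = torch.nn.Linear(d_model, vocab)
hidden = torch.randn(1, 16, d_model)                     # (batch, seq, d_model)
logits, depth_used = early_exit_logits(hidden, layers, lm_head)
print(depth_used, logits.shape)
```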
Model Merging Collection • Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 247
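A minimal sketch of the simplest recipe in this space, weighted parameter averaging of models that share an architecture; the collection covers many more sophisticated methods, and the helper below is a generic illustration rather than any specific paper's algorithm.

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Weighted average of parameter tensors from models sharing one architecture."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name].float() for w, sd in zip(weights, state_dicts))
    return merged

# Toy usage: average two small models with identical architectures.
def make_model():
    return torch.nn.Sequential(
        torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 4)
    )

model_a, model_b = make_model(), make_model()
merged = merge_state_dicts([model_a.state_dict(), model_b.state_dict()], weights=[0.7, 0.3])

merged_model = make_model()
merged_model.load_state_dict(merged)                     # the merged weights drop in directly
```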