Open to Work

Shyam Sunder Kumar

theainerd

AI & ML interests

Natural Language Processing

Recent Activity

liked a model about 22 hours ago

deepseek-ai/DeepSeek-Math-V2

upvoted an article 1 day ago

Continuous batching from first principles

liked a model 2 days ago

black-forest-labs/FLUX.2-dev

upvoted an article 1 day ago

Continuous batching from first principles

Article • 3 days ago • 179

upvoted a paper 16 days ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published 19 days ago • 120

upvoted a collection 22 days ago

Kimi-K2

Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intelligence • 5 items • Updated 14 days ago • 153

upvoted a collection 23 days ago

🎆 October 2025 - China Open Source Highlights

30 items • Updated 8 days ago • 13

upvoted 2 papers 25 days ago

INT v.s. FP: A Comprehensive Study of Fine-Grained Low-bit Quantization Formats

Paper • 2510.25602 • Published 30 days ago • 74

The Principles of Diffusion Models

Paper • 2510.21890 • Published Oct 24 • 58

upvoted a collection 27 days ago

Nemotron RAG

10 items • Updated 4 days ago • 42

upvoted a collection 29 days ago

gpt-oss-safeguard

gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built upon gpt-oss • 2 items • Updated 30 days ago • 58

upvoted a paper about 1 month ago

Glyph: Scaling Context Windows via Visual-Text Compression

Paper • 2510.17800 • Published Oct 20 • 67

upvoted an article about 1 month ago

Building the Open Agent Ecosystem Together: Introducing OpenEnv

Article • Oct 23 • 133

upvoted a collection about 1 month ago

⚛️ Liquid Nanos

Library of task-specific models: https://www.liquid.ai/blog/introducing-liquid-nanos-frontier-grade-performance-on-everyday-devices • 21 items • Updated 29 days ago • 94

upvoted an article about 2 months ago

Smol2Operator: Post-Training GUI Agents for Computer Use

Article • Sep 23 • 129

upvoted 2 collections about 2 months ago

GLM-4.6

7 items • Updated 23 days ago • 42

DeepSeek-V3.2

2 items • Updated about 24 hours ago • 450

upvoted an article 2 months ago

Gaia2 and ARE: Empowering the community to study agents

Article • Sep 22 • 120

upvoted 3 collections 2 months ago

MiMo-Audio

5 items • Updated 7 days ago • 23

Qwen3-Next

4 items • Updated Sep 22 • 155

📚 LLM pretraining datasets

A collection of datasets for LLM pretraining • 9 items • Updated May 5 • 14

upvoted an article 3 months ago

Welcome EmbeddingGemma, Google's new efficient embedding model

Article • Sep 4 • 261

upvoted a collection 3 months ago

DeepSeek-V3.1

4 items • Updated about 24 hours ago • 247