46 271

Shivam Kumar

shivamkumar

AI & ML interests

None yet

Recent Activity

liked a model 5 days ago

Tongyi-MAI/Z-Image-Turbo

liked a Space 5 days ago

Tongyi-MAI/Z-Image-Turbo

liked a model 7 days ago

black-forest-labs/FLUX.2-dev

Organizations

upvoted 2 papers 18 days ago

Yan: Foundational Interactive Video Generation

Paper • 2508.08601 • Published Aug 12 • 1

MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation

Paper • 2508.19320 • Published Aug 26 • 29

upvoted 3 collections 18 days ago

VILA: On Pre-training for Visual Language Models

10 items • Updated Sep 13 • 57

Sana

⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated Sep 13 • 97

SANA-1.5

SANA-1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer • 6 items • Updated Sep 13 • 10

upvoted a paper 18 days ago

LongLive: Real-time Interactive Long Video Generation

Paper • 2509.22622 • Published Sep 26 • 183

upvoted 4 collections 18 days ago

LongAI

Boost AI's long ability while keeping it efficient. Models in this collection include LongVILA, LongVILA-R1, and LongLive. • 8 items • Updated 27 days ago • 2

NVILA (HuggingFace)

HuggingFace Transformers can load us. • 5 items • Updated Sep 13 • 5

Fast-dLLM

Efficient Diffusion LLM • 4 items • Updated Oct 8 • 7

SANA-Video

🎬 SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer • 4 items • Updated 29 days ago • 5

upvoted 2 papers 22 days ago

MotionStream: Real-Time Video Generation with Interactive Motion Controls

Paper • 2511.01266 • Published 30 days ago • 27

UniAVGen: Unified Audio and Video Generation with Asymmetric Cross-Modal Interactions

Paper • 2511.03334 • Published 28 days ago • 51

upvoted a collection 22 days ago

ChronoEdit

ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation • 8 items • Updated 8 days ago • 10

upvoted a collection 27 days ago

MDGA

Make Diffusion Great Again. The resource list for Super Data Learners, Quokka, and OpenMoE 2. • 16 items • Updated 28 days ago • 8

upvoted a collection about 1 month ago

Nemotron-Personas

A collection of multilingual, region-specific synthetic persona datasets that support sovereign AI development across many countries and regions. • 3 items • Updated 8 days ago • 13

upvoted a collection 3 months ago

Indic Parler-TTS

Collection of Parler-TTS models adapted to Indian languages. • 3 items • Updated Dec 4, 2024 • 9

upvoted a paper 3 months ago

TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models

Paper • 2506.03099 • Published Jun 3 • 19

upvoted an article 3 months ago

Evaluating Audio Reasoning with Big Bench Audio

Article • Dec 20, 2024 • 26

upvoted a collection 5 months ago

Kimi-K2

Moonshot's MoE LLMs with 1 trillion parameters, exceptional at agentic intelligence • 5 items • Updated 18 days ago • 154

upvoted a paper 7 months ago

AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale

Paper • 2505.08311 • Published May 13 • 18