Zhenxing Mi

Mifucius

AI & ML interests

None yet

Organizations

None yet

upvoted 2 papers 10 days ago

LongVideoAgent: Multi-Agent Reasoning with Long Videos

Paper • 2512.20618 • Published 11 days ago • 52

SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published 11 days ago • 88

upvoted a paper 12 days ago

Seedance 1.0: Exploring the Boundaries of Video Generation Models

Paper • 2506.09113 • Published Jun 10, 2025 • 105

upvoted 2 papers 15 days ago

Kling-Omni Technical Report

Paper • 2512.16776 • Published 16 days ago • 163

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published 16 days ago • 19

upvoted 2 papers about 1 month ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 111

One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control

Paper • 2511.18922 • Published Nov 24, 2025 • 11

upvoted a paper 3 months ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published Oct 17, 2025 • 89

upvoted a paper 6 months ago

Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation

Paper • 2504.02542 • Published Apr 3, 2025 • 51

upvoted 2 papers 7 months ago

Autoregressive Adversarial Post-Training for Real-Time Interactive Video Generation

Paper • 2506.09350 • Published Jun 11, 2025 • 48

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30, 2025 • 143

upvoted a paper 9 months ago

CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Paper • 2504.13161 • Published Apr 17, 2025 • 93

upvoted 4 papers 10 months ago

Qwen2.5-1M Technical Report

Paper • 2501.15383 • Published Jan 26, 2025 • 72

Personalize Anything for Free with Diffusion Transformer

Paper • 2503.12590 • Published Mar 16, 2025 • 44

DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers

Paper • 2503.14487 • Published Mar 18, 2025 • 28

BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing

Paper • 2503.13434 • Published Mar 17, 2025 • 27

upvoted 4 papers 11 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20, 2025 • 156

Dynamic Concepts Personalization from Single Videos

Paper • 2502.14844 • Published Feb 20, 2025 • 16

RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning

Paper • 2502.13144 • Published Feb 18, 2025 • 38

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19, 2025 • 212