Haocheng Xi

xihc-ucb

xijiu9

AI & ML interests

Efficient ML

Recent Activity

new activity 4 days ago

xihc-ucb/Qwen2.5-7B-train-Quasar-1002:Upload FP8Qwen2ForCausalLM

new activity 4 days ago

xihc-ucb/Qwen2.5-7B-Instruct-train-Quasar-1002:Upload FP8Qwen2ForCausalLM

new activity 4 days ago

xihc-ucb/Qwen2.5-7B-Instruct-train-Quasar-1002:force push

Organizations

upvoted a paper 16 days ago

Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Paper • 2406.10774 • Published Jun 16, 2024 • 4

upvoted a paper 25 days ago

Video-As-Prompt: Unified Semantic Control for Video Generation

Paper • 2510.20888 • Published 29 days ago • 44

upvoted 2 papers 29 days ago

AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders

Paper • 2510.19779 • Published about 1 month ago • 58

Attention Sinks in Diffusion Language Models

Paper • 2510.15731 • Published Oct 17 • 48

upvoted a paper about 1 month ago

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published Oct 13 • 173

upvoted 4 papers about 2 months ago

SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention

Paper • 2509.24006 • Published Sep 28 • 115

upvoted a collection about 2 months ago

Jet-Nemotron

Collection

2 items • Updated Sep 28 • 15

upvoted a paper 3 months ago

XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization

Paper • 2508.10395 • Published Aug 14 • 42

upvoted 2 papers 4 months ago

Deep Researcher with Test-Time Diffusion

Paper • 2507.16075 • Published Jul 21 • 66

Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10 • 158

upvoted a collection 5 months ago

LPD

Collection

Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation • 6 items • Updated Jul 2 • 2

upvoted 2 papers 5 months ago

Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation

Paper • 2507.01957 • Published Jul 2 • 21

Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation

Paper • 2506.19852 • Published Jun 24 • 41

upvoted 4 papers 6 months ago

Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding

Paper • 2505.22618 • Published May 28 • 44

SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training

Paper • 2505.11594 • Published May 16 • 75

Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation

Paper • 2505.18875 • Published May 24 • 42

Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity

Paper • 2502.01776 • Published Feb 3 • 3
