Minshen Zhang
alexzms
AI & ML interests
AI, DL, ML, GPT
Recent Activity
upvoted a paper about 1 month ago
Efficient Long-context Language Model Training by Core Attention Disaggregation
upvoted a paper about 2 months ago
DeepPrune: Parallel Scaling without Inter-trace Redundancy