Han Shi

shihan96

AI & ML interests

None yet

Recent Activity

liked a Space 10 days ago

nanotron/ultrascale-playbook

upvoted a paper 2 months ago

Symbolic Graphics Programming with Large Language Models

liked a model 4 months ago

transformers-community/sep_cache

liked a Space 10 days ago

The Ultra-Scale Playbook

🌌

3.51k

The ultimate guide to training LLM on large GPU Clusters

upvoted a paper 2 months ago

Symbolic Graphics Programming with Large Language Models

Paper • 2509.05208 • Published Sep 5 • 45

liked a model 4 months ago

transformers-community/sep_cache

8B • Updated Aug 4 • 12 • 9

commented a paper 11 months ago

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Paper • 2412.12094 • Published Dec 16, 2024 • 11

upvoted a paper 11 months ago

DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis

Paper • 2405.14224 • Published May 23, 2024 • 16

authored a paper 11 months ago

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Paper • 2412.12094 • Published Dec 16, 2024 • 11

upvoted a paper 11 months ago

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Paper • 2412.12094 • Published Dec 16, 2024 • 11

commented a paper 11 months ago

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Paper • 2412.12094 • Published Dec 16, 2024 • 11

authored 2 papers about 1 year ago

DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis

Paper • 2405.14224 • Published May 23, 2024 • 16

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

Paper • 2410.01699 • Published Oct 2, 2024 • 18

upvoted a paper about 1 year ago

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

Paper • 2410.01699 • Published Oct 2, 2024 • 18

liked a model over 1 year ago

meta-llama/Meta-Llama-3-8B-Instruct

Text Generation • 8B • Updated Jun 18 • 1.01M • 4.3k

liked a model about 2 years ago

meta-math/MetaMath-Llemma-7B

Text Generation • Updated Dec 21, 2023 • 838 • 17

New activity in meta-math/MetaMath-Mistral-7B about 2 years ago

Update README.md

#1 opened about 2 years ago by hoan

upvoted a paper about 2 years ago

Forward-Backward Reasoning in Large Language Models for Mathematical Verification

Paper • 2308.07758 • Published Aug 15, 2023 • 4

liked a model about 2 years ago

meta-math/MetaMath-Mistral-7B

Text Generation • Updated Dec 21, 2023 • 2.19k • 96

liked 2 datasets about 2 years ago

meta-math/GSM8K_Backward

Viewer • Updated Nov 10, 2023 • 1.27k • 35 • 18

meta-math/MetaMathQA-40K

Viewer • Updated Nov 10, 2023 • 40k • 977 • 25

authored a paper about 2 years ago

Forward-Backward Reasoning in Large Language Models for Mathematical Verification

Paper • 2308.07758 • Published Aug 15, 2023 • 4

upvoted a paper about 2 years ago

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Paper • 2309.12284 • Published Sep 21, 2023 • 18
