Hugging Face Party @ PyTorch Conference

community

AI & ML interests

None defined yet.

Recent Activity

julien-c submitted a paper 8 days ago

Shaping capabilities with token-level data filtering

andreaskoepf authored a paper 23 days ago

PyTorch: An Imperative Style, High-Performance Deep Learning Library

andreaskoepf authored a paper 23 days ago

MEDITRON-70B: Scaling Medical Pretraining for Large Language Models

danielhanchen

posted an update 2 days ago

Post

3049

We created a tool-calling guide for local LLMs!

Learn how to use any open model like Qwen3-Coder-Next and GLM-4.7-Flash for function calling.

Guide: https://unsloth.ai/docs/basics/tool-calling-guide-for-local-llms

We provide hands-on examples for: story writing, Python execution, terminal tool calls, maths and more.
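
As a rough illustration of what function calling involves on the application side, here is a minimal sketch. It assumes the model emits each tool call as a JSON object with `name` and `arguments` fields. That shape is an assumption for illustration, as are the toy tools `add` and `word_count` and the helper `run_tool_call`; none of them come from the guide.

```python
import json

def add(a: float, b: float) -> float:
    """Toy tool (hypothetical): add two numbers."""
    return a + b

def word_count(text: str) -> int:
    """Toy tool (hypothetical): count whitespace-separated words."""
    return len(text.split())

# Registry mapping tool names the model may request to Python callables.
TOOLS = {"add": add, "word_count": word_count}

def run_tool_call(raw: str) -> str:
    """Parse one JSON tool call and return a JSON string to feed back to the model.

    Assumed input shape: {"name": "<tool>", "arguments": {...}}.
    """
    try:
        call = json.loads(raw)
        name = call["name"]
        args = call.get("arguments", {})
        # Some models emit the arguments as a JSON-encoded string instead of an object.
        if isinstance(args, str):
            args = json.loads(args)
        result = TOOLS[name](**args)
    except KeyError as exc:
        return json.dumps({"error": f"unknown tool or missing field: {exc}"})
    except (json.JSONDecodeError, TypeError, AttributeError) as exc:
        return json.dumps({"error": str(exc)})
    return json.dumps({"name": name, "result": result})

print(run_tool_call('{"name": "add", "arguments": {"a": 2, "b": 3}}'))
```

Returning errors as JSON rather than raising lets the model see what went wrong and retry, which matters for smaller local models that occasionally produce malformed calls.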

6 replies

danielhanchen

posted an update 4 days ago

Post

3574

Qwen releases Qwen3-Coder-Next! 💜 Run it locally on 46GB RAM or less.

The model excels at agentic coding & local use. With 256K context, it delivers similar performance to models with 10-20× more active parameters.

GGUF: unsloth/Qwen3-Coder-Next-GGUF
Guide: https://unsloth.ai/docs/models/qwen3-coder-next

9 replies

danielhanchen

posted an update 10 days ago

Post

3367

You can now run Kimi K2.5 locally! 🔥

We shrank the 1T model to 240GB (-60%) via Dynamic 1-bit.
Get >40 tok/s on 242GB or 622GB VRAM/RAM for near full precision.
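
The size figures above can be sanity-checked with a little arithmetic. The sketch below assumes exactly 1T (1e12) parameters and 1 GB = 1e9 bytes; both are rounding assumptions, and the helper names are hypothetical.

```python
def bits_per_weight(file_size_gb: float, n_params: float) -> float:
    """Average bits stored per parameter, taking 1 GB = 1e9 bytes (assumption)."""
    return file_size_gb * 1e9 * 8 / n_params

def implied_original_gb(shrunk_gb: float, reduction: float) -> float:
    """Size before shrinking, given the shrunk size and the fractional reduction."""
    return shrunk_gb / (1.0 - reduction)

avg_bits = bits_per_weight(240, 1e12)         # 240GB file, ~1T parameters
original = implied_original_gb(240, 0.60)     # 240GB after a 60% reduction

print(f"{avg_bits:.2f} bits per weight on average")
print(f"{original:.0f} GB implied original size")
```

Under these assumptions the file averages a little under 2 bits per weight and the implied starting size is about 600GB. An average above 1 bit would be consistent with some tensors being kept at higher precision, though the post does not spell this out.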

GGUF: unsloth/Kimi-K2.5-GGUF

Guide: https://unsloth.ai/docs/models/kimi-k2.5

7 replies

danielhanchen

posted an update 16 days ago

Post

2585

You can now fine-tune embedding models in our free Unsloth notebook! 🤗

Fine-tuning embedding models improves retrieval & RAG by aligning vectors to your domain-specific notion of similarity, improving search, clustering, and recommendations on your data.
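
To make "aligning vectors to your notion of similarity" concrete, here is a minimal sketch of one common training objective: a contrastive loss with in-batch negatives. This is an illustrative assumption, not necessarily the loss used in the Unsloth notebook, and the function names, toy vectors and `temperature` value are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def in_batch_contrastive_loss(queries, documents, temperature=0.05):
    """Mean cross-entropy where documents[i] is the positive for queries[i]
    and every other document in the batch acts as a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [cosine(q, d) / temperature for d in documents]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum - logits[i]
    return total / len(queries)

# Toy 2-D embeddings: each query should sit closest to its own document.
queries = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # positives close to their queries
swapped = [[0.1, 0.9], [0.9, 0.1]]   # positives far from their queries

loss_aligned = in_batch_contrastive_loss(queries, aligned)
loss_swapped = in_batch_contrastive_loss(queries, swapped)
print(loss_aligned < loss_swapped)
```

Fine-tuning lowers this kind of loss by pulling each query toward its matching document and pushing it away from the others, which is what improves retrieval on domain data.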

⭐ Blog + Notebooks: https://unsloth.ai/docs/new/embedding-finetuning

Unsloth trains embedding models 1.8-3.3x faster with 20% less VRAM, 2x longer context & no accuracy loss vs. FA2 setups.

We'd like to thank Hugging Face and Unsloth contributor electroglyph for making this possible!

3 replies

danielhanchen

posted an update 19 days ago

Post

2608

Run GLM-4.7-Flash locally on your device with 24GB RAM!🔥

It's the best performing 30B model on SWE-Bench and GPQA. With 200K context, it excels at coding, agents, chat & reasoning.

GGUF: unsloth/GLM-4.7-Flash-GGUF

Guide: https://unsloth.ai/docs/models/glm-4.7-flash

danielhanchen

posted an update 23 days ago

Post

2831

You can now do reinforcement learning training with 7× longer context and no accuracy loss, via our new batching algorithms.

Long reasoning chains in RL are costly, but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU.

Blog: https://unsloth.ai/docs/new/grpo-long-context

danielhanchen

posted an update about 1 month ago

Post

954

Run Qwen-Image-2512, the new SOTA text-to-image model! 💜

It's the top-performing open diffusion model and generates more realistic and accurate images and text.

Run locally with 14GB RAM via our Dynamic GGUF: unsloth/Qwen-Image-2512-GGUF

Guide: https://unsloth.ai/docs/models/qwen-image-2512

2 replies

danielhanchen

posted an update about 2 months ago

Post

4096

You can now run GLM-4.7, the new 355B parameter SOTA model on your local device (128GB RAM).✨

The model achieves SOTA performance on coding, agentic and chat benchmarks.

GGUF: unsloth/GLM-4.7-GGUF
Guide: https://docs.unsloth.ai/models/glm-4.7

3 replies

danielhanchen

posted an update about 2 months ago

Post

2484

Google releases FunctionGemma, a new 270M parameter model that runs on just 0.5 GB RAM.✨

Built for tool calling: run it locally on your phone at 50+ tokens/s, or fine-tune it with Unsloth & deploy to your phone.

GGUF: unsloth/functiongemma-270m-it-GGUF
Docs + Notebook: https://docs.unsloth.ai/models/functiongemma

2 replies

danielhanchen

posted an update about 2 months ago

Post

5531

NVIDIA releases Nemotron 3 Nano, a new 30B hybrid reasoning model! 🔥

It has a 1M context window & best-in-class performance for SWE-Bench, reasoning & chat. Run the MoE model locally with 24GB RAM.

GGUF: unsloth/Nemotron-3-Nano-30B-A3B-GGUF
💚 Step-by-step Guide: https://docs.unsloth.ai/models/nemotron-3

1 reply

woojun-jung

authored a paper about 2 months ago

Visual Funnel: Resolving Contextual Blindness in Multimodal Large Language Models

Paper • 2512.10362 • Published Dec 11, 2025

danielhanchen

posted an update about 2 months ago

Post

2185

Mistral's new SOTA coding models, Devstral 2, can now be run locally! (25GB RAM) 🐱
We fixed the chat template, so performance should be much better now!
24B: unsloth/Devstral-Small-2-24B-Instruct-2512-GGUF
123B: unsloth/Devstral-2-123B-Instruct-2512-GGUF

🧡Step-by-step Guide: https://docs.unsloth.ai/models/devstral-2

danielhanchen

posted an update 2 months ago

Post

3860

Mistral's new Ministral 3 models can now be run & fine-tuned locally! (16GB RAM)
The Ministral 3 models have vision support and best-in-class performance for their sizes.
14B Instruct GGUF: unsloth/Ministral-3-14B-Instruct-2512-GGUF
14B Reasoning GGUF: unsloth/Ministral-3-14B-Reasoning-2512-GGUF

🐱 Step-by-step Guide: https://docs.unsloth.ai/new/ministral-3
All GGUF, BnB, FP8 and other variant uploads: https://huggingface.co/collections/unsloth/ministral-3

3 replies

danielhanchen

posted an update 2 months ago

Post

8575

Qwen3-Next can now be run locally! (30GB RAM)
Instruct GGUF: unsloth/Qwen3-Next-80B-A3B-Instruct-GGUF

The models come in Thinking and Instruct versions and use a new architecture that gives them ~10x faster inference than Qwen32B.
💜 Step-by-step Guide: https://docs.unsloth.ai/models/qwen3-next

Thinking GGUF: unsloth/Qwen3-Next-80B-A3B-Thinking-GGUF

danielhanchen

posted an update 3 months ago

Post

4451

You can now run Kimi K2 Thinking locally with our Dynamic 1-bit GGUFs: unsloth/Kimi-K2-Thinking-GGUF

We shrank the 1T model to 245GB (-62%) & retained ~85% of accuracy on Aider Polyglot. Run on >247GB RAM for fast inference.

We also collaborated with the Moonshot AI Kimi team on a system prompt fix! 🥰

Guide + fix details: https://docs.unsloth.ai/models/kimi-k2-thinking-how-to-run-locally