deepseek-ai/DeepSeek-R1-Distill-Qwen-32B • Text Generation • 33B • Updated Feb 24, 2025 • 953k downloads • 1.53k likes
princeton-nlp/Llama-3-8B-ProLong-64k-Instruct • Text Generation • 8B • Updated Oct 31, 2024 • 7.94k downloads • 13 likes
Model Memory Utility 🚀 • Space • Running on CPU Upgrade • Featured • 1.01k likes • Calculate VRAM needed to train and run Hugging Face models