# Deployment Guide for HuggingFace Space with ZeroGPU

## Pre-Deployment Checklist

All code is ready! Here's what's configured:

- Model: `microsoft/Phi-3-mini-4k-instruct` (3.8B params)
- ZeroGPU support: Enabled with `@spaces.GPU` decorator
- Local/Space compatibility: Auto-detects environment
- Usage tracking: 50 requests/day per user
- Requirements: All dependencies listed
- README: Updated with instructions
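
The ZeroGPU and local/Space compatibility items can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the Space runtime sets the `SPACE_ID` environment variable, and `running_on_space`, `gpu`, and `generate` are hypothetical names. When the `spaces` package is not installed (for example on a local machine), the decorator falls back to a no-op.

```python
import os


def running_on_space() -> bool:
    """Return True when SPACE_ID is present (assumed to be set by the Space runtime)."""
    return "SPACE_ID" in os.environ


try:
    import spaces  # provides the @spaces.GPU decorator on ZeroGPU Spaces

    gpu = spaces.GPU
except ImportError:
    # Local fallback: a no-op decorator usable as @gpu or @gpu(duration=...)
    def gpu(fn=None, **_kwargs):
        if fn is None:
            return lambda wrapped: wrapped
        return fn


@gpu
def generate(prompt: str) -> str:
    """Hypothetical inference entry point; a real app would call the model here."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    print("Running on a Space:", running_on_space())
    print(generate("Who are the parliamentarians from Zurich?"))
```

Keeping the fallback in one place means the same file can run locally and on the Space without edits.
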
## Deployment Steps

### Step 1: Push Code to Your Space

```bash
cd /Users/tom/code/cojournalist-data

# One-time setup: skip these two commands if the repository is already connected to the Space
git init
git remote add space https://huggingface.co/spaces/YOUR_USERNAME/cojournalist-data

# Commit and push
git add .
git commit -m "Deploy Phi-3-mini with ZeroGPU and usage tracking"
git push space main
```
### Step 2: Configure Space Hardware

1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/cojournalist-data`
2. Click **Settings** (gear icon in the top right)
3. Scroll to the **Hardware** section
4. Select **ZeroGPU** from the dropdown
5. Click **Save**
6. The Space will restart automatically

### Step 3: Wait for Build

The Space will:

1. Install dependencies (~2-3 minutes)
2. Download the Phi-3-mini model (~1-2 minutes, 7.6GB)
3. Load the model into memory (~30 seconds)
4. Launch the Gradio interface

**Total build time: ~5-7 minutes**
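
The 7.6GB download listed above is consistent with 3.8B parameters stored at 2 bytes each. The sketch below shows the arithmetic. The 16-bit weight format is an assumption, since this guide does not state the precision, and `estimate_weights_gb` is a hypothetical helper.

```python
def estimate_weights_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # 3.8B parameters at 2 bytes each (assumed float16/bfloat16 weights)
    print(f"{estimate_weights_gb(3.8e9):.1f} GB")
```
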
### Step 4: Test Your Space

Once running, test with these queries:

1. **English:** "Who are the parliamentarians from Zurich?"
2. **German:** "Zeige mir aktuelle Abstimmungen zur Klimapolitik"
3. **French:** "Qui sont les parlementaires de Zurich?"
4. **Italian:** "Mostrami i voti recenti sulla politica climatica"
## Space Settings Summary

### Hardware

- **Type:** ZeroGPU
- **Cost:** FREE (included with Team plan)
- **GPU:** Nvidia H200 (70GB VRAM)
- **Allocation:** Dynamic (only when needed)

### Environment Variables (Optional)

No variables are required. If you want to configure anything:

- `HF_TOKEN`: Your HuggingFace token (only needed for private models; Phi-3 does not require it)
## Expected Behavior

### First Request

- The first request takes ~5-10 seconds (GPU allocation + inference)
- Subsequent requests are faster (~2-5 seconds)

### Rate Limiting

- 50 requests per day per user IP
- An error message is shown when the limit is reached
- The count resets daily at midnight UTC
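
The rate-limiting rules above could be implemented along these lines. This is a minimal sketch under stated assumptions, not the project's actual tracker: `UsageTracker` and `allow` are hypothetical names, counts live in a plain in-memory dictionary, and the caller is expected to pass a timezone-aware timestamp (or none, in which case the current UTC time is used).

```python
from datetime import date, datetime, timezone
from typing import Dict, Optional, Tuple


class UsageTracker:
    """In-memory per-IP daily counter; all counts are lost when the process restarts."""

    def __init__(self, daily_limit: int = 50) -> None:
        self.daily_limit = daily_limit
        # Maps an IP address to (UTC day of last request, requests made that day)
        self._usage: Dict[str, Tuple[date, int]] = {}

    def allow(self, ip: str, now: Optional[datetime] = None) -> bool:
        """Record one request for `ip`; return False once today's limit is reached."""
        current = now if now is not None else datetime.now(timezone.utc)
        today = current.astimezone(timezone.utc).date()
        day, count = self._usage.get(ip, (today, 0))
        if day != today:
            count = 0  # a new UTC day has started, so the counter resets
        if count >= self.daily_limit:
            return False
        self._usage[ip] = (today, count + 1)
        return True


if __name__ == "__main__":
    tracker = UsageTracker(daily_limit=50)
    if not tracker.allow("203.0.113.5"):
        print("Daily limit reached. Please try again tomorrow.")
```

Comparing UTC dates gives the midnight reset without a background job: the first request of a new day simply starts a fresh count.
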
### Model Loading

- Happens once on Space startup
- Cached for subsequent requests
- No reload needed between requests
## Troubleshooting

### "Model not loading"

- Check Space logs for errors
- Verify ZeroGPU is selected in Hardware settings
- Ensure `spaces>=0.28.0` is in requirements.txt
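
To confirm which `spaces` version is actually installed, a quick check with the standard library's `importlib.metadata` can help. This is a simplified sketch: `parse_version`, `meets_minimum`, and `installed_version` are hypothetical helpers, and the comparison ignores pre-release suffixes such as `rc1`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.28.0' into (0, 28, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two dotted versions after padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a_padded = a + (0,) * (width - len(a))
    b_padded = b + (0,) * (width - len(b))
    return a_padded >= b_padded


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("spaces")
    if found is None:
        print("spaces is not installed")
    else:
        print("spaces", found, "ok" if meets_minimum(found, "0.28.0") else "too old")
```
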
### "Out of memory"

- This shouldn't happen with ZeroGPU (70GB VRAM)
- If it does, contact HF support

### "Rate limit not working"

- The usage tracker keeps its counts in memory
- Counts therefore reset whenever the Space restarts
- Tracking is IP-based (works in production)

### "Slow inference"

- The first request allocates the GPU, so it is slower
- Subsequent requests use the cached allocation
- Normal: 2-5 seconds per request
## Cost Breakdown

- **Team Plan:** $20/user/month (you already have this)
- **ZeroGPU:** FREE (included)
- **Inference:** FREE (no API calls)
- **Storage:** FREE (model cached by HF)

**Total additional cost: $0/month**
## Updates & Maintenance

To update your Space:

```bash
# Make changes to code
git add .
git commit -m "Update: description of changes"
git push space main
```

The Space will automatically rebuild and redeploy.
## Monitoring Usage

Check your Space's metrics:

1. Go to the Space page
2. Click the "Analytics" tab
3. View daily/weekly usage stats
## Next Steps After Deployment

1. Test all 4 languages
2. Verify tool calling works
3. Check rate limiting
4. Monitor performance
5. Adjust system prompt if needed
6. Fine-tune temperature/max_tokens if needed
## Support

If you encounter issues:

- Check Space logs (Settings → Logs)
- HuggingFace Discord: https://discord.gg/huggingface
- HF Forums: https://discuss.huggingface.co/

---

**You're ready to deploy!**