# πŸš€ Deployment Guide for HuggingFace Space with ZeroGPU
## βœ… Pre-Deployment Checklist
All code is ready! Here's what's configured:
- βœ… Model: `microsoft/Phi-3-mini-4k-instruct` (3.8B params)
- βœ… ZeroGPU support: Enabled with `@spaces.GPU` decorator
- βœ… Local/Space compatibility: Auto-detects environment
- βœ… Usage tracking: 50 requests/day per user
- βœ… Requirements: All dependencies listed
- βœ… README: Updated with instructions
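The local/Space compatibility and ZeroGPU items above can be sketched as follows. This is a minimal illustration, not the app's actual code: the `gpu` alias and `generate` function are hypothetical names, and the real inference call is replaced by a placeholder. On a Space the `spaces` package provides the real `@spaces.GPU` decorator; locally it may be missing, so the sketch falls back to a no-op.

```python
# Hypothetical sketch of local/Space auto-detection with @spaces.GPU.
try:
    import spaces  # installed on HuggingFace Spaces with ZeroGPU
    gpu = spaces.GPU
except ImportError:
    # Local fallback: a no-op decorator that supports both
    # @gpu and @gpu(duration=...) call styles.
    def gpu(func=None, **kwargs):
        if func is None:
            return lambda f: f
        return func

@gpu
def generate(prompt: str) -> str:
    # Placeholder for the actual Phi-3-mini inference call
    return f"echo: {prompt}"
```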
## πŸ“‹ Deployment Steps
### Step 1: Push Code to Your Space
```bash
cd /Users/tom/code/cojournalist-data
# First-time setup (skip these two commands if the Space remote already exists)
git init
git remote add space https://huggingface.co/spaces/YOUR_USERNAME/cojournalist-data

# Commit and push
git add .
git commit -m "Deploy Phi-3-mini with ZeroGPU and usage tracking"
git push space main
```
### Step 2: Configure Space Hardware
1. Go to your Space: `https://huggingface.co/spaces/YOUR_USERNAME/cojournalist-data`
2. Click **Settings** (βš™οΈ icon in top right)
3. Scroll to **Hardware** section
4. Select **ZeroGPU** from dropdown
5. Click **Save**
6. Space will restart automatically
### Step 3: Wait for Build
The Space will:
1. Install dependencies (~2-3 minutes)
2. Download Phi-3-mini model (~1-2 minutes, 7.6GB)
3. Load model into memory (~30 seconds)
4. Launch Gradio interface
**Total build time: ~5-7 minutes**
### Step 4: Test Your Space
Once running, test with these queries:
1. **English:** "Who are the parliamentarians from Zurich?"
2. **German:** "Zeige mir aktuelle Abstimmungen zur Klimapolitik"
3. **French:** "Qui sont les parlementaires de Zurich?"
4. **Italian:** "Mostrami i voti recenti sulla politica climatica"
## πŸ”§ Space Settings Summary
### Hardware
- **Type:** ZeroGPU
- **Cost:** FREE (included with Team plan)
- **GPU:** NVIDIA H200 (70GB VRAM)
- **Allocation:** Dynamic (only when needed)
### Environment Variables (Optional)
None are required for this deployment; the only variable you might set is:
- `HF_TOKEN`: Your HuggingFace token (for private models, not needed for Phi-3)
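If the app ever needs the token (e.g. for a private model), it could read it like this. The helper name `get_hf_token` is hypothetical; a missing token is fine for a public model like Phi-3-mini.

```python
import os

def get_hf_token(env=os.environ):
    # Returns the configured token, or None when unset (fine for
    # public models). `env` is a parameter only to make this testable.
    return env.get("HF_TOKEN")

# Could then be passed along, e.g.:
#   AutoModelForCausalLM.from_pretrained(..., token=get_hf_token())
```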
## πŸ“Š Expected Behavior
### First Request
- Takes ~5-10 seconds (GPU allocation + inference)
- Subsequent requests faster (~2-5 seconds)
### Rate Limiting
- 50 requests per day per user IP
- Error message shown when limit reached
- Resets daily at midnight UTC
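The rate-limiting behavior above (50/day per IP, in-memory, UTC-midnight reset) can be sketched like this. The class and method names are hypothetical, not the app's actual code.

```python
from datetime import datetime, timezone

DAILY_LIMIT = 50

class UsageTracker:
    """In-memory per-IP daily counter; state is lost on restart."""

    def __init__(self, limit: int = DAILY_LIMIT):
        self.limit = limit
        self.counts = {}          # ip -> requests seen today
        self.day = self._today()

    def _today(self) -> str:
        # UTC date string; a date change means midnight UTC has passed
        return datetime.now(timezone.utc).strftime("%Y-%m-%d")

    def check(self, ip: str) -> bool:
        # Clear all counters once the UTC date rolls over
        today = self._today()
        if today != self.day:
            self.day = today
            self.counts.clear()
        self.counts[ip] = self.counts.get(ip, 0) + 1
        return self.counts[ip] <= self.limit
```

In a Gradio app, the client IP could come from the `gr.Request` object passed to the handler (e.g. `request.client.host`).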
### Model Loading
- Happens once on Space startup
- Cached for subsequent requests
- No reload needed between requests
## πŸ› Troubleshooting
### "Model not loading"
- Check Space logs for errors
- Verify ZeroGPU is selected in Hardware settings
- Ensure `spaces>=0.28.0` is listed in requirements.txt
### "Out of memory"
- This shouldn't happen with ZeroGPU (70GB VRAM)
- If it does, contact HF support
### "Rate limit not working"
- Usage tracker uses in-memory storage
- Resets on Space restart
- Tracking is keyed by client IP, so it is only meaningful in the deployed Space (all local requests share one IP)
### "Slow inference"
- First request allocates GPU (slower)
- Subsequent requests use cached allocation
- Normal: 2-5 seconds per request
## πŸ’° Cost Breakdown
- **Team Plan:** $20/user/month (you already have this)
- **ZeroGPU:** FREE (included)
- **Inference:** FREE (no API calls)
- **Storage:** FREE (model cached by HF)
**Total additional cost: $0/month** πŸŽ‰
## πŸ”„ Updates & Maintenance
To update your Space:
```bash
# Make changes to code
git add .
git commit -m "Update: description of changes"
git push space main
```
Space will automatically rebuild and redeploy.
## πŸ“ˆ Monitoring Usage
Check your Space's metrics:
1. Go to Space page
2. Click "Analytics" tab
3. View daily/weekly usage stats
## 🎯 Next Steps After Deployment
1. βœ… Test all 4 languages
2. βœ… Verify tool calling works
3. βœ… Check rate limiting
4. βœ… Monitor performance
5. πŸ”œ Adjust system prompt if needed
6. πŸ”œ Fine-tune temperature/max_tokens if needed
## πŸ“ž Support
If you encounter issues:
- Check Space logs (Settings β†’ Logs)
- HuggingFace Discord: https://discord.gg/huggingface
- HF Forums: https://discuss.huggingface.co/
---
**You're ready to deploy! πŸš€**