| # Deployment Guide - Hugging Face Spaces | |
| ## Complete step-by-step guide to deploy Madrid Content Analyzer on Hugging Face Spaces | |
| **Cost: $0/month on the free CPU basic hardware.** | |
| --- | |
| ## Prerequisites | |
| 1. Hugging Face account (free!) | |
| 2. Git installed on your computer | |
| 3. Your Aclarador code ready | |
| **Time needed**: 30-60 minutes | |
| --- | |
| ## Step 1: Create Hugging Face Account (5 minutes) | |
| ### 1.1 Sign Up | |
| ``` | |
| 1. Go to https://huggingface.co | |
| 2. Click "Sign Up" | |
| 3. Use email or GitHub | |
| 4. Verify your email | |
| ``` | |
| **No credit card required!** | |
| ### 1.2 Get Access Token | |
| ``` | |
| 1. Go to https://huggingface.co/settings/tokens | |
| 2. Click "New token" | |
| 3. Name it "madrid-analyzer" | |
| 4. Select "write" permissions | |
| 5. Click "Generate" | |
| 6. Copy the token (save it safely!) | |
| ``` | |
| --- | |
| ## Step 2: Create Your Space (3 minutes) | |
| ### 2.1 Create Space | |
| ``` | |
| 1. Go to https://huggingface.co/new-space | |
| 2. Fill in: | |
| - Owner: your username | |
| - Space name: madrid-content-analyzer | |
| - License: MIT | |
| - Select SDK: Gradio | |
| - SDK version: 4.44.0 | |
| - Hardware: CPU basic (free!) | |
| - Visibility: Public (or Private if you prefer) | |
| 3. Click "Create Space" | |
| ``` | |
| ### 2.2 Note Your Space URL | |
| Your space will be at: | |
| ``` | |
| https://huggingface.co/spaces/YOUR_USERNAME/madrid-content-analyzer | |
| ``` | |
| --- | |
| ## Step 3: Clone and Setup Locally (10 minutes) | |
| ### 3.1 Clone Your Space | |
| ```bash | |
| # Clone the empty space | |
| git clone https://huggingface.co/spaces/YOUR_USERNAME/madrid-content-analyzer | |
| cd madrid-content-analyzer | |
| # Or if it asks for credentials: | |
| git clone https://YOUR_USERNAME:YOUR_TOKEN@huggingface.co/spaces/YOUR_USERNAME/madrid-content-analyzer | |
| ``` | |
| ### 3.2 Copy the Adapted Code | |
| ```bash | |
| # You have two options: | |
| # Option A: Download from outputs folder | |
| # Copy everything from /mnt/user-data/outputs/madrid-analyzer-hf/ | |
| # to your local madrid-content-analyzer/ folder | |
| # Option B: Copy manually | |
| # Files needed (I've created them for you): | |
| # - app.py | |
| # - requirements.txt | |
| # - README.md | |
| # - config/database.py | |
| # - config/settings.py | |
| # - storage/repository.py | |
| # - storage/models.py | |
| # - fetchers/rss_fetcher.py | |
| # - fetchers/api_fetcher.py | |
| # - analyzers/analyzer_wrapper.py | |
| # - schedulers/background_tasks.py | |
| # - utils/logger.py | |
| # - utils/text_cleaner.py | |
| ``` | |
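Before committing, it can help to confirm that nothing from the list above was missed. The following is a minimal sketch; the helper name `missing_files` is hypothetical and the file list is copied from the list above.

```python
from pathlib import Path

# File list taken from the guide above; adjust it if your layout differs.
REQUIRED_FILES = [
    "app.py",
    "requirements.txt",
    "README.md",
    "config/database.py",
    "config/settings.py",
    "storage/repository.py",
    "storage/models.py",
    "fetchers/rss_fetcher.py",
    "fetchers/api_fetcher.py",
    "analyzers/analyzer_wrapper.py",
    "schedulers/background_tasks.py",
    "utils/logger.py",
    "utils/text_cleaner.py",
]


def missing_files(repo_root: str) -> list[str]:
    """Return the required files that are absent under repo_root."""
    root = Path(repo_root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
```

Calling `missing_files(".")` from inside your local `madrid-content-analyzer/` folder should return an empty list once everything is copied.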
| ### 3.3 Add Your Aclarador Code | |
| ```bash | |
| # Create analyzers directory if not exists | |
| mkdir -p analyzers/aclarador | |
| # Copy your Aclarador code | |
| cp -r /path/to/your/aclarador/* analyzers/aclarador/ | |
| # Your Aclarador analyzer should be importable as: | |
| # from analyzers.aclarador import your_analysis_function | |
| ``` | |
| ### 3.4 Update analyzer_wrapper.py | |
| Edit `analyzers/analyzer_wrapper.py` to integrate your Aclarador: | |
| ```python | |
| # Import your actual Aclarador function | |
| from analyzers.aclarador.your_module import analyze_text | |
| class AclaradorAnalyzer: | |
|     def analyze(self, text, title=None): | |
|         # Call your actual analysis function | |
|         result = analyze_text(text) | |
|         # Map to expected format | |
|         return { | |
|             'overall_score': result.get('clarity_score', 0), | |
|             'readability_score': result.get('readability', 0), | |
|             'complexity_score': result.get('complexity', 0), | |
|             'sentence_stats': result.get('sentence_analysis', {}), | |
|             'vocabulary_stats': result.get('vocabulary', {}), | |
|             'jargon_count': len(result.get('jargon_terms', [])), | |
|             'jargon_words': result.get('jargon_terms', []), | |
|             'suggestions': result.get('recommendations', []) | |
|         } | |
| ``` | |
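To check the mapping without the real Aclarador code, the same key translation can be pulled into a standalone function and fed a hand-written result. This is a sketch only: `map_aclarador_result` is a hypothetical name, and the input keys are the ones assumed by the wrapper above, which may not match what your Aclarador actually returns.

```python
def map_aclarador_result(result: dict) -> dict:
    """Translate an Aclarador-style result dict into the format the app expects.

    Missing keys fall back to neutral defaults, as in the wrapper above.
    """
    jargon = result.get("jargon_terms", [])
    return {
        "overall_score": result.get("clarity_score", 0),
        "readability_score": result.get("readability", 0),
        "complexity_score": result.get("complexity", 0),
        "sentence_stats": result.get("sentence_analysis", {}),
        "vocabulary_stats": result.get("vocabulary", {}),
        "jargon_count": len(jargon),
        "jargon_words": jargon,
        "suggestions": result.get("recommendations", []),
    }


# Hand-written sample standing in for a real analysis result.
sample = {"clarity_score": 80, "jargon_terms": ["sinergia", "stakeholder"]}
mapped = map_aclarador_result(sample)
```

If your analyzer uses different key names, only the `result.get(...)` lookups need to change.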
| --- | |
| ## Step 4: Deploy to Hugging Face (5 minutes) | |
| ### 4.1 Configure Git | |
| ```bash | |
| # Set your git email and name | |
| git config user.email "you@example.com" | |
| git config user.name "Your Name" | |
| ``` | |
| ### 4.2 Commit Everything | |
| ```bash | |
| # Add all files | |
| git add . | |
| # Commit | |
| git commit -m "Initial deployment of Madrid Content Analyzer" | |
| ``` | |
| ### 4.3 Push to Hugging Face | |
| ```bash | |
| # Push to deploy | |
| git push | |
| # If it asks for credentials: | |
| # Username: YOUR_USERNAME | |
| # Password: YOUR_TOKEN (from Step 1.2) | |
| ``` | |
| ### 4.4 Watch Build | |
| ``` | |
| 1. Go to your Space URL | |
| 2. You'll see "Building..." status | |
| 3. Watch the logs | |
| 4. Takes 2-5 minutes | |
| ``` | |
| --- | |
| ## Step 5: Verify Deployment (5 minutes) | |
| ### 5.1 Check Space is Running | |
| ``` | |
| 1. Go to your Space URL | |
| 2. You should see the Gradio interface | |
| 3. It might show "Waiting for app to start..." | |
| 4. Give it 30-60 seconds | |
| ``` | |
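Instead of reloading the page by hand, the wait can be scripted. The sketch below uses only the standard library; `space_is_up` and `wait_until_ready` are hypothetical helpers, and the defaults (12 attempts, 5 seconds apart) are chosen to cover the 30-60 second window mentioned above.

```python
import time
import urllib.request


def space_is_up(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError and timeouts are all OSError subclasses
        return False


def wait_until_ready(check, attempts: int = 12, delay: float = 5.0, sleep=time.sleep) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

A typical call would be `wait_until_ready(lambda: space_is_up("https://YOUR_USERNAME-madrid-content-analyzer.hf.space"))`.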
| ### 5.2 Test Dashboard | |
| ``` | |
| 1. Click "Dashboard" tab | |
| 2. Click "Refresh Statistics" | |
| 3. Should show initial stats (might be empty) | |
| ``` | |
| ### 5.3 Trigger First Fetch | |
| ``` | |
| 1. Go to "Settings" tab | |
| 2. Click "Trigger Manual Fetch" | |
| 3. Wait 1-2 minutes | |
| 4. Go back to Dashboard | |
| 5. Click refresh - you should see data! | |
| ``` | |
| --- | |
| ## Step 6: Configure Secrets (Optional) | |
| If your Aclarador needs API keys or configuration: | |
| ### 6.1 Add Secrets | |
| ``` | |
| 1. Go to your Space settings | |
| 2. Click "Repository secrets" | |
| 3. Add secrets: | |
| - Name: ACLARADOR_API_KEY | |
| - Value: your-key | |
| 4. Restart space | |
| ``` | |
| ### 6.2 Access Secrets in Code | |
| ```python | |
| import os | |
| api_key = os.getenv('ACLARADOR_API_KEY') | |
| ``` | |
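`os.getenv` returns `None` when a secret is missing, which tends to surface later as a confusing error. A small guard makes the failure explicit at startup. This is a sketch; `require_secret` is a hypothetical helper name.

```python
import os


def require_secret(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Secret {name!r} is not set. Add it in the Space settings and restart the Space."
        )
    return value
```

In `app.py` this would be used as `api_key = require_secret('ACLARADOR_API_KEY')`.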
| --- | |
| ## Step 7: Make Space Public/Private | |
| ### 7.1 Public Space (Recommended) | |
| ``` | |
| - Anyone can view | |
| - Good for showcasing | |
| - Free forever | |
| ``` | |
| ### 7.2 Private Space | |
| ``` | |
| 1. Go to Settings | |
| 2. Change visibility to "Private" | |
| 3. Only you can access | |
| 4. Still free! | |
| ``` | |
| --- | |
| ## You're Live! | |
| Your Madrid Content Analyzer is now running **FREE forever** on Hugging Face Spaces! | |
| **Your Space URL**: | |
| ``` | |
| https://huggingface.co/spaces/YOUR_USERNAME/madrid-content-analyzer | |
| ``` | |
| **Share it**: | |
| ``` | |
| https://YOUR_USERNAME-madrid-content-analyzer.hf.space | |
| ``` | |
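Both URLs follow from the owner and Space name, so they can be generated rather than typed. The sketch below assumes the direct subdomain is the lowercased `owner-space` string with any character other than letters, digits, and hyphens replaced by a hyphen; verify the result against the URL shown on your Space page. `space_urls` is a hypothetical helper.

```python
import re


def space_urls(username: str, space_name: str) -> dict:
    """Return the repository page URL and the direct app URL for a Space."""
    repo_url = f"https://huggingface.co/spaces/{username}/{space_name}"
    # Assumption: the subdomain is "<owner>-<space>", lowercased, with
    # unsupported characters (for example underscores) turned into hyphens.
    slug = re.sub(r"[^a-z0-9-]", "-", f"{username}-{space_name}".lower())
    return {"repo": repo_url, "app": f"https://{slug}.hf.space"}
```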
| --- | |
| ## Post-Deployment Configuration | |
| ### Update Fetch Frequency | |
| Edit `app.py` line ~35: | |
| ```python | |
| scheduler.add_job( | |
|     fetch_and_analyze_content, | |
|     'interval', | |
|     hours=6,  # Change this! (1, 6, 12, 24) | |
|     id='content_fetch' | |
| ) | |
| ``` | |
| Commit and push: | |
| ```bash | |
| git add app.py | |
| git commit -m "Update fetch frequency" | |
| git push | |
| ``` | |
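If you expect to change the frequency often, one option is to read it from an environment variable so that no code edit is needed. This is a sketch under assumptions: the variable name `FETCH_INTERVAL_HOURS` and the helper `fetch_interval_hours` are hypothetical, and the allowed values mirror the ones suggested in the comment above.

```python
import os

ALLOWED_HOURS = (1, 6, 12, 24)


def fetch_interval_hours(default: int = 6) -> int:
    """Read the fetch interval from the environment, falling back to `default`."""
    raw = os.getenv("FETCH_INTERVAL_HOURS")  # hypothetical variable name
    if raw is None:
        return default
    try:
        hours = int(raw)
    except ValueError:
        return default
    # Ignore values outside the supported set rather than scheduling oddly.
    return hours if hours in ALLOWED_HOURS else default
```

The scheduler call would then pass `hours=fetch_interval_hours()` instead of the literal `6`.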
| ### Update Data Retention | |
| Add cleanup job in `app.py`: | |
| ```python | |
| scheduler.add_job( | |
|     cleanup_old_data, | |
|     'interval', | |
|     days=7,  # Run weekly | |
|     id='cleanup' | |
| ) | |
| ``` | |
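The guide does not show what `cleanup_old_data` does internally. One plausible shape is a parameterized delete of rows older than a retention window, sketched below. Everything specific here is an assumption: the table `articles`, the ISO-8601 text column `fetched_at`, the 30-day default, and the helper name `delete_older_than`. The sketch was written against the standard library's `sqlite3` as a stand-in, so check commit and row-count behaviour before using it with DuckDB.

```python
from datetime import datetime, timedelta, timezone


def delete_older_than(conn, retention_days: int = 30, now=None) -> int:
    """Delete rows whose fetched_at is older than the retention window.

    `conn` is a DB-API style connection supporting execute() with "?"
    placeholders. Timestamps are assumed to be stored as UTC ISO-8601
    strings, which sort correctly as plain text.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat(timespec="seconds")
    cursor = conn.execute("DELETE FROM articles WHERE fetched_at < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount
```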
| --- | |
| ## Monitoring Your Space | |
| ### Check Logs | |
| ``` | |
| 1. Go to your Space | |
| 2. Click "App files" → "Logs" | |
| 3. See real-time logs | |
| ``` | |
| ### Check Database Size | |
| ``` | |
| 1. Go to "Settings" tab in your app | |
| 2. Click "Refresh Database Stats" | |
| 3. See "Database Size" in MB | |
| ``` | |
| ### Space is at 16GB limit? | |
| ``` | |
| 1. Go to Settings tab | |
| 2. Run cleanup manually | |
| 3. Or decrease data retention period | |
| ``` | |
| --- | |
| ## Updating Your Space | |
| ### Update Code | |
| ```bash | |
| # Make changes locally | |
| # Edit files | |
| # Commit and push | |
| git add . | |
| git commit -m "Update: your changes" | |
| git push | |
| # Space rebuilds automatically! | |
| ``` | |
| ### Update Dependencies | |
| ```bash | |
| # Edit requirements.txt | |
| # Add/update packages | |
| # Commit and push | |
| git add requirements.txt | |
| git commit -m "Update dependencies" | |
| git push | |
| ``` | |
| --- | |
| ## Troubleshooting | |
| ### Space won't start | |
| **Check logs**: Look for errors in Space logs | |
| **Common issues**: | |
| - Missing dependencies in requirements.txt | |
| - Import errors in code | |
| - Database permission issues | |
| **Solution**: | |
| ```bash | |
| # Check requirements.txt has all deps | |
| pip install -r requirements.txt # Test locally first | |
| # Check imports work | |
| python app.py # Test locally | |
| ``` | |
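To spot missing dependencies before pushing, you can compare `requirements.txt` with what is installed in your local environment. The sketch below uses `importlib.metadata` from the standard library; both helper names are hypothetical, and the parser only handles simple `name==version` style lines, not URLs or editable installs.

```python
import re
from importlib import metadata


def requirement_names(text: str) -> list[str]:
    """Extract distribution names from simple requirements.txt content."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_distributions(names: list[str]) -> list[str]:
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, `missing_distributions(requirement_names(open("requirements.txt").read()))` lists what still needs installing locally. It does not detect packages that your code imports but that are absent from `requirements.txt`.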
| ### Database not persisting | |
| **Issue**: Data disappears after restart | |
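| **Solution**: Store the database under the `/data/` directory. Note that `/data` survives restarts only when persistent storage is enabled in the Space settings. | |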
| ```python | |
| DB_PATH = '/data/madrid.duckdb'  # Correct: persistent directory | |
| # DB_PATH = 'madrid.duckdb'      # Wrong: the working directory is ephemeral! | |
| ``` | |
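Hard-coding `/data/` breaks local testing, where that directory usually does not exist. A small resolver can pick the persistent directory when it is available and fall back to the working directory otherwise. This is a sketch; `resolve_db_path` is a hypothetical helper, and the fallback path is ephemeral on Spaces.

```python
import os


def resolve_db_path(persistent_dir: str = "/data", filename: str = "madrid.duckdb") -> str:
    """Use the persistent directory if it exists and is writable, else the local file."""
    if os.path.isdir(persistent_dir) and os.access(persistent_dir, os.W_OK):
        return os.path.join(persistent_dir, filename)
    # Fallback for local development; data stored here will not survive a restart.
    return filename
```

Logging the resolved path at startup makes it easy to see in the Space logs which location is in use.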
| ### Scheduler not running | |
| **Issue**: No automatic fetches | |
| **Check**: Confirm that the background scheduler is actually started | |
| ```python | |
| scheduler.start() # Make sure this is called! | |
| ``` | |
| ### Out of memory | |
| **Issue**: Space crashes with memory error | |
| **Solution**: | |
| 1. Reduce fetch batch size | |
| 2. Add pagination to queries | |
| 3. Upgrade to better hardware (paid) | |
| ### Import errors | |
| **Issue**: Can't import Aclarador | |
| **Solution**: | |
| ``` | |
| # Check your analyzer structure | |
| analyzers/ | |
|   aclarador/ | |
|     __init__.py  # Make sure this exists! | |
|     your_code.py | |
| ``` | |
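The structure check can be automated. The sketch below walks a directory tree and reports every folder that holds `.py` files but has no `__init__.py`; `packages_missing_init` is a hypothetical helper name.

```python
from pathlib import Path


def packages_missing_init(root: str) -> list[str]:
    """Return sub-directories of root that contain .py files but no __init__.py."""
    base = Path(root)
    missing = []
    for directory in sorted(p for p in base.rglob("*") if p.is_dir()):
        has_python = any(child.suffix == ".py" for child in directory.iterdir())
        if has_python and not (directory / "__init__.py").exists():
            missing.append(directory.relative_to(base).as_posix())
    return missing
```

Running `packages_missing_init(".")` from the repository root should return an empty list once the `analyzers/aclarador/` package is set up as shown above.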
| --- | |
| ## Pro Tips | |
| ### Tip 1: Test Locally First | |
| ```bash | |
| # Before pushing, test locally | |
| python app.py | |
| # Visit http://localhost:7860 | |
| # Make sure everything works! | |
| ``` | |
| ### Tip 2: Use .gitignore | |
| Create `.gitignore`: | |
| ``` | |
| __pycache__/ | |
| *.pyc | |
| .env | |
| *.duckdb | |
| .DS_Store | |
| ``` | |
| ### Tip 3: Add Status Badge | |
| Add to your Space README.md: | |
| ```markdown | |
|  | |
| ``` | |
| ### Tip 4: Monitor Resource Usage | |
| HF Spaces shows CPU and memory usage in the Space settings. | |
| ### Tip 5: Version Your Data | |
| Before major changes: | |
| ```python | |
| # Export data | |
| export_data('csv') | |
| # Make changes | |
| # Can restore if needed | |
| ``` | |
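The guide calls `export_data('csv')` without showing it. If you need your own export, a timestamped CSV dump can be written with the standard library alone. This is a sketch, not the app's actual function: `export_rows_to_csv`, its parameters, and the file naming scheme are all assumptions.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def export_rows_to_csv(rows, fieldnames, out_dir=".", now=None) -> Path:
    """Write a list of dicts to a timestamped CSV file and return its path."""
    now = now or datetime.now(timezone.utc)
    path = Path(out_dir) / f"export-{now:%Y%m%dT%H%M%SZ}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

Keeping the timestamp in the file name lets several exports sit side by side, so an earlier snapshot is still there if a change goes wrong.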
| --- | |
| ## Scaling Up (If Needed) | |
| ### If You Outgrow Free Tier | |
| **Paid Hardware Options**: | |
| - **CPU Upgrade**: $0.03/hour (~$22/month) | |
| - **Basic GPU**: $0.60/hour (~$432/month) | |
| **But you probably won't need it!** | |
| - Free tier handles 100K+ items easily | |
| - DuckDB is very efficient | |
| - 16GB is plenty | |
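The monthly figures above come from running the hardware around the clock for a 30-day month. A short calculation makes it easy to redo the estimate for other hourly rates or for part-time use. `monthly_cost` is a hypothetical helper, and the rates are the ones quoted above, which may have changed.

```python
def monthly_cost(hourly_rate: float, hours_per_day: float = 24, days: int = 30) -> float:
    """Estimate the monthly cost in dollars for hardware billed by the hour."""
    return round(hourly_rate * hours_per_day * days, 2)


cpu_upgrade = monthly_cost(0.03)   # always on, 30 days
basic_gpu = monthly_cost(0.60)     # always on, 30 days
gpu_workday = monthly_cost(0.60, hours_per_day=8)  # only 8 hours per day
```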
| --- | |
| ## Deployment Checklist | |
| ### Before Deployment | |
| - [ ] Hugging Face account created | |
| - [ ] Access token generated | |
| - [ ] Space created | |
| - [ ] Code copied to local clone | |
| - [ ] Aclarador integrated | |
| - [ ] Tested locally | |
| ### During Deployment | |
| - [ ] All files committed | |
| - [ ] Pushed to Hugging Face | |
| - [ ] Build successful | |
| - [ ] App starts without errors | |
| ### After Deployment | |
| - [ ] Dashboard loads | |
| - [ ] Manual fetch works | |
| - [ ] Data persists | |
| - [ ] Scheduler running | |
| - [ ] Analysis working | |
| ### Post-Launch | |
| - [ ] Set visibility (public/private) | |
| - [ ] Share URL | |
| - [ ] Monitor first few fetches | |
| - [ ] Check database size | |
| - [ ] Verify automatic fetches | |
| --- | |
| ## You Did It! | |
| You now have a **completely free** Madrid Content Analyzer running 24/7! | |
| **What you saved**: | |
| - Heroku: $84-168/year | |
| - Server costs: $0/month | |
| - Database: $0/month (vs $5-60/month) | |
| **What you got**: | |
| - Modern Gradio interface | |
| - Fast DuckDB analytics | |
| - 16GB storage | |
| - Always-on service | |
| - Beautiful visualizations | |
| --- | |
| ## Need Help? | |
| **Hugging Face Community**: | |
| - Forums: https://discuss.huggingface.co | |
| - Discord: https://hf.co/join/discord | |
| - Documentation: https://huggingface.co/docs/hub/spaces | |
| **Check Your Space Logs**: | |
| - App files → Logs | |
| - See errors in real-time | |
| --- | |
| **Congratulations! You're now running on Hugging Face Spaces!** | |
| **Next**: Share your Space URL and start analyzing Madrid's content! | |