madriClaro / DEPLOYMENT_GUIDE.md

Ruben

Integrate Aclarador with Groq API for clarity analysis

28aa7d9 about 1 month ago

🚀 Deployment Guide - Hugging Face Spaces

Complete step-by-step guide to deploy Madrid Content Analyzer on Hugging Face Spaces

Cost: $0/month on the free CPU basic tier! 🎉

📋 Prerequisites

✅ Hugging Face account (free!)
✅ Git installed on your computer
✅ Your Aclarador code ready

Time needed: 30-60 minutes

Step 1: Create Hugging Face Account (5 minutes)

1.1 Sign Up

1. Go to https://huggingface.co
2. Click "Sign Up"
3. Use email or GitHub
4. Verify your email

No credit card required! ✅

1.2 Get Access Token

1. Go to https://huggingface.co/settings/tokens
2. Click "New token"
3. Name it "madrid-analyzer"
4. Select "write" permissions
5. Click "Generate"
6. Copy the token (save it safely!)

Step 2: Create Your Space (3 minutes)

2.1 Create Space

1. Go to https://huggingface.co/new-space
2. Fill in:
   - Owner: your username
   - Space name: madrid-content-analyzer
   - License: MIT
   - Select SDK: Gradio
   - SDK version: 4.44.0
   - Hardware: CPU basic (free!)
   - Visibility: Public (or Private if you prefer)
3. Click "Create Space"

2.2 Note Your Space URL

Your space will be at:

https://huggingface.co/spaces/YOUR_USERNAME/madrid-content-analyzer

Step 3: Clone and Setup Locally (10 minutes)

3.1 Clone Your Space

# Clone the empty space
git clone https://huggingface.co/spaces/YOUR_USERNAME/madrid-content-analyzer
cd madrid-content-analyzer

# Or if it asks for credentials:
git clone https://YOUR_USERNAME:YOUR_TOKEN@huggingface.co/spaces/YOUR_USERNAME/madrid-content-analyzer
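The authenticated clone URL puts the username and the access token from Step 1.2 in front of the host. As a minimal sketch (the function name `build_clone_url` is an illustration, not part of the project), the URL can be assembled like this, with credentials percent-encoded so that special characters cannot break it:

```python
from typing import Optional
from urllib.parse import quote


def build_clone_url(username: str, space: str, token: Optional[str] = None) -> str:
    """Return the HTTPS clone URL for a Space, optionally embedding a token."""
    path = f"huggingface.co/spaces/{username}/{space}"
    if token is None:
        # Anonymous form: git will prompt for credentials if it needs them.
        return f"https://{path}"
    # Percent-encode the credentials so characters like '@' or ':' stay valid.
    user = quote(username, safe="")
    secret = quote(token, safe="")
    return f"https://{user}:{secret}@{path}"
```

Keep in mind that a token embedded in the clone URL is saved in plain text in `.git/config`, so the prompt-based form is the safer default on shared machines.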

3.2 Copy the Adapted Code

# You have two options:

# Option A: Download from outputs folder
# Copy everything from /mnt/user-data/outputs/madrid-analyzer-hf/
# to your local madrid-content-analyzer/ folder

# Option B: Copy manually
# Files needed (I've created them for you):
# - app.py
# - requirements.txt  
# - README.md
# - config/database.py
# - config/settings.py
# - storage/repository.py
# - storage/models.py
# - fetchers/rss_fetcher.py
# - fetchers/api_fetcher.py
# - analyzers/analyzer_wrapper.py
# - schedulers/background_tasks.py
# - utils/logger.py
# - utils/text_cleaner.py
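Before committing, it can help to confirm that every file from the list above actually landed in the local clone. The following is a small sketch using only the standard library; the helper name `missing_files` is hypothetical, and the file list is copied from this guide:

```python
from pathlib import Path

# File list taken from Step 3.2 of this guide.
REQUIRED_FILES = [
    "app.py",
    "requirements.txt",
    "README.md",
    "config/database.py",
    "config/settings.py",
    "storage/repository.py",
    "storage/models.py",
    "fetchers/rss_fetcher.py",
    "fetchers/api_fetcher.py",
    "analyzers/analyzer_wrapper.py",
    "schedulers/background_tasks.py",
    "utils/logger.py",
    "utils/text_cleaner.py",
]


def missing_files(root, required=REQUIRED_FILES):
    """Return the required files that are not present under `root`."""
    root = Path(root)
    return [name for name in required if not (root / name).is_file()]
```

Running `missing_files(".")` from inside the cloned folder should return an empty list once everything is copied.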

3.3 Add Your Aclarador Code

# Create analyzers directory if not exists
mkdir -p analyzers/aclarador

# Copy your Aclarador code
cp -r /path/to/your/aclarador/* analyzers/aclarador/

# Your Aclarador analyzer should be importable as:
# from analyzers.aclarador import your_analysis_function

3.4 Update analyzer_wrapper.py

Edit analyzers/analyzer_wrapper.py to integrate your Aclarador:

# Import your actual Aclarador function
from analyzers.aclarador.your_module import analyze_text

class AclaradorAnalyzer:
    def analyze(self, text, title=None):
        # Call your actual analysis function
        result = analyze_text(text)
        
        # Map to expected format
        return {
            'overall_score': result.get('clarity_score', 0),
            'readability_score': result.get('readability', 0),
            'complexity_score': result.get('complexity', 0),
            'sentence_stats': result.get('sentence_analysis', {}),
            'vocabulary_stats': result.get('vocabulary', {}),
            'jargon_count': len(result.get('jargon_terms', [])),
            'jargon_words': result.get('jargon_terms', []),
            'suggestions': result.get('recommendations', [])
        }
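To see how the mapping behaves when the underlying analyzer returns only some of the expected keys, here is a self-contained sketch. The `analyze_text` stub below is a hypothetical stand-in for the real Aclarador function, and the "long word" jargon rule exists only to produce sample output:

```python
def analyze_text(text):
    """Hypothetical stand-in for the real Aclarador analysis function."""
    words = text.split()
    return {
        "clarity_score": 80,
        # Toy rule for the demo only: treat very long words as jargon.
        "jargon_terms": [w for w in words if len(w) > 12],
    }


class AclaradorAnalyzer:
    def analyze(self, text, title=None):
        # Guard against an analyzer that returns None instead of a dict.
        result = analyze_text(text) or {}
        jargon = result.get("jargon_terms") or []
        return {
            "overall_score": result.get("clarity_score", 0),
            "readability_score": result.get("readability", 0),
            "complexity_score": result.get("complexity", 0),
            "sentence_stats": result.get("sentence_analysis", {}),
            "vocabulary_stats": result.get("vocabulary", {}),
            "jargon_count": len(jargon),
            "jargon_words": jargon,
            "suggestions": result.get("recommendations", []),
        }
```

Keys the analyzer does not supply fall back to neutral defaults (0, an empty dict, or an empty list), so the dashboard code can rely on every key being present.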

Step 4: Deploy to Hugging Face (5 minutes)

4.1 Configure Git

# Set your git email and name
git config user.email "YOUR_EMAIL"
git config user.name "Your Name"

4.2 Commit Everything

# Add all files
git add .

# Commit
git commit -m "Initial deployment of Madrid Content Analyzer"

4.3 Push to Hugging Face

# Push to deploy
git push

# If it asks for credentials:
# Username: YOUR_USERNAME
# Password: YOUR_TOKEN (from Step 1.2)

4.4 Watch Build

1. Go to your Space URL
2. You'll see "Building..." status
3. Watch the logs
4. Takes 2-5 minutes

Step 5: Verify Deployment (5 minutes)

5.1 Check Space is Running

1. Go to your Space URL
2. You should see the Gradio interface
3. It might show "Waiting for app to start..."
4. Give it 30-60 seconds

5.2 Test Dashboard

1. Click "Dashboard" tab
2. Click "🔄 Refresh Statistics"
3. Should show initial stats (might be empty)

5.3 Trigger First Fetch

1. Go to "Settings" tab
2. Click "🔄 Trigger Manual Fetch"
3. Wait 1-2 minutes
4. Go back to Dashboard
5. Click refresh - you should see data!

Step 6: Configure Secrets (Optional)

If your Aclarador needs API keys or configuration:

6.1 Add Secrets

1. Go to your Space settings
2. Open the "Variables and secrets" section (shown as "Repository secrets" in older versions of the UI)
3. Add secrets:
   - Name: ACLARADOR_API_KEY
   - Value: your-key
4. Restart space

6.2 Access Secrets in Code

import os

api_key = os.getenv('ACLARADOR_API_KEY')
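`os.getenv` returns `None` when the secret is missing, which tends to surface later as a confusing error. A hedged sketch of a stricter lookup follows; the helper name `get_required_secret` is an assumption for illustration, not something the project defines:

```python
import os


def get_required_secret(name, env=None):
    """Return the named secret, or raise a clear error if it is unset or blank."""
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if not value:
        raise RuntimeError(
            f"Secret {name!r} is not set. Add it in the Space settings and restart."
        )
    return value
```

Calling `get_required_secret("ACLARADOR_API_KEY")` at startup makes a missing key visible in the build logs right away.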

Step 7: Make Space Public/Private

7.1 Public Space (Recommended)

- Anyone can view
- Good for showcasing
- Free forever

7.2 Private Space

1. Go to Settings
2. Change visibility to "Private"
3. Only you can access
4. Still free!

🎉 You're Live!

Your Madrid Content Analyzer is now running FREE forever on Hugging Face Spaces!

Your Space URL:

https://huggingface.co/spaces/YOUR_USERNAME/madrid-content-analyzer

Share it:

https://YOUR_USERNAME-madrid-content-analyzer.hf.space
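Both URLs follow from the username and the Space name. The sketch below derives them; the helper name `space_urls` is hypothetical, and the subdomain normalisation (lowercasing and replacing runs of other characters with a hyphen) is an assumption about how Hugging Face builds the `hf.space` address, so check the result against the "Embed this Space" dialog:

```python
import re


def space_urls(username, space_name):
    """Return (Space page URL, direct app URL) for a username and Space name."""
    page = f"https://huggingface.co/spaces/{username}/{space_name}"
    # Assumed rule: lowercase, non-alphanumeric runs collapsed to '-'.
    raw = f"{username}-{space_name}".lower()
    subdomain = re.sub(r"[^a-z0-9]+", "-", raw).strip("-")
    return page, f"https://{subdomain}.hf.space"
```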

🔧 Post-Deployment Configuration

Update Fetch Frequency

Edit app.py line ~35:

scheduler.add_job(
    fetch_and_analyze_content,
    'interval',
    hours=6,  # Change this! (1, 6, 12, 24)
    id='content_fetch'
)

Commit and push:

git add app.py
git commit -m "Update fetch frequency"
git push
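If you would rather change the interval without editing code, one option is to read it from an environment variable. This is only a sketch: the variable name `FETCH_INTERVAL_HOURS` and the helper are assumptions, and the allowed values are the ones suggested in the comment above:

```python
import os

# Values suggested in the guide's comment: 1, 6, 12, 24.
ALLOWED_FETCH_HOURS = (1, 6, 12, 24)


def fetch_interval_hours(env=None, default=6):
    """Read the fetch interval from the environment, falling back to `default`."""
    env = os.environ if env is None else env
    raw = env.get("FETCH_INTERVAL_HOURS") or ""
    try:
        hours = int(raw)
    except ValueError:
        # Unset or non-numeric value: keep the default.
        return default
    return hours if hours in ALLOWED_FETCH_HOURS else default
```

The result could then be passed as `hours=fetch_interval_hours()` in the `scheduler.add_job` call.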

Update Data Retention

Add cleanup job in app.py:

scheduler.add_job(
    cleanup_old_data,
    'interval',
    days=7,  # Run weekly
    id='cleanup'
)
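The guide does not show `cleanup_old_data` itself. As a rough sketch of what such a job might do, the example below deletes rows older than a retention window. It uses the standard-library `sqlite3` module as a stand-in for DuckDB, and the table name `content`, the column `fetched_at` (ISO-8601 UTC text), and the `retention_days` parameter are all assumptions:

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def cleanup_old_data(conn, retention_days=30, now=None):
    """Delete rows fetched before the retention cutoff; return how many were removed."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    # ISO-8601 strings in the same timezone sort chronologically,
    # so a plain string comparison is enough here.
    cursor = conn.execute("DELETE FROM content WHERE fetched_at < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount
```

Because the scheduler calls the job without arguments, the real function would need to open its own connection or be wrapped in a small function that supplies one.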

📊 Monitoring Your Space

Check Logs

1. Go to your Space
2. Click "App files" → "Logs"
3. See real-time logs

Check Database Size

1. Go to "Settings" tab in your app
2. Click "Refresh Database Stats"
3. See "Database Size" in MB

Database growing too large? (The 16GB figure for CPU basic is RAM, not disk.)

1. Go to Settings tab
2. Run cleanup manually
3. Or decrease data retention period
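The "Database Size" figure can also be checked from code. A minimal sketch, assuming the database is a single file on disk (the helper name `database_size_mb` is hypothetical):

```python
import os


def database_size_mb(path):
    """Return the file size in MB (1 MB = 1024 * 1024 bytes), or 0.0 if it is missing."""
    try:
        return os.path.getsize(path) / (1024 * 1024)
    except OSError:
        # File does not exist yet or cannot be read.
        return 0.0
```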

🔄 Updating Your Space

Update Code

# Make changes locally
# Edit files

# Commit and push
git add .
git commit -m "Update: your changes"
git push

# Space rebuilds automatically!

Update Dependencies

# Edit requirements.txt
# Add/update packages

# Commit and push
git add requirements.txt
git commit -m "Update dependencies"
git push

🆘 Troubleshooting

Space won't start

Check logs: look for errors in the Space logs.

Common issues:

- Missing dependencies in requirements.txt
- Import errors in code
- Database permission issues

Solution:

# Check requirements.txt has all deps
pip install -r requirements.txt  # Test locally first

# Check imports work
python app.py  # Test locally
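A quick way to spot missing dependencies before pushing is to ask Python whether each module can be found. This sketch uses only `importlib`; the helper name is an assumption, and note that import names can differ from the package names listed in requirements.txt:

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the module names that cannot be located in the current environment."""
    missing = []
    for name in module_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Raised when a parent package of a dotted name is missing.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

For example, `find_missing_modules(["gradio", "duckdb"])` should return an empty list in an environment where both are installed.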

Database not persisting

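Issue: Data disappears after restart

Solution: Store the database under the /data/ directory. /data exists only when persistent storage is enabled in the Space settings; without it, the Space's disk is ephemeral and is reset on every restart.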

DB_PATH = '/data/madrid.duckdb'  # ✅ Correct
DB_PATH = 'madrid.duckdb'        # ❌ Wrong (ephemeral!)
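Rather than hard-coding the path, the app could pick it at startup depending on whether a writable `/data` directory exists. This is a sketch under that assumption; the function name `resolve_db_path` is hypothetical:

```python
import os


def resolve_db_path(data_dir="/data", filename="madrid.duckdb"):
    """Return (path, persistent) where `persistent` says whether `data_dir` is usable."""
    if os.path.isdir(data_dir) and os.access(data_dir, os.W_OK):
        return os.path.join(data_dir, filename), True
    # Fall back to the working directory; data stored here will not
    # survive a restart of the Space.
    return filename, False
```

Logging the `persistent` flag at startup makes it obvious from the Space logs whether data will survive a restart.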

Scheduler not running

Issue: No automatic fetches

Check: Background scheduler started

scheduler.start()  # Make sure this is called!

Out of memory

Issue: Space crashes with memory error

Solution:

- Reduce fetch batch size
- Add pagination to queries
- Upgrade to better hardware (paid)
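Reducing the batch size usually means processing items in fixed-size chunks instead of loading everything at once. A minimal sketch of that idea, with a hypothetical helper name:

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most `batch_size` items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Each batch can be analysed and written to the database before the next one is read, which keeps peak memory roughly proportional to `batch_size`.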

Import errors

Issue: Can't import Aclarador

Solution:

# Check your analyzer structure
analyzers/
  aclarador/
    __init__.py  # Make sure this exists!
    your_code.py
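A short script can scan the package tree for folders that contain Python files but no `__init__.py`. This is an illustrative sketch (the function name is an assumption); recent Python versions can import some folders without `__init__.py` as namespace packages, but an explicit file avoids ambiguity:

```python
from pathlib import Path


def packages_missing_init(root):
    """Return directories under `root` (relative paths) that hold .py files but no __init__.py."""
    root = Path(root)
    candidates = [root] + sorted(p for p in root.rglob("*") if p.is_dir())
    missing = []
    for directory in candidates:
        if directory.name == "__pycache__":
            continue
        has_python_files = any(
            child.suffix == ".py" for child in directory.iterdir() if child.is_file()
        )
        if has_python_files and not (directory / "__init__.py").is_file():
            missing.append(directory.relative_to(root).as_posix())
    return missing
```

Calling `packages_missing_init("analyzers")` from the project root lists any sub-folder that still needs an `__init__.py`.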

💡 Pro Tips

Tip 1: Test Locally First

# Before pushing, test locally
python app.py

# Visit http://localhost:7860
# Make sure everything works!

Tip 2: Use .gitignore

Create .gitignore:

__pycache__/
*.pyc
.env
*.duckdb
.DS_Store

Tip 3: Add Status Badge

Add to your Space README.md:

![Space Status](https://huggingface.co/spaces/YOUR_USERNAME/madrid-content-analyzer/badge.svg)

Tip 4: Monitor Resource Usage

HF Spaces shows CPU/Memory usage in Space settings

Tip 5: Version Your Data

Before major changes:

# Export data
export_data('csv')

# Make changes

# Can restore if needed
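The guide does not define `export_data`. As a sketch of what a CSV export might look like, the function below writes a list of row dictionaries with the standard `csv` module; the function name and the row shape are assumptions:

```python
import csv


def export_rows_to_csv(rows, path):
    """Write dict rows to `path` as CSV with a header; return the number of rows written."""
    rows = list(rows)
    if not rows:
        # Nothing to export, so do not create an empty file.
        return 0
    fieldnames = list(rows[0].keys())
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Download the exported file from the Space before making changes, since a file written to an ephemeral disk is lost on restart.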

📈 Scaling Up (If Needed)

If You Outgrow Free Tier

Paid Hardware Options:

- CPU Upgrade: $0.03/hour (~$22/month)
- Basic GPU: $0.60/hour (~$432/month)
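The monthly figures come from multiplying the hourly rate by 24 hours and 30 days, assuming the hardware runs continuously. A small sketch of that arithmetic (the helper name is hypothetical):

```python
def monthly_cost(hourly_rate, hours_per_day=24, days_per_month=30):
    """Estimate the monthly cost in dollars for hardware billed by the hour."""
    return round(hourly_rate * hours_per_day * days_per_month, 2)
```

With the rates quoted above, `monthly_cost(0.03)` gives 21.6 and `monthly_cost(0.60)` gives 432.0. Passing a smaller `hours_per_day` gives a lower estimate for hardware that does not run around the clock.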

But you probably won't need it!

- Free tier handles 100K+ items easily
- DuckDB is very efficient
- 16GB of RAM is plenty

✅ Deployment Checklist

Before Deployment

- Hugging Face account created
- Access token generated
- Space created
- Code copied to local clone
- Aclarador integrated
- Tested locally

During Deployment

- All files committed
- Pushed to Hugging Face
- Build successful
- App starts without errors

After Deployment

- Dashboard loads
- Manual fetch works
- Data persists
- Scheduler running
- Analysis working

Post-Launch

- Set visibility (public/private)
- Share URL
- Monitor first few fetches
- Check database size
- Verify automatic fetches

🎊 You Did It!

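You now have a completely free Madrid Content Analyzer running on Hugging Face Spaces! Note that Spaces on free hardware go to sleep after a period of inactivity and wake up on the next visit, so scheduled fetches only run while the Space is awake.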

What you saved:

- Heroku: $84-168/year
- Server costs: $0/month
- Database: $0/month (vs $5-60/month)

What you got:

- Modern Gradio interface
- Fast DuckDB analytics
- 16GB RAM on the free CPU basic tier
- A service that wakes on demand (free Spaces sleep when idle)
- Beautiful visualizations

📞 Need Help?

Hugging Face Community:

- Forums: https://discuss.huggingface.co
- Discord: https://hf.co/join/discord
- Documentation: https://huggingface.co/docs/hub/spaces

Check Your Space Logs:

- App files → Logs
- See errors in real-time

Congratulations! You're now running on Hugging Face Spaces! 🎉

Next: Share your Space URL and start analyzing Madrid's content!