madriClaro / ACLARADOR_INTEGRATION.md
Ruben
Integrate Aclarador with Groq API for clarity analysis
28aa7d9

A newer version of the Gradio SDK is available: 6.0.1

Upgrade

Aclarador Integration Complete βœ…

What Was Done

Successfully integrated Aclarador's clarity analysis system using Groq API.

Files Modified

  1. requirements.txt - Added groq dependency
  2. analyzers/analyzer_wrapper.py - Complete rewrite to use Groq API
  3. analyzers/aclarador/ - Cloned from https://github.com/menpente/aclarador-clean

How It Works

Architecture

The analyzer uses a two-mode approach:

  1. Groq API Mode (when GROQ_API_KEY is set)

    • Loads system prompt from analyzers/aclarador/system_prompt.md
    • Sends text to Groq's llama-3.3-70b-versatile model
    • Parses AI response for issues, suggestions, and improvements
    • Calculates clarity scores based on detected issues
  2. Fallback Mode (when API key not set)

    • Uses simple heuristics (sentence length, word length)
    • Provides basic scoring without AI analysis
    • Useful for testing and development

Analysis Flow

Text Input
    ↓
Load System Prompt (Spanish clarity principles)
    ↓
Send to Groq API with temperature=0.3
    ↓
Parse Response:
  - Extract corrected text
  - Identify issues (long sentences, complex vocab, passive voice)
  - Extract improvement suggestions
    ↓
Calculate Scores:
  - Overall Score (0-100)
  - Readability Score (based on sentence/word length)
  - Complexity Score (inverse of detected issues)
    ↓
Return Analysis Result

Configuration

For Local Development

Set environment variable:

export GROQ_API_KEY="your-groq-api-key-here"

For Hugging Face Spaces

  1. Go to your Space Settings
  2. Navigate to "Repository secrets"
  3. Add new secret:
    • Name: GROQ_API_KEY
    • Value: Your Groq API key
  4. Restart the Space

Getting a Groq API Key

  1. Visit https://console.groq.com
  2. Sign up for a free account
  3. Go to API Keys section
  4. Create a new API key
  5. Copy the key (keep it secure!)

Note: Groq offers a generous free tier suitable for this project.

Testing

Test Without API Key (Fallback Mode)

python3 -c "
from analyzers.analyzer_wrapper import AclaradorAnalyzer

analyzer = AclaradorAnalyzer()
result = analyzer.analyze('El Ayuntamiento de Madrid establece nuevas normativas.')
print(f'Score: {result[\"overall_score\"]:.1f}')
print(f'Suggestions: {result[\"suggestions\"]}')
"

Test With API Key

export GROQ_API_KEY="your-key-here"

python3 -c "
from analyzers.analyzer_wrapper import AclaradorAnalyzer

analyzer = AclaradorAnalyzer()
result = analyzer.analyze('El Ayuntamiento de Madrid establece nuevas normativas.')
print(f'Score: {result[\"overall_score\"]:.1f}')
print(f'Issues: {result[\"readability_metrics\"][\"issues_detected\"]}')
print(f'Suggestions: {result[\"suggestions\"]}')
"

What the Analyzer Detects

Based on Spanish clarity principles from the system prompt:

Issues Detected

  • Long sentences (>30 words)
  • Complex vocabulary (technical jargon, long words)
  • Passive voice (voz pasiva)
  • Redundancies (repetitive phrases)
  • Verbose style (excessive wordiness)

Scoring System

  • Overall Score: Weighted average of readability and complexity
  • Readability Score: Based on sentence/word length (optimal: 15-20 word sentences)
  • Complexity Score: Inversely related to number of issues detected

Output Includes

  • Clarity scores (0-100, higher = clearer)
  • Sentence statistics (count, average length, long sentences)
  • Vocabulary statistics (total words, unique words, lexical diversity)
  • Detected jargon words
  • Specific improvement suggestions
  • Corrected text (from Groq analysis)

System Prompt

The analyzer uses comprehensive Spanish clarity guidelines from system_prompt.md:

  • Sentence Structure: Max 30 words, one idea per sentence
  • Active Voice: Prefer "El departamento aprobΓ³" over "Fue aprobado por"
  • Clear Vocabulary: Avoid unnecessary jargon
  • Effective Punctuation: Use periods, commas, colons appropriately
  • Digital Adaptation: Optimize for web reading and SEO

Cost Considerations

  • Groq API: Free tier with generous limits
  • Alternative: Fallback mode requires no API key (reduced accuracy)
  • Hugging Face Spaces: Completely free hosting

Next Steps

  1. Deploy to HF Spaces:

    git add .
    git commit -m "Integrate Aclarador with Groq API"
    git push
    
  2. Add API Key to Space secrets (Settings β†’ Repository secrets)

  3. Test by triggering manual fetch in Settings tab

  4. Monitor analysis quality in Dashboard

Troubleshooting

"Groq API not available"

  • Install: pip install groq
  • Or add to requirements.txt (already done)

"GROQ_API_KEY not found"

  • Set environment variable locally
  • Or add to HF Spaces secrets

"Using fallback analysis"

  • This is normal when API key is not configured
  • Analysis still works with basic heuristics
  • Set GROQ_API_KEY for full AI-powered analysis

Rate Limits

  • Groq free tier has limits (check console.groq.com)
  • Consider spacing out fetches if hitting limits
  • Monitor logs in HF Spaces

Files Structure

analyzers/
β”œβ”€β”€ analyzer_wrapper.py          # Main integration (uses Groq)
β”œβ”€β”€ aclarador/
β”‚   β”œβ”€β”€ system_prompt.md         # Spanish clarity guidelines
β”‚   β”œβ”€β”€ agent_coordinator.py     # (Not used - simplified approach)
β”‚   β”œβ”€β”€ agents/                  # (Not used - simplified approach)
β”‚   └── app.py                   # Original Streamlit app (reference)

Success! πŸŽ‰

The integration is complete and ready to deploy. The analyzer will:

  • βœ… Use Groq API for intelligent clarity analysis
  • βœ… Fall back to heuristics when API unavailable
  • βœ… Provide Spanish-specific clarity recommendations
  • βœ… Generate actionable improvement suggestions
  • βœ… Work seamlessly with the existing Madrid Analyzer pipeline