Spaces:

rdlf
/

madriClaro

Sleeping

App Files Files Community

madriClaro / ACLARADOR_INTEGRATION.md

Ruben

Integrate Aclarador with Groq API for clarity analysis

28aa7d9 about 1 month ago

preview code

raw

history blame contribute delete

5.89 kB

A newer version of the Gradio SDK is available: 6.0.1

Upgrade

Aclarador Integration Complete ✅

What Was Done

Successfully integrated Aclarador's clarity analysis system using Groq API.

Files Modified

requirements.txt - Added groq dependency
analyzers/analyzer_wrapper.py - Complete rewrite to use Groq API
analyzers/aclarador/ - Cloned from https://github.com/menpente/aclarador-clean

How It Works

Architecture

The analyzer uses a two-mode approach:

Groq API Mode (when GROQ_API_KEY is set)
- Loads system prompt from analyzers/aclarador/system_prompt.md
- Sends text to Groq's llama-3.3-70b-versatile model
- Parses AI response for issues, suggestions, and improvements
- Calculates clarity scores based on detected issues
Fallback Mode (when API key not set)
- Uses simple heuristics (sentence length, word length)
- Provides basic scoring without AI analysis
- Useful for testing and development

Analysis Flow

Text Input
    ↓
Load System Prompt (Spanish clarity principles)
    ↓
Send to Groq API with temperature=0.3
    ↓
Parse Response:
  - Extract corrected text
  - Identify issues (long sentences, complex vocab, passive voice)
  - Extract improvement suggestions
    ↓
Calculate Scores:
  - Overall Score (0-100)
  - Readability Score (based on sentence/word length)
  - Complexity Score (inverse of detected issues)
    ↓
Return Analysis Result

Configuration

For Local Development

Set environment variable:

export GROQ_API_KEY="your-groq-api-key-here"

For Hugging Face Spaces

Go to your Space Settings
Navigate to "Repository secrets"
Add new secret:
- Name: GROQ_API_KEY
- Value: Your Groq API key
Restart the Space

Getting a Groq API Key

Visit https://console.groq.com
Sign up for a free account
Go to API Keys section
Create a new API key
Copy the key (keep it secure!)

Note: Groq offers a generous free tier suitable for this project.

Testing

Test Without API Key (Fallback Mode)

python3 -c "
from analyzers.analyzer_wrapper import AclaradorAnalyzer

analyzer = AclaradorAnalyzer()
result = analyzer.analyze('El Ayuntamiento de Madrid establece nuevas normativas.')
print(f'Score: {result[\"overall_score\"]:.1f}')
print(f'Suggestions: {result[\"suggestions\"]}')
"

Test With API Key

export GROQ_API_KEY="your-key-here"

python3 -c "
from analyzers.analyzer_wrapper import AclaradorAnalyzer

analyzer = AclaradorAnalyzer()
result = analyzer.analyze('El Ayuntamiento de Madrid establece nuevas normativas.')
print(f'Score: {result[\"overall_score\"]:.1f}')
print(f'Issues: {result[\"readability_metrics\"][\"issues_detected\"]}')
print(f'Suggestions: {result[\"suggestions\"]}')
"

What the Analyzer Detects

Based on Spanish clarity principles from the system prompt:

Issues Detected

Long sentences (>30 words)
Complex vocabulary (technical jargon, long words)
Passive voice (voz pasiva)
Redundancies (repetitive phrases)
Verbose style (excessive wordiness)

Scoring System

Overall Score: Weighted average of readability and complexity
Readability Score: Based on sentence/word length (optimal: 15-20 word sentences)
Complexity Score: Inversely related to number of issues detected

Output Includes

Clarity scores (0-100, higher = clearer)
Sentence statistics (count, average length, long sentences)
Vocabulary statistics (total words, unique words, lexical diversity)
Detected jargon words
Specific improvement suggestions
Corrected text (from Groq analysis)

System Prompt

The analyzer uses comprehensive Spanish clarity guidelines from system_prompt.md:

Sentence Structure: Max 30 words, one idea per sentence
Active Voice: Prefer "El departamento aprobó" over "Fue aprobado por"
Clear Vocabulary: Avoid unnecessary jargon
Effective Punctuation: Use periods, commas, colons appropriately
Digital Adaptation: Optimize for web reading and SEO

Cost Considerations

Groq API: Free tier with generous limits
Alternative: Fallback mode requires no API key (reduced accuracy)
Hugging Face Spaces: Completely free hosting

Next Steps

Deploy to HF Spaces:

git add .
git commit -m "Integrate Aclarador with Groq API"
git push

Add API Key to Space secrets (Settings → Repository secrets)
Test by triggering manual fetch in Settings tab
Monitor analysis quality in Dashboard

Troubleshooting

"Groq API not available"

Install: pip install groq
Or add to requirements.txt (already done)

"GROQ_API_KEY not found"

Set environment variable locally
Or add to HF Spaces secrets

"Using fallback analysis"

This is normal when API key is not configured
Analysis still works with basic heuristics
Set GROQ_API_KEY for full AI-powered analysis

Rate Limits

Groq free tier has limits (check console.groq.com)
Consider spacing out fetches if hitting limits
Monitor logs in HF Spaces

Files Structure

analyzers/
├── analyzer_wrapper.py          # Main integration (uses Groq)
├── aclarador/
│   ├── system_prompt.md         # Spanish clarity guidelines
│   ├── agent_coordinator.py     # (Not used - simplified approach)
│   ├── agents/                  # (Not used - simplified approach)
│   └── app.py                   # Original Streamlit app (reference)

Success! 🎉

The integration is complete and ready to deploy. The analyzer will:

✅ Use Groq API for intelligent clarity analysis
✅ Fall back to heuristics when API unavailable
✅ Provide Spanish-specific clarity recommendations
✅ Generate actionable improvement suggestions
✅ Work seamlessly with the existing Madrid Analyzer pipeline