---
title: Multilingual Hate Speech Detector
emoji: 🛡️
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
short_description: Hate speech detector
models:
- xlm-roberta-base
datasets:
- hate-speech
---

# 🛡️ Multilingual Hate Speech Detector

**Advanced AI system for detecting hate speech in English and Serbian text with innovative contextual analysis**

## 🔬 Key Innovations

### 1. **Contextual Analysis** 🌈
- **Word-level importance highlighting** using transformer attention weights
- Visual explanation showing which words most influenced the classification decision
- Color-coded highlighting: 🔴 Red (high influence) → 🟠 Orange → 🟡 Yellow → ⚪ Gray (low influence)

### 2. **Confidence Visualization** 📊
- Interactive Plotly charts showing model confidence across **all 8 categories**
- Real-time confidence distribution analysis
- Color-coded bars distinguishing hate speech categories from appropriate content

### 3. **Interactive Feedback System** 💬
- User rating system (1-5 stars) for continuous model improvement
- Feedback collection for enhancing accuracy
- Community-driven model refinement

## 📋 Hate Speech Categories

The system detects 8 categories:

- **Race**: Racial discrimination and slurs
- **Sexual Orientation**: Homophobic content, LGBTQ+ discrimination
- **Gender**: Sexist content, misogyny, gender-based harassment
- **Physical Appearance**: Body shaming, lookism, appearance-based harassment
- **Religion**: Religious discrimination, islamophobia, antisemitism
- **Class**: Classist content, economic discrimination
- **Disability**: Ableist content, discrimination against disabled people
- **Appropriate**: Non-hateful, normal conversation

## 🌍 Multilingual Support

- **English**: Comprehensive hate speech detection
- **Serbian**: Native Serbian language support with Cyrillic and Latin scripts
- **Cross-lingual**: XLM-RoBERTa architecture enables robust multilingual understanding

## 🔧 Technical Architecture

- **Base Model**: XLM-RoBERTa (Cross-lingual Language Model)
- **Training**: Fine-tuned on multilingual hate speech datasets
- **Attention Mechanism**: Transformer attention weights for explainable AI
- **Real-time Processing**: Optimized for instant classification
- **GPU Acceleration**: CUDA support for faster inference

## 🚀 How to Use

1. **Input Text**: Enter any text in English or Serbian
2. **Analyze**: Click "Analyze Text" for instant classification
3. **Review Results**: See category prediction with confidence score
4. **Examine Context**: Check word-level highlighting to understand the decision
5. **View Confidence**: Analyze the confidence distribution chart
6. **Provide Feedback**: Rate the analysis to help improve the model

## 🎯 Example Analyses

### Appropriate Content
```
"I really enjoyed that movie last night! Great acting and storyline."
→ ✅ Appropriate (95% confidence)
```

### Hate Speech Detection
```
"You people are all the same, always causing problems everywhere."
→ ⚠️ Race (87% confidence)
```

### Serbian Language
```
"Ovaj film je bio odličan, preporučujem svima!"
→ ✅ Appropriate (92% confidence)
```

## ⚡ Performance

- **Accuracy**: High-confidence predictions with detailed explanations
- **Speed**: Real-time processing (< 2 seconds per analysis)
- **Languages**: English and Serbian with cross-lingual capabilities
- **Explainability**: Visual attention analysis for transparent decisions

## 🛠️ Local Development

```bash
# Clone the repository
git clone
cd hate-speech-detector

# Install dependencies
pip install -r requirements.txt

# Run the application
python app.py
```

## 📝 Research & Education

This AI system is designed for:

- **Research purposes**: Understanding hate speech patterns
- **Educational use**: Learning about AI explainability
- **Content moderation**: Assisting human moderators
- **Linguistic analysis**: Cross-lingual hate speech research

## ⚠️ Important Notes

- Results should be interpreted carefully
- Human judgment should always be applied for critical decisions
- The system is designed to assist, not replace, human moderation
- Continuous improvement through user feedback

## 🤝 Contributing

We welcome feedback and contributions! Please use the interactive feedback system within the application to help improve model accuracy.

## 📄 License

MIT License - See LICENSE file for details

---

**⚡ Powered by**: Transformer Neural Networks | **🌍 Languages**: English, Serbian | **🎯 Focus**: Explainable AI
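
As a rough illustration of the per-category confidence scores this README describes, the sketch below turns eight raw model scores into a confidence distribution with a softmax and picks the top category. It is a minimal sketch, not the code in `app.py`: the function names `confidence_distribution` and `top_prediction`, the category order, and the example scores are all hypothetical.

```python
import math

# Category names as listed in this README; their order here is an assumption.
CATEGORIES = [
    "Race", "Sexual Orientation", "Gender", "Physical Appearance",
    "Religion", "Class", "Disability", "Appropriate",
]


def confidence_distribution(logits):
    """Map raw scores to {category: probability} using a softmax."""
    if len(logits) != len(CATEGORIES):
        raise ValueError(f"expected {len(CATEGORIES)} scores, got {len(logits)}")
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in logits]
    total = sum(exps)
    return {name: e / total for name, e in zip(CATEGORIES, exps)}


def top_prediction(logits):
    """Return the most likely category and its confidence."""
    dist = confidence_distribution(logits)
    label = max(dist, key=dist.get)
    return label, dist[label]


if __name__ == "__main__":
    # Made-up scores for illustration only, not real model output.
    example_logits = [0.1, -0.3, 0.2, -0.5, 0.0, -0.8, -0.4, 3.5]
    label, confidence = top_prediction(example_logits)
    print(f"{label} ({confidence:.0%} confidence)")
```

The full dictionary returned by `confidence_distribution` is the kind of data a bar chart over all 8 categories would need, while `top_prediction` corresponds to the single label and confidence shown in the example analyses.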