archich/reddit-hate-speech-detector

Reddit Hate Speech Detector (Hindi + English)

This model detects hate speech in Reddit comments written in Hindi or English.

Model Description

Base Model: XLM-RoBERTa
Languages: Hindi, English
Task: Multi-task classification (hate speech detection + type + target)
Accuracy: 82.93% (self-reported)
F1 Score: 0.8278 (self-reported)
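
The card does not state how the F1 score is averaged or on which split it was measured. As a rough illustration of how the two reported metrics are computed, here is a minimal sketch for a binary task. Macro averaging and the `HOF`/`NOT` label names are assumptions, and the toy labels are not the model's real predictions.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only.
gold = ["HOF", "NOT", "HOF", "NOT", "NOT", "HOF"]
pred = ["HOF", "NOT", "NOT", "NOT", "HOF", "HOF"]
print(accuracy(gold, pred), macro_f1(gold, pred))
```

On imbalanced data the two numbers can diverge noticeably, so it helps to know which averaging a reported F1 uses before comparing models.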

Intended Use

This model is designed for:

Content moderation on Reddit
Automated hate speech detection
Research purposes

⚠️ Important: This model should assist human moderators, not replace them.
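
One way to keep moderators in the loop is to use the model's score only to prioritize a review queue. The sketch below is a hypothetical routing rule, not part of this model. The function name, the action names, and both thresholds are assumptions that would need tuning on validation data.

```python
def route_comment(hate_probability, priority_at=0.95, review_at=0.50):
    """Map a model score to a moderation queue.

    Both thresholds are illustrative assumptions; neither comes from
    the model card. Every non-trivial score ends up with a human.
    """
    if not 0.0 <= hate_probability <= 1.0:
        raise ValueError("hate_probability must be in [0, 1]")
    if hate_probability >= priority_at:
        return "priority_human_review"
    if hate_probability >= review_at:
        return "human_review"
    return "no_action"
```

For example, `route_comment(0.6)` places the comment in the regular review queue rather than removing it automatically.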

Usage

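import torch
from transformers import XLMRobertaTokenizer

# Load the tokenizer of the base model
tokenizer = XLMRobertaTokenizer.from_pretrained('xlm-roberta-base')

# Tokenize a comment into PyTorch tensors (the max_length of 128 is an assumption)
inputs = tokenizer("Example comment", truncation=True, max_length=128, return_tensors='pt')

# Your model loading code here
# (See inference script)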
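
The card lists three outputs (hate speech detection, type, and target) but does not document the output format. The sketch below shows one way the per-task logits could be decoded, assuming the model returns one logit vector per task. The task names, the label sets (borrowed from the HASOC 2019 sub-tasks), and their index order are all assumptions; check them against the inference script before relying on them.

```python
import math

# Hypothetical label sets, borrowed from the HASOC 2019 sub-tasks.
# The index order used by the actual checkpoint is not documented.
TASK_LABELS = {
    "hate": ["NOT", "HOF"],
    "type": ["HATE", "OFFN", "PRFN"],
    "target": ["TIN", "UNT"],
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(task_logits):
    """Turn per-task logits into (label, probability) pairs."""
    result = {}
    for task, logits in task_logits.items():
        probs = softmax(logits)
        best = max(range(len(probs)), key=probs.__getitem__)
        result[task] = (TASK_LABELS[task][best], probs[best])
    # Type and target only make sense when the comment is judged hateful or offensive.
    if result["hate"][0] == "NOT":
        return {"hate": result["hate"]}
    return result
```

With a PyTorch model, each logit tensor can be converted with `.tolist()` before being passed to `decode`.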

Training Data

HASOC 2019 Hindi Dataset
HASOC 2019 English Dataset
Both datasets combined for training, with class balancing
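
The card does not say which class-balancing method was used. One common option is to weight the loss by inverse class frequency. The sketch below illustrates that option only, as an assumption rather than the author's method, on a made-up label distribution.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy distribution for illustration; not the real HASOC class counts.
toy_labels = ["NOT"] * 6 + ["HOF"] * 2
weights = inverse_frequency_weights(toy_labels)
print(weights)
```

Here the minority class `HOF` receives a weight of 2.0 and the majority class a weight below 1, so errors on rare classes cost more during training.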

Limitations

May reproduce biases present in the training data
Accurate detection often depends on context that a single comment may not provide
Cultural nuances may not be fully captured

Ethical Considerations

Use the model transparently
Allow users to appeal moderation decisions
Monitor regularly for fairness
Consider cultural context
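
As one possible starting point for fairness monitoring, the false positive rate can be compared across groups, for example by comment language. The sketch below is a hypothetical audit helper. The record format, the group codes, and the sample data are assumptions made up for illustration.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """records: iterable of (group, gold_is_hate, predicted_is_hate) tuples."""
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for group, gold_is_hate, predicted_is_hate in records:
        if not gold_is_hate:
            negatives[group] += 1
            if predicted_is_hate:
                false_pos[group] += 1
    # Groups with no gold-negative examples are omitted from the result.
    return {group: false_pos[group] / n for group, n in negatives.items()}


# Toy audit sample; the groups and outcomes are made up for illustration.
audit = [
    ("hi", False, True), ("hi", False, False), ("hi", True, True),
    ("en", False, False), ("en", False, False), ("en", True, False),
]
rates = false_positive_rate_by_group(audit)
print(rates)
```

A large gap between groups suggests that benign comments in one language are flagged more often, which is worth investigating before acting on the model's output.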
