archich committed on
Commit 172c327 · verified · 1 Parent(s): 0335191

Upload folder using huggingface_hub

Files changed (4)
  1. README.md +80 -0
  2. config.json +23 -0
  3. label_encoders.pkl +3 -0
  4. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ language:
+ - en
+ - hi
+ license: apache-2.0
+ tags:
+ - hate-speech-detection
+ - reddit
+ - xlm-roberta
+ - hindi
+ - english
+ datasets:
+ - HASOC2019
+ metrics:
+ - accuracy
+ - f1
+ model-index:
+ - name: reddit-hate-speech-detector
+   results:
+   - task:
+       type: text-classification
+     metrics:
+     - type: accuracy
+       value: 0.8293
+     - type: f1
+       value: 0.8278
+ ---
+
+ # Reddit Hate Speech Detector (Hindi + English)
+
+ This model detects hate speech in Reddit comments written in Hindi or English.
+
+ ## Model Description
+
+ - **Base Model:** XLM-RoBERTa (`xlm-roberta-base`)
+ - **Languages:** Hindi, English
+ - **Task:** Multi-task classification (hate speech detection + type + target)
+ - **Accuracy:** 82.93%
+ - **F1 Score:** 0.8278
+
+ ## Intended Use
+
+ This model is designed for:
+ - Content moderation on Reddit
+ - Automated hate speech detection
+ - Research purposes
+
+ ⚠️ **Important:** This model should assist human moderators, not replace them.
+
+ ## Usage
+
+ ```python
+ import torch
+ from transformers import XLMRobertaTokenizer
+
+ # Load tokenizer
+ tokenizer = XLMRobertaTokenizer.from_pretrained('xlm-roberta-base')
+
+ # Model loading needs the custom multi-task wrapper class
+ # (see the inference script, or the hedged sketch below)
+ ```
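+
+ The block above stops at the tokenizer because `pytorch_model.bin` is a plain PyTorch state dict and the multi-task wrapper class ships with the author's inference script rather than with this upload. The sketch below is a **hypothetical** reconstruction based only on the fields in `config.json` (three heads with 2 / 4 / 3 classes and dropout 0.2 on top of `xlm-roberta-base`); the head names and pooling are assumptions, so compare the missing/unexpected keys reported by `load_state_dict` against the real inference script before trusting its predictions.
+
+ ```python
+ import json
+
+ import torch
+ import torch.nn as nn
+ from transformers import XLMRobertaModel, XLMRobertaTokenizer
+
+
+ class MultiTaskHateSpeechModel(nn.Module):
+     """Hypothetical three-head classifier matching the fields in config.json."""
+
+     def __init__(self, base_model="xlm-roberta-base", dropout=0.2,
+                  num_task1=2, num_task2=4, num_task3=3):
+         super().__init__()
+         self.encoder = XLMRobertaModel.from_pretrained(base_model)
+         hidden = self.encoder.config.hidden_size
+         self.dropout = nn.Dropout(dropout)
+         self.task1_head = nn.Linear(hidden, num_task1)  # HOF / NOT
+         self.task2_head = nn.Linear(hidden, num_task2)  # HATE / NONE / OFFN / PRFN
+         self.task3_head = nn.Linear(hidden, num_task3)  # NONE / TIN / UNT
+
+     def forward(self, input_ids, attention_mask):
+         out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
+         pooled = self.dropout(out.last_hidden_state[:, 0])  # <s> (CLS) position
+         return (self.task1_head(pooled),
+                 self.task2_head(pooled),
+                 self.task3_head(pooled))
+
+
+ with open("config.json") as f:
+     cfg = json.load(f)
+
+ tokenizer = XLMRobertaTokenizer.from_pretrained(cfg["base_model"])
+ model = MultiTaskHateSpeechModel(
+     base_model=cfg["base_model"],
+     dropout=cfg["dropout"],
+     num_task1=cfg["num_task1_classes"],
+     num_task2=cfg["num_task2_classes"],
+     num_task3=cfg["num_task3_classes"],
+ )
+
+ # The checkpoint stores raw parameters; their names may differ from this
+ # sketch, hence strict=False. Inspect these lists before relying on outputs.
+ state = torch.load("pytorch_model.bin", map_location="cpu")
+ missing, unexpected = model.load_state_dict(state, strict=False)
+ model.eval()
+
+ # Classify one comment and map each head's argmax back to its label string.
+ enc = tokenizer("example reddit comment", return_tensors="pt",
+                 truncation=True, max_length=128)
+ with torch.no_grad():
+     logits1, logits2, logits3 = model(enc["input_ids"], enc["attention_mask"])
+
+ print(cfg["task_1_labels"][logits1.argmax(-1).item()],
+       cfg["task_2_labels"][logits2.argmax(-1).item()],
+       cfg["task_3_labels"][logits3.argmax(-1).item()])
+ ```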
+
+ ## Training Data
+
+ - HASOC 2019 Hindi Dataset
+ - HASOC 2019 English Dataset
+ - Combined training with class balancing (see the sketch below)
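+
+ The exact balancing scheme is not part of this upload; purely as an illustration (an assumed scheme, not the author's code), inverse-frequency class weights over the combined Hindi + English labels could be computed like this:
+
+ ```python
+ from collections import Counter
+
+ import torch
+
+
+ def class_weights(labels, classes):
+     """Inverse-frequency weights for torch.nn.CrossEntropyLoss (assumed scheme)."""
+     counts = Counter(labels)
+     return torch.tensor(
+         [len(labels) / (len(classes) * counts[c]) for c in classes],
+         dtype=torch.float,
+     )
+
+
+ # Hypothetical slice of the combined HASOC 2019 Hindi + English task-1 labels.
+ combined_task1 = ["HOF", "NOT", "NOT", "HOF", "NOT", "NOT"]
+ weights = class_weights(combined_task1, ["HOF", "NOT"])
+ loss_task1 = torch.nn.CrossEntropyLoss(weight=weights)
+ ```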
+
+ ## Limitations
+
+ - May reproduce biases present in the training data
+ - Requires surrounding context for accurate detection
+ - Cultural nuances may not be fully captured
+
+ ## Ethical Considerations
+
+ - Use the model transparently
+ - Allow users to appeal automated decisions
+ - Monitor outputs regularly for fairness
+ - Consider cultural context
config.json ADDED
@@ -0,0 +1,23 @@
+ {
+   "model_type": "xlm-roberta",
+   "num_task1_classes": 2,
+   "num_task2_classes": 4,
+   "num_task3_classes": 3,
+   "dropout": 0.2,
+   "base_model": "xlm-roberta-base",
+   "task_1_labels": [
+     "HOF",
+     "NOT"
+   ],
+   "task_2_labels": [
+     "HATE",
+     "NONE",
+     "OFFN",
+     "PRFN"
+   ],
+   "task_3_labels": [
+     "NONE",
+     "TIN",
+     "UNT"
+   ]
+ }
label_encoders.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7f7e5d3a775c7a57ce504c49d73c79d6398947bbf674fe79974fe6e661ad2190
+ size 398
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cc8d188d80793c4ef856d172c34f63ea79aff1f5a57abb2538b4e0a9b60932b0
+ size 1115855655