amba370447
/

vn-comment-filter-kids

+---
+language: vi
+license: mit
+base_model: NlpHUST/electra-base-vn
+tags:
+- vietnamese
+- content-moderation
+- text-classification
+- kids-safety
+datasets:
+- ViHSD
+metrics:
+- accuracy
+- f1
+pipeline_tag: text-classification
+---
+# 🛡️ Vietnamese Comment Filter for Kids Platform
+A Vietnamese comment-filtering model designed specifically for a news-reading platform for children.
+## 📋 Model Information
+- **Architecture**: ELECTRA-base-vn + BiGRU + Attention
+- **Task**: Binary Classification (CLEAN vs TOXIC)
+- **Hybrid System**: ELECTRA + Logistic Regression specialists
+- **Dataset**: ViHSD (Vietnamese Hate Speech Detection)
+## 🎯 Purpose
+This model was developed to:
+- Automatically moderate comments on a news-reading platform for children
+- Detect inappropriate content (toxic, hate speech, NSFW)
+- Protect children from harmful content
+## 🏗️ Three-Stage Architecture
+### Stage 1: NSFW Rule-based Filter
+Quickly detects obscene and sexually explicit terms using rule-based matching.
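As a rough illustration of what rule-based matching can look like, here is a minimal sketch. The blocklist terms, the NFC/lowercase normalization, and the helper names `normalize`, `build_matcher`, and `is_nsfw` are all assumptions made for this example; the actual rules used by this repository are not documented in this README.

```python
import re
import unicodedata


def normalize(text):
    # Assumption: NFC normalization plus lowercasing, so that composed and
    # decomposed Vietnamese diacritics compare equal.
    return unicodedata.normalize("NFC", text).lower()


def build_matcher(terms):
    # Compile one regex that matches any blocklisted term as a whole word.
    if not terms:
        raise ValueError("blocklist must not be empty")
    escaped = sorted((re.escape(normalize(t)) for t in terms), key=len, reverse=True)
    return re.compile(r"(?<!\w)(?:" + "|".join(escaped) + r")(?!\w)")


def is_nsfw(text, matcher):
    # True if any blocklisted term occurs in the normalized text.
    return matcher.search(normalize(text)) is not None


# Hypothetical placeholder terms, not the repository's real rules.
matcher = build_matcher({"badword", "very bad"})
print(is_nsfw("This is a BADWORD!", matcher))
print(is_nsfw("Sản phẩm tốt", matcher))
```

Whole-word matching avoids flagging harmless words that merely contain a blocked substring, at the cost of missing obfuscated spellings.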
+### Stage 2: ELECTRA Binary Gatekeeper
+Classifies each comment as CLEAN or TOXIC using the deep learning model.
+### Stage 3: Twin-Logistic Specialist
+- **Navigator**: Distinguishes OFFENSIVE from HATE
+- **Power Gauge**: Computes a more precise toxicity probability
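One way the three stages could be chained is sketched below. The routing is an assumption, since this README does not specify how the stages hand off: here Stage 3 runs only on comments the gatekeeper marks TOXIC, and the Power Gauge probability is compared against a threshold. `Verdict`, `moderate`, and the four callables are hypothetical names, and the stubs stand in for the real models.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    label: str                          # "CLEAN", "NSFW", "OFFENSIVE", or "HATE"
    stage: int                          # stage that produced the decision
    toxic_prob: Optional[float] = None  # set only when Stage 3 ran


def moderate(
    text: str,
    is_nsfw: Callable[[str], bool],
    gatekeeper: Callable[[str], str],
    navigator: Callable[[str], str],
    power_gauge: Callable[[str], float],
    threshold: float = 0.60,
) -> Verdict:
    # Stage 1: cheap rule-based check short-circuits the pipeline.
    if is_nsfw(text):
        return Verdict("NSFW", stage=1)
    # Stage 2: binary gatekeeper; CLEAN comments stop here (assumed routing).
    if gatekeeper(text) == "CLEAN":
        return Verdict("CLEAN", stage=2)
    # Stage 3: specialists refine the probability and the toxic subtype.
    prob = power_gauge(text)
    if prob < threshold:
        return Verdict("CLEAN", stage=3, toxic_prob=prob)
    return Verdict(navigator(text), stage=3, toxic_prob=prob)


# Stub models for demonstration only.
verdict = moderate(
    "example comment",
    is_nsfw=lambda t: False,
    gatekeeper=lambda t: "TOXIC",
    navigator=lambda t: "HATE",
    power_gauge=lambda t: 0.9,
)
print(verdict)
```

Passing the stages in as callables keeps the routing logic testable without loading any model weights.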
+## 📊 Performance
+- **Accuracy**: 87.4%
+- **F1-Score (Macro)**: 0.846
+- **Inference Time**: ~15-30ms/comment
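For readers unfamiliar with the macro-averaged F1 reported above, the sketch below shows how it is computed: F1 is calculated per class and then averaged with equal weight, so a weak minority class pulls the score down even when accuracy looks high. The labels and predictions are toy data invented for this example, not results from this model.

```python
def macro_f1(y_true, y_pred, labels):
    # Per-class F1 = 2*TP / (2*TP + FP + FN), averaged with equal class weight.
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data for illustration only.
y_true = ["CLEAN", "CLEAN", "CLEAN", "TOXIC"]
y_pred = ["CLEAN", "CLEAN", "TOXIC", "TOXIC"]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred, ["CLEAN", "TOXIC"]))
```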
+## 💡 Usage
+### Option 1: Use via the API (simplest, recommended)
+```bash
+# Start the FastAPI server
+python api_server.py
+# Test with curl
+curl -X POST "http://localhost:8000/filter" \
+  -H "Content-Type: application/json" \
+  -d '{"text": "Sản phẩm tốt", "strict_mode": true}'
+```
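The same request can be sent from Python using only the standard library. This is a minimal sketch: the endpoint path and JSON fields mirror the curl call above, while the helper names `build_filter_request` and `filter_comment`, the timeout, and the assumption that the server replies with a JSON body are illustrative. The response schema is not documented here, so the result is returned as a plain dict.

```python
import json
import urllib.request


def build_filter_request(text, strict_mode=True, base_url="http://localhost:8000"):
    # Build the POST request with the same JSON body as the curl example.
    payload = json.dumps({"text": text, "strict_mode": strict_mode}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/filter",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def filter_comment(text, strict_mode=True, base_url="http://localhost:8000", timeout=5.0):
    # Send the request and decode the reply, assuming the server returns JSON.
    request = build_filter_request(text, strict_mode, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires api_server.py to be running locally.
    print(filter_comment("Sản phẩm tốt", strict_mode=True))
```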
+### Option 2: Load directly in Python
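+```python
+import torch
+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+# Load the model and tokenizer. Plain AutoModel returns only the encoder and
+# has no `logits` output, so a classification class is used here.
+# Assumption: the checkpoint loads with AutoModelForSequenceClassification;
+# the custom BiGRU + Attention head may instead need this repository's own model class.
+repo_id = "your-username/vn-comment-filter-kids"
+model = AutoModelForSequenceClassification.from_pretrained(repo_id)
+tokenizer = AutoTokenizer.from_pretrained(repo_id)
+model.eval()
+
+# Inference
+text = "Comment cần kiểm tra"  # the comment to check
+inputs = tokenizer(text, return_tensors="pt", max_length=256, padding=True, truncation=True)
+with torch.no_grad():
+    outputs = model(**inputs)
+    prediction = torch.argmax(outputs.logits, dim=-1).item()
+print("CLEAN" if prediction == 0 else "TOXIC")
+```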
+## ⚙️ Configuration
+The model supports two modes:
+- **Normal Mode**: Threshold = 0.60 (balanced)
+- **Strict Mode**: Threshold = 0.45 (safer for kids, more false positives)
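A minimal sketch of how the two thresholds could turn a toxicity probability into a label. It assumes a comment is flagged when its probability is greater than or equal to the threshold; the exact comparison used by the repository is not stated here, and `THRESHOLDS` and `label_for` are hypothetical names.

```python
# Threshold values taken from the two modes listed above.
THRESHOLDS = {"normal": 0.60, "strict": 0.45}


def label_for(toxic_prob, strict_mode=False):
    # Assumption: flag as TOXIC when the probability reaches the threshold.
    threshold = THRESHOLDS["strict" if strict_mode else "normal"]
    return "TOXIC" if toxic_prob >= threshold else "CLEAN"


# A borderline comment passes in normal mode but is blocked in strict mode.
print(label_for(0.50), label_for(0.50, strict_mode=True))
```

Lowering the threshold catches more borderline comments, which is why strict mode trades extra false positives for safety.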
+## 📦 Files in the Repository
+```
+.
+├── model.safetensors              # Model weights
+├── config.json                    # Model config
+├── tokenizer files                # Tokenizer artifacts
+├── preprocessing_config.json      # Text preprocessing rules
+├── specialist_navigator.pkl       # Logistic regression model for OFFENSIVE vs HATE
+├── specialist_power.pkl           # Logistic regression model for toxicity probability
+├── requirements.txt               # Python dependencies
+└── README.md                      # This documentation
+```
+## ⚠️ Limitations
+- The model was trained on the ViHSD dataset (social media comments)
+- False positives may occur in unusual contexts
+- Not suitable for domains other than content moderation
+## 📄 License
+MIT License - Free for educational and commercial use
+## 🙏 Credits
+- Base model: [NlpHUST/electra-base-vn](https://huggingface.co/NlpHUST/electra-base-vn)
+- Dataset: ViHSD (Vietnamese Hate Speech Detection)
+- Framework: Hugging Face Transformers, PyTorch
+## 📧 Contact
+For questions or issues, please open an issue on the repository.