🛡️ Vietnamese Comment Filter for Kids Platform

A Vietnamese comment-filtering model designed specifically for a news-reading platform for children.

📋 Model Information

Architecture: ELECTRA-base-vn + BiGRU + Attention
Task: Binary Classification (CLEAN vs TOXIC)
Hybrid System: ELECTRA + Logistic Regression specialists
Dataset: ViHSD (Vietnamese Hate Speech Detection)

🎯 Purpose

This model was developed to:

Automatically moderate comments on a news-reading platform for children
Detect inappropriate content (toxic, hate speech, NSFW)
Protect children from harmful content

🏗️ 3-Stage Architecture

Stage 1: NSFW Rule-based Filter

Quickly detects obscene and sexually explicit language using rule-based matching.
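
A minimal sketch of how a rule-based stage like this could work, assuming a blocklist of terms matched on word boundaries after Unicode normalization. The `BLOCKLIST` entries and the `normalize` and `is_nsfw` helpers are hypothetical placeholders; the repository's actual rules are not shown in this README.

```python
import re
import unicodedata

# Hypothetical placeholder terms; the real blocklist is not part of this README.
BLOCKLIST = {"badword", "từ cấm"}


def normalize(text: str) -> str:
    """Apply NFC and lowercase so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text).lower()


# One compiled alternation; the lookarounds prevent matches inside longer words.
_PATTERN = re.compile(
    r"(?<!\w)(?:"
    + "|".join(re.escape(normalize(term)) for term in sorted(BLOCKLIST))
    + r")(?!\w)"
)


def is_nsfw(text: str) -> bool:
    """Return True if any blocklisted term appears as a whole word or phrase."""
    return _PATTERN.search(normalize(text)) is not None
```

Normalizing to NFC matters for Vietnamese because the same accented letter can arrive either precomposed or as a base letter plus combining marks, and a plain string comparison would treat the two as different.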

Stage 2: ELECTRA Binary Gatekeeper

Classifies comments as CLEAN or TOXIC using a deep learning model.

Stage 3: Twin-Logistic Specialist

Navigator: Distinguishes OFFENSIVE from HATE
Power Gauge: Computes a precise toxicity probability
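
The three stages can be read as a cascade in which each stage either decides or passes the comment on. The sketch below is one plausible way to chain them, not the repository's implementation: the `Verdict` type, the `moderate` function, the `"NSFW"` label, and the choice to run the specialists only on comments the gatekeeper marks TOXIC are all assumptions. Each stage is passed in as a plain callable so the control flow can be read without the models.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    label: str                          # "CLEAN", "NSFW", "OFFENSIVE", or "HATE"
    stage: int                          # stage that produced the decision
    toxic_prob: Optional[float] = None  # only set when Stage 3 runs


def moderate(
    text: str,
    nsfw_rule: Callable[[str], bool],      # Stage 1: rule-based filter
    gatekeeper: Callable[[str], str],      # Stage 2: returns "CLEAN" or "TOXIC"
    navigator: Callable[[str], str],       # Stage 3: returns "OFFENSIVE" or "HATE"
    power_gauge: Callable[[str], float],   # Stage 3: toxicity probability
) -> Verdict:
    # Stage 1: cheap rule check first, so obvious cases skip the neural model.
    if nsfw_rule(text):
        return Verdict("NSFW", stage=1)
    # Stage 2: the binary gatekeeper lets clean comments through.
    if gatekeeper(text) == "CLEAN":
        return Verdict("CLEAN", stage=2)
    # Stage 3: the specialists refine the label and score of toxic comments.
    return Verdict(navigator(text), stage=3, toxic_prob=power_gauge(text))
```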

📊 Performance

Accuracy: 87.4%
F1-Score (Macro): 0.846
Inference Time: ~15-30ms/comment
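
Macro F1 is reported alongside accuracy because it averages the per-class F1 scores with equal weight, so a model cannot score well by favoring the majority class. The sketch below shows the computation on toy labels; the `macro_f1` helper and the toy data are illustrative only and are not taken from this model's evaluation.

```python
def macro_f1(y_true, y_pred, labels=("CLEAN", "TOXIC")):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: a classifier that always answers CLEAN on imbalanced data.
y_true = ["CLEAN"] * 8 + ["TOXIC"] * 2
y_pred = ["CLEAN"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred))
```

In this toy case accuracy is 0.8 while macro F1 is about 0.44, because the TOXIC class contributes an F1 of zero.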

💡 Usage

Option 1: Use the API (simplest, recommended)

# Start the FastAPI server
python api_server.py

# Test with curl
curl -X POST "http://localhost:8000/filter" \
  -H "Content-Type: application/json" \
  -d '{"text": "Sản phẩm tốt", "strict_mode": true}'
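
The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the server is running at the URL from the curl example and accepts the same `text` and `strict_mode` fields. The `build_request` and `filter_comment` helpers are hypothetical names, and the response schema is not documented here, so the result is returned as a plain dict.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/filter"  # endpoint taken from the curl example


def build_request(text: str, strict_mode: bool = True) -> urllib.request.Request:
    """Build the POST request with the same JSON body as the curl example."""
    payload = json.dumps({"text": text, "strict_mode": strict_mode}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def filter_comment(text: str, strict_mode: bool = True, timeout: float = 5.0) -> dict:
    """Send the comment to the server and return the parsed JSON response."""
    with urllib.request.urlopen(build_request(text, strict_mode), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `filter_comment("Sản phẩm tốt")` returns the parsed response; inspect it before relying on specific keys.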

Option 2: Load directly in Python

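from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and model. AutoModelForSequenceClassification is used because
# the output of a bare AutoModel has no `logits` attribute. This assumes the
# checkpoint exposes a standard sequence-classification head; if the BiGRU +
# Attention head is a custom class, load it with the repository's own model code.
model_id = "your-username/vn-comment-filter-kids"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # disable dropout for inference

# Inference
text = "Comment cần kiểm tra"  # the comment to check
inputs = tokenizer(text, return_tensors="pt", max_length=256, padding=True, truncation=True)

with torch.no_grad():
    outputs = model(**inputs)
    prediction = torch.argmax(outputs.logits, dim=-1).item()

# Label 0 = CLEAN, any other label = TOXIC
print("CLEAN" if prediction == 0 else "TOXIC")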

⚙️ Configuration

The model supports two modes:

Normal Mode: Threshold = 0.60 (balanced)
Strict Mode: Threshold = 0.45 (safer for kids, more false positives)
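
A minimal sketch of how the two thresholds could be applied, assuming they are compared against the toxicity probability and that a score equal to the threshold counts as TOXIC. The `THRESHOLDS` table and `decide` function are hypothetical names; only the two threshold values come from this README.

```python
THRESHOLDS = {"normal": 0.60, "strict": 0.45}  # values from the modes listed above


def decide(toxic_prob: float, strict_mode: bool = False) -> str:
    """Map a toxicity probability to a label under the selected mode."""
    threshold = THRESHOLDS["strict" if strict_mode else "normal"]
    # Assumption: a probability at or above the threshold is flagged.
    return "TOXIC" if toxic_prob >= threshold else "CLEAN"
```

A comment scored at 0.50 passes in normal mode but is blocked in strict mode, which is the trade-off the lower threshold makes: fewer harmful comments slip through, at the cost of more false positives.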

📦 Files in the Repository

.
├── model.safetensors              # Model weights
├── config.json                    # Model config
├── tokenizer files                # Tokenizer artifacts
├── preprocessing_config.json      # Text preprocessing rules
├── specialist_navigator.pkl       # Logistic model for OFFENSIVE vs HATE
├── specialist_power.pkl           # Logistic model for toxicity probability
├── requirements.txt               # Python dependencies
└── README.md                      # This documentation

⚠️ Limitations

The model is trained on the ViHSD dataset (social media comments)
False positives are possible in unusual contexts
Not suitable for domains other than content moderation

📄 License

MIT License - Free for educational and commercial use

🙏 Credits

Base model: NlpHUST/electra-base-vn
Dataset: ViHSD (Vietnamese Hate Speech Detection)
Framework: Hugging Face Transformers, PyTorch

📧 Contact

For questions or issues, please open an issue on the repository.

Model size: 0.1B params (Safetensors)
Tensor type: F32
Model tree for amba370447/vn-comment-filter-kids: fine-tuned from base model NlpHUST/electra-base-vn