---
language: en
datasets:
- jigsaw-toxic-comment-classification-challenge
tags:
- text-classification
- multi-label-classification
- toxicity-detection
- bert
- transformers
- pytorch
license: apache-2.0
model-index:
- name: BERT Multi-label Toxic Comment Classifier
  results:
  - task:
      name: Multi-label Text Classification
      type: multi-label-classification
    dataset:
      name: Jigsaw Toxic Comment Classification Challenge
      type: jigsaw-toxic-comment-classification-challenge
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9187
---

# BERT Multi-label Toxic Comment Classifier

This model is a fine-tuned [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) transformer for **multi-label classification** on the [Jigsaw Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge) dataset. It predicts seven toxicity-related labels per comment:

- toxicity
- severe toxicity
- obscene
- threat
- insult
- identity attack
- sexually explicit

## Model Details

- **Base Model**: `bert-base-uncased`
- **Task**: Multi-label text classification
- **Dataset**: Jigsaw Toxic Comment Classification Challenge (processed version)
- **Labels**: 7 toxicity-related categories
- **Training Epochs**: 2
- **Batch Size**: 16 (train), 64 (eval)
- **Metrics**: Accuracy, Macro F1, Precision, Recall

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("Koushim/bert-multilabel-jigsaw-toxic-classifier")
model = AutoModelForSequenceClassification.from_pretrained("Koushim/bert-multilabel-jigsaw-toxic-classifier")

text = "You are a wonderful person!"
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128)

# Multi-label classification: each logit is scored independently,
# so apply a sigmoid (not a softmax) to get per-label probabilities.
with torch.no_grad():
    outputs = model(**inputs)
probs = torch.sigmoid(outputs.logits)
print(probs)
```

To convert these probabilities into discrete labels, see the thresholding sketch at the end of this card.

## Labels

| Index | Label           |
| ----- | --------------- |
| 0     | toxicity        |
| 1     | severe_toxicity |
| 2     | obscene         |
| 3     | threat          |
| 4     | insult          |
| 5     | identity_attack |
| 6     | sexual_explicit |

## Training Details

* Training Set: full dataset (160k+ samples)
* Loss Function: binary cross-entropy (via `BertForSequenceClassification` with `problem_type="multi_label_classification"`)
* Optimizer: AdamW
* Learning Rate: 2e-5
* Evaluation Strategy: epoch-based evaluation with early stopping on F1 score
* Framework: PyTorch with Hugging Face Transformers

## Repository Contents

* `pytorch_model.bin` - trained model weights
* `config.json` - model configuration
* `tokenizer.json`, `vocab.txt` - tokenizer files
* `README.md` - this file

## How to Fine-tune or Train

You can fine-tune this model with the Hugging Face `Trainer` API on your own dataset or on the original Jigsaw dataset; a minimal training sketch appears at the end of this card.

## Citation

If you use this model in your research or project, please cite:

```
@article{devlin2019bert,
  title={BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding},
  author={Devlin, Jacob and Chang, Ming-Wei and Lee, Kenton and Toutanova, Kristina},
  journal={arXiv preprint arXiv:1810.04805},
  year={2019}
}
```

## License

Apache 2.0
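## Example: Thresholding Predictions

The Usage snippet above prints raw per-label probabilities. Below is a minimal sketch of turning them into discrete labels, using the index order from the Labels table; `predict_labels` and the 0.5 cutoff are illustrative choices rather than part of the released model (in practice you would tune a per-label threshold on a validation split):

```python
import torch

# Label names in the index order given in the Labels table above.
LABELS = [
    "toxicity", "severe_toxicity", "obscene", "threat",
    "insult", "identity_attack", "sexual_explicit",
]

def predict_labels(probs: torch.Tensor, threshold: float = 0.5) -> list[str]:
    """Return every label whose probability exceeds the threshold.

    The 0.5 cutoff is an assumption; tune it per label on held-out data.
    """
    return [
        LABELS[i]
        for i, p in enumerate(probs.squeeze(0).tolist())
        if p > threshold
    ]

# Reusing `probs` from the Usage snippet:
# predict_labels(probs)  # e.g. [] for a non-toxic comment
```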
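## Example: Fine-tuning Sketch

The training recipe stated above (binary cross-entropy via `problem_type="multi_label_classification"`, AdamW at 2e-5, 2 epochs, batch sizes 16/64, epoch-level evaluation with early stopping on F1) maps directly onto the `Trainer` API. Here is a minimal sketch under those settings; `train_ds` and `val_ds` are placeholders for your tokenized datasets, and the early-stopping patience is an assumption:

```python
import numpy as np
from sklearn.metrics import f1_score
from transformers import (
    AutoModelForSequenceClassification,
    EarlyStoppingCallback,
    Trainer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=7,  # the seven toxicity categories listed above
    problem_type="multi_label_classification",  # switches the loss to BCEWithLogitsLoss
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    # Sigmoid + 0.5 cutoff (an assumption; tune on validation data).
    preds = 1 / (1 + np.exp(-logits)) > 0.5
    return {"f1": f1_score(labels, preds, average="macro", zero_division=0)}

args = TrainingArguments(
    output_dir="bert-jigsaw-multilabel",
    num_train_epochs=2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    learning_rate=2e-5,            # AdamW is the Trainer default optimizer
    eval_strategy="epoch",         # "evaluation_strategy" on older transformers versions
    save_strategy="epoch",
    load_best_model_at_end=True,   # required for early stopping
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,  # placeholder: tokenized examples with float label vectors of length 7
    eval_dataset=val_ds,     # placeholder: tokenized validation split
    compute_metrics=compute_metrics,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=1)],  # patience is an assumption
)
trainer.train()
```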