citizenlab/distilbert-base-multilingual-cased-toxicity


arianpasquali committed on Dec 2, 2022

Commit a999ecb · 1 Parent(s): b9f9177

Create README.md

Files changed (1)

README.md +59 -0

README.md ADDED

	@@ -0,0 +1,59 @@

+---
+pipeline_tag: "text-classification"
+widget:
+- text: "this is a lovely message"
+  example_title: "Example 1"
+  multi_class: false
+- text: "you are an idiot and you and your family should go back to your country"
+  example_title: "Example 2"
+  multi_class: false
+language:
+  - en
+  - nl
+  - fr
+  - pt
+  - it
+  - es
+  - de
+  - da
+  - pl
+  - af
+datasets:
+- jigsaw_toxicity_pred
+metrics:
+- f1
+- accuracy
+---
+# citizenlab/distilbert-base-multilingual-cased-toxicity
+This is a multilingual DistilBERT sequence classifier trained on the [Jigsaw Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge) dataset.
+## How to use it
+```python
+from transformers import pipeline
+
+model_path = "citizenlab/distilbert-base-multilingual-cased-toxicity"
+
+# Load the model and its tokenizer into a text-classification pipeline.
+toxicity_classifier = pipeline("text-classification", model=model_path, tokenizer=model_path)
+
+print(toxicity_classifier("this is a lovely message"))
+# [{'label': 'not_toxic', 'score': 0.9954179525375366}]
+
+print(toxicity_classifier("you are an idiot and you and your family should go back to your country"))
+# [{'label': 'toxic', 'score': 0.9948776960372925}]
+```
+## Evaluation
+### Accuracy and F1
+```
+  Accuracy Score = 0.9425
+F1 Score (Micro) = 0.9450549450549449
+F1 Score (Macro) = 0.8491432341169309
+```
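
The README reports accuracy, micro-averaged F1, and macro-averaged F1 without saying how they were computed or on which split. As a rough illustration of how these three numbers relate, the sketch below computes them from scratch on a small, made-up set of labels (the `y_true` and `y_pred` lists and the `classification_scores` helper are hypothetical and are not taken from the model's evaluation). It uses the conventional single-label definitions, which may differ from whatever script produced the figures in the card.

```python
from collections import Counter


def classification_scores(y_true, y_pred):
    """Return (accuracy, micro_f1, macro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    accuracy = sum(tp.values()) / len(y_true)

    # Micro average: pool the counts over all classes, then compute one F1.
    total_tp, total_fp, total_fn = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro_f1 = 2 * total_tp / (2 * total_tp + total_fp + total_fn)

    # Macro average: compute F1 per class, then take the unweighted mean.
    per_class = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    macro_f1 = sum(per_class) / len(per_class)

    return accuracy, micro_f1, macro_f1


# Hypothetical, imbalanced toy data: 8 non-toxic and 2 toxic examples,
# with one toxic example misclassified as non-toxic.
y_true = ["not_toxic"] * 8 + ["toxic"] * 2
y_pred = ["not_toxic"] * 8 + ["toxic", "not_toxic"]

accuracy, micro_f1, macro_f1 = classification_scores(y_true, y_pred)
print(f"accuracy={accuracy:.4f} micro_f1={micro_f1:.4f} macro_f1={macro_f1:.4f}")
```

On imbalanced data like this, a single error on the minority class lowers macro F1 much more than micro F1, which is one plausible reason a macro score can sit well below the other two. Under these definitions micro F1 equals accuracy for single-label predictions.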