---
license: mit
language:
- en
metrics:
- precision
- recall
- f1
- accuracy
pipeline_tag: text-classification
tags:
- classification
- security
---

# Model Card for Infinitode/SMCM-OPEN-ARC

Repository: https://github.com/Infinitode/OPEN-ARC/

## Model Description

OPEN-ARC-SMC is a Multinomial Naive Bayes (`MultinomialNB`) model developed as part of Infinitode's OPEN-ARC initiative. It classifies text, particularly emails and SMS messages, as either spam or legitimate (ham).

**Architecture**:

- **MultinomialNB**: scikit-learn's multinomial Naive Bayes classifier, used with its default parameters.
- **Framework**: scikit-learn.
- **Training Setup**: Trained with the default hyperparameters.
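
As a rough illustration of the setup above, the following minimal sketch fits a default `MultinomialNB` on a handful of toy messages. The messages are invented for this example, and `CountVectorizer` is an assumption: this card does not state which vectorizer was used.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy messages invented for illustration; not taken from the training dataset.
texts = [
    "Win a free prize now, click here",
    "Free entry: claim your cash reward",
    "Are we still meeting for lunch today?",
    "Please review the attached report before Monday",
]
labels = ["spam", "spam", "ham", "ham"]

# Assumption: bag-of-words counts. The card does not name the vectorizer.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Default hyperparameters, as stated in the card.
model = MultinomialNB()
model.fit(X, labels)

# Inspect two of the defaults: additive smoothing and learned class priors.
params = model.get_params()
print(params["alpha"], params["fit_prior"])

# Classify an unseen message with the same fitted vectorizer.
prediction = model.predict(vectorizer.transform(["claim your free prize"]))
print(prediction[0])
```

The same fitted vectorizer must be reused at prediction time so that the feature columns line up with those seen during training.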

## Uses

- Determining whether emails or SMS messages are spam or legitimate.
- Supporting research on spam and the development of defensive measures against spammers.

## Limitations

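Emails or SMS messages may be misclassified, either as false positives (legitimate messages flagged as spam) or as false negatives (spam that passes as legitimate), because of the nature of the training data and its inherent limitations.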

## Training Data

- Dataset: Spam Mail Classifier dataset from Kaggle.
- Source URL: https://www.kaggle.com/datasets/mosapabdelghany/spam-mail-classifier/
- Content: Messages categorized as either spam or ham (legitimate emails or SMS).
- Size: 1000 email/SMS messages labeled as spam or ham.
- Preprocessing: Rows with missing values were removed, and the text was converted into numeric feature vectors.
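
The preprocessing described above might look roughly like the sketch below. The column names `text` and `label`, the sample rows, and the choice of `CountVectorizer` are all assumptions made for illustration; the actual dataset schema and vectorizer may differ.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical column names and rows; the real dataset's schema may differ.
df = pd.DataFrame(
    {
        "text": ["Free prize, click now", None, "See you at the meeting", "Lunch tomorrow?"],
        "label": ["spam", "ham", "ham", None],
    }
)

# Drop rows where either the message or its label is missing.
clean = df.dropna(subset=["text", "label"]).reset_index(drop=True)

# Assumption: bag-of-words counts; TfidfVectorizer would be a drop-in alternative.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(clean["text"])

print(clean.shape)  # rows and columns remaining after dropping missing values
print(X.shape)      # (number of messages, vocabulary size)
```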

## Training Procedure

- Metrics: accuracy, precision, recall, F1
- Train/Test Split: 80% training, 20% testing.
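
An 80/20 split like the one listed above can be sketched with scikit-learn's `train_test_split`. The placeholder data, the stratification, and the `random_state` value are assumptions for this example; the card does not say whether the original split was stratified or seeded.

```python
from sklearn.model_selection import train_test_split

# Placeholder messages and labels, invented for illustration.
texts = [f"message {i}" for i in range(10)]
labels = ["spam", "ham"] * 5

# test_size=0.2 gives the 80/20 split; stratify and random_state are assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

print(len(X_train), len(X_test))
```

Stratifying keeps the spam-to-ham ratio similar in both subsets, which matters when spam is the minority class.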

## Evaluation Results

| Metric | Value |
| ------ | ----- |
| Testing Accuracy | 98.48% |
| Testing Precision (`spam`) | 96.15% |
| Testing Recall (`spam`) | 93.17% |
| Testing F1 (`spam`) | 94.64% |
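
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported `spam` F1 can be recomputed from the precision and recall in the table. The snippet below uses only the figures reported above.

```python
# Values taken from the evaluation table above (spam class).
precision = 0.9615
recall = 0.9317

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"{f1:.2%}")  # 94.64%
```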

## How to Use

```python
# Assumes `vectorizer` (the fitted text vectorizer) and `model` (the trained
# MultinomialNB classifier) are already loaded in the current session.
new_emails = [
    "Congratulations! You've won a free prize. Click the link to claim.",  # Likely spam
    "Hi, just confirming our meeting for tomorrow at 10 AM. Thanks.",  # Likely not spam
]

# Vectorize the new emails using the fitted vectorizer
new_emails_vectorized = vectorizer.transform(new_emails)

# Make predictions
predictions = model.predict(new_emails_vectorized)

# Print each email alongside its predicted label
for email, prediction in zip(new_emails, predictions):
    print(f"\nEmail: '{email}'")
    print(f"Prediction: {prediction}")
```

## Contact

For questions or issues, open a GitHub issue or reach out at https://infinitode.netlify.app/forms/contact.