	---
	license: mit
	language:
	- en
	metrics:
	- precision
	- recall
	- f1
	- accuracy
	pipeline_tag: text-classification
	tags:
	- classification
	- security
	---

	# Model Card for Infinitode/SMCM-OPEN-ARC

	Repository: https://github.com/Infinitode/OPEN-ARC/

	## Model Description

	OPEN-ARC-SMC is a multinomial naive Bayes (`MultinomialNB`) model developed as part of Infinitode's OPEN-ARC initiative. It classifies text, particularly emails, as either spam or legitimate (ham).

	Architecture:

	- MultinomialNB: Default parameters.
	- Framework: scikit-learn.
	- Training Setup: Trained with the default parameters.
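
The setup listed above can be sketched in a few lines. This is a minimal illustration, not the project's training script: the toy messages are made up for the example, and `CountVectorizer` is an assumption, since the card only says that text was converted into vectors.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy messages for illustration only; not taken from the training set.
texts = [
    "Win a free prize now, click here",
    "Free cash offer, claim your prize",
    "Are we still meeting for lunch tomorrow?",
    "Please review the attached project report",
]
labels = ["spam", "spam", "ham", "ham"]

# Assumption: bag-of-words counts; the card does not name the vectorizer.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Default parameters, as the card states (scikit-learn defaults include
# alpha=1.0 for smoothing and fit_prior=True).
model = MultinomialNB()
model.fit(X, labels)

params = model.get_params()
print(params["alpha"], params["fit_prior"])
```

With string labels, scikit-learn stores the classes in sorted order, so `model.classes_` lists `ham` before `spam`.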

	## Uses

	- Determining whether emails or SMS messages are spam or legitimate.
	- Enhancing research and developing defensive measures against spammers.

	## Limitations

	The model can produce false positives (legitimate messages flagged as spam) and false negatives (spam classified as legitimate) because of the nature of the training data and its inherent limitations.

	## Training Data

	- Dataset: Spam Mail Classifier dataset from Kaggle.
	- Source URL: https://www.kaggle.com/datasets/mosapabdelghany/spam-mail-classifier/
	- Content: Messages categorized as either spam or ham (legitimate emails or SMS).
	- Size: 1000 email/SMS messages labeled as spam or ham.
	- Preprocessing: Removed missing values and converted the text into vectors.
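
The two preprocessing steps named above might look roughly like this. The column names `text` and `label`, the toy rows, and the use of `CountVectorizer` are all assumptions for the sake of the sketch; the actual dataset schema and vectorizer are not stated in this card.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical column names and toy rows; the real dataset's schema may differ.
df = pd.DataFrame({
    "text": ["Free prize, click now", None, "See you at 10 AM", "Claim your reward"],
    "label": ["spam", "ham", "ham", None],
})

# Step 1: remove rows with a missing message or a missing label.
clean = df.dropna(subset=["text", "label"]).reset_index(drop=True)

# Step 2: convert text into vectors (assumption: token counts).
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(clean["text"])

print(clean.shape, X.shape)
```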

	## Training Procedure

	- Metrics: accuracy, precision, recall, F1.
	- Train/Test Split: 80% train, 20% test.
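
An 80/20 split with the four metrics above could be set up as follows. This is a sketch on a made-up toy corpus: the random seed, the stratified split, the `CountVectorizer`, and fitting the vectorizer on the training split only are assumptions, not details confirmed by the card.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Toy corpus for illustration only (10 messages, 5 per class).
spam = [f"win free prize offer number {i}" for i in range(5)]
ham = [f"meeting notes and agenda item {i}" for i in range(5)]
texts = spam + ham
labels = ["spam"] * 5 + ["ham"] * 5

# 80% train / 20% test. Stratification and the seed are assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)

# Fit the vectorizer on the training split, then reuse it for the test split.
vectorizer = CountVectorizer()
model = MultinomialNB().fit(vectorizer.fit_transform(X_train), y_train)
y_pred = model.predict(vectorizer.transform(X_test))

# Precision, recall, and F1 are reported for the `spam` class.
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, pos_label="spam", average="binary", zero_division=0
)
print(f"accuracy={accuracy:.2%} precision={precision:.2%} "
      f"recall={recall:.2%} f1={f1:.2%}")
```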

	## Evaluation Results

	| Metric | Value |
	| ------ | ----- |
	| Testing Accuracy | 98.48% |
	| Testing Precision (`spam`) | 96.15% |
	| Testing Recall (`spam`) | 93.17% |
	| Testing F1 (`spam`) | 94.64% |
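
As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported F1 can be recomputed from the two rows above it. The variable names are only for this example.

```python
# Reported values for the `spam` class, as fractions.
precision = 0.9615
recall = 0.9317

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(f"{f1:.2%}")  # 94.64%
```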

	## How to Use

	```python
	# `vectorizer` and `model` are assumed to be the fitted vectorizer and the
	# trained MultinomialNB classifier, already loaded in the current session.
	new_emails = [
	    "Congratulations! You've won a free prize. Click the link to claim.",  # Likely spam
	    "Hi, just confirming our meeting for tomorrow at 10 AM. Thanks.",  # Likely not spam
	]

	# Vectorize the new emails using the fitted vectorizer
	new_emails_vectorized = vectorizer.transform(new_emails)

	# Make predictions
	predictions = model.predict(new_emails_vectorized)

	for email, prediction in zip(new_emails, predictions):
	    print(f"\nEmail: '{email}'")
	    print(f"Prediction: {prediction}")
	```
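
The snippet above assumes that `vectorizer` and `model` already exist in memory. One common way to persist and restore a fitted scikit-learn pair is `joblib`, sketched below. The file names are hypothetical and the stand-in objects are fitted on toy data; this card does not state how the repository's artifacts are named or serialized.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Stand-in objects fitted on toy data, for illustration only.
texts = ["free prize click now", "claim free cash",
         "meeting at 10 tomorrow", "see you at lunch"]
labels = ["spam", "spam", "ham", "ham"]
vectorizer = CountVectorizer().fit(texts)
model = MultinomialNB().fit(vectorizer.transform(texts), labels)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, chosen for this example only.
    vec_path = Path(tmp) / "vectorizer.joblib"
    model_path = Path(tmp) / "model.joblib"
    joblib.dump(vectorizer, vec_path)
    joblib.dump(model, model_path)

    # The vectorizer and model must be restored as the pair fitted together,
    # since the model's features follow the vectorizer's vocabulary.
    loaded_vectorizer = joblib.load(vec_path)
    loaded_model = joblib.load(model_path)

restored = loaded_model.predict(loaded_vectorizer.transform(texts))
original = model.predict(vectorizer.transform(texts))
print(list(restored) == list(original))
```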

	## Contact

	For questions or issues, open a GitHub issue or reach out at https://infinitode.netlify.app/forms/contact.