verneylmavt/bert-base-uncased_sentiment-analysis

+---
+license: apache-2.0
+datasets:
+- stanfordnlp/imdb
+language:
+- en
+base_model:
+- google-bert/bert-base-uncased
+pipeline_tag: text-classification
+tags:
+- IMDB
+- Sentiment Analysis
+---
+# BERT-Based Sentiment Analysis Models
+## Model Description
+This repository contains two BERT-based models fine-tuned for sentiment analysis:
+- **BERT-1**: Fine-tuned on the IMDB movie reviews dataset.
+- **BERT-2**: Fine-tuned on a combination of the IMDB movie reviews dataset and a Twitter comments dataset.
+Both models are based on the pre-trained `bert-base-uncased` checkpoint (`google-bert/bert-base-uncased` on the Hugging Face Hub).
+## Intended Use
+These models are intended for binary sentiment analysis of English text: each input is classified as either positive or negative.
+### Loading the Models
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+
+# Hub repository that hosts both checkpoints
+REPO_ID = "verneylmavt/bert-base-uncased_sentiment-analysis"
+
+# Assumption: each checkpoint sits in its own subfolder ("bert-1", "bert-2"), as the original paths suggest.
+# A Hub repo ID has the form "namespace/name", so the subfolder is passed separately.
+# Load BERT-1
+tokenizer_bert1 = AutoTokenizer.from_pretrained(REPO_ID, subfolder="bert-1")
+model_bert1 = AutoModelForSequenceClassification.from_pretrained(REPO_ID, subfolder="bert-1")
+# Load BERT-2
+tokenizer_bert2 = AutoTokenizer.from_pretrained(REPO_ID, subfolder="bert-2")
+model_bert2 = AutoModelForSequenceClassification.from_pretrained(REPO_ID, subfolder="bert-2")
+```
+### Performing Sentiment Analysis
+```python
+from transformers import pipeline
+# Initialize pipelines
+sentiment_pipeline_bert1 = pipeline("sentiment-analysis", model=model_bert1, tokenizer=tokenizer_bert1)
+sentiment_pipeline_bert2 = pipeline("sentiment-analysis", model=model_bert2, tokenizer=tokenizer_bert2)
+# Sample text
+text = "I absolutely loved this product! It exceeded my expectations."
+# Get predictions
+result_bert1 = sentiment_pipeline_bert1(text)
+result_bert2 = sentiment_pipeline_bert2(text)
+print("BERT-1 Prediction:", result_bert1)
+print("BERT-2 Prediction:", result_bert2)
+```
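A text-classification pipeline returns a list with one dict per input, each holding a `label` and a `score`. The plain-Python sketch below shows how two raw logits could turn into such a result via softmax. The logit values and the `ID2LABEL` mapping (0 = negative, 1 = positive) are assumptions for illustration; the label names these checkpoints actually emit depend on their configuration, which this card does not state.

```python
import math

# Hypothetical label mapping; the real one lives in the model's config.
ID2LABEL = {0: "NEGATIVE", 1: "POSITIVE"}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(logits):
    """Pick the most probable class and report it in pipeline-style form."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


# Made-up logits for a clearly positive review.
prediction = to_prediction([-2.0, 3.0])
print(prediction)
```

If a checkpoint's config carries no custom label names, the pipeline would report generic labels such as `LABEL_0` and `LABEL_1` instead.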
+## Training Details
+### BERT-1
+- **Dataset**: [IMDB Movie Reviews Dataset](https://ai.stanford.edu/~amaas/data/sentiment/)
+- **Objective**: Binary sentiment classification (positive/negative)
+- **Optimizer**: AdamW with a learning rate `lr` (value unspecified)
+- **Scheduler**: Linear scheduler with warmup (`get_linear_schedule_with_warmup`)
+- **Epochs**: `num_epochs = 3`
+- **Device**: Trained on GPU if available
+- **Metrics Monitored**: Training loss, training accuracy, testing accuracy per epoch
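The linear scheduler with warmup scales the base learning rate by a factor that rises linearly from 0 to 1 during warmup and then falls linearly back to 0 by the final step. The plain-Python sketch below reproduces that shape so it can be inspected without a GPU. The step counts are hypothetical (the card gives no warmup length for BERT-1), and `lr_multiplier` is an illustrative name, not a Transformers function.

```python
def lr_multiplier(step, num_warmup_steps, num_training_steps):
    """Factor applied to the base learning rate at a given optimizer step."""
    if step < num_warmup_steps:
        # Warmup: ramp linearly from 0 up to 1.
        return step / max(1, num_warmup_steps)
    # Decay: ramp linearly from 1 down to 0 at the last step.
    remaining = num_training_steps - step
    return max(0.0, remaining / max(1, num_training_steps - num_warmup_steps))


# Hypothetical run: 1,000 optimizer steps, the first 100 used for warmup.
TOTAL_STEPS = 1000
WARMUP_STEPS = 100

for step in (0, 50, 100, 550, 1000):
    print(step, lr_multiplier(step, WARMUP_STEPS, TOTAL_STEPS))
```

The effective learning rate at each step is the base `lr` multiplied by this factor, so the peak rate is reached exactly when warmup ends.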
+### BERT-2
+- **Dataset**:
+  - [IMDB Movie Reviews Dataset](https://ai.stanford.edu/~amaas/data/sentiment/)
+  - [Twitter Comment - Sentiment Analysis Dataset](https://www.kaggle.com/datasets/abhi8923shriv/sentiment-analysis-dataset)
+- **Objective**: Binary sentiment classification (positive/negative)
+- **Optimizer**: AdamW with weight decay `0.01`, optimizing only the parameters that require gradients
+- **Scheduler**: Linear scheduler with warmup over the first `10%` of total training steps
+- **Gradient Clipping**: Applied with `max_norm=1.0`
+- **Early Stopping**: Implemented with a patience of `2` epochs without improvement in validation loss
+- **Epochs**: `num_epochs = 3`; training may end sooner if early stopping triggers
+- **Device**: Trained on GPU if available
+- **Metrics Monitored**: Training loss, training accuracy, validation loss, validation accuracy per epoch
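Two of the BERT-2 settings listed above can be illustrated without any framework: clipping a gradient vector to a maximum L2 norm of `1.0`, and early stopping with a patience of `2` on validation loss. The names `clip_by_norm` and `EarlyStopping` and the loss values are illustrative assumptions; the author's training script is not shown in this card, and a real PyTorch run would typically call `torch.nn.utils.clip_grad_norm_` on the model parameters instead.

```python
import math


def clip_by_norm(grads, max_norm=1.0):
    """Rescale a gradient vector so its L2 norm does not exceed max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)  # already within the limit, leave unchanged
    scale = max_norm / total_norm
    return [g * scale for g in grads]


class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without a lower validation loss."""

    def __init__(self, patience=2):
        self.patience = patience
        self.best_loss = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best_loss:
            # New best: remember it and reset the counter.
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# A gradient with L2 norm 5.0 is scaled down to norm 1.0; its direction is preserved.
clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)
print(clipped)

# Made-up validation losses where the first epoch is the best one.
stopper = EarlyStopping(patience=2)
decisions = [stopper.should_stop(loss) for loss in (0.40, 0.45, 0.43)]
print(decisions)
```

Under this sketch's definition of patience, the stop signal can first fire at the end of the third epoch, which is also when a `num_epochs = 3` run ends; early stopping would only shorten training if the epoch budget were larger.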
+## Limitations and Biases
+- **Data Bias**: The models are trained on movie reviews and, for BERT-2, Twitter comments; these datasets may carry demographic or cultural biases.
+- **Language Support**: The models support English text only.
+- **Generalization**: Performance may degrade on text significantly different from the training data (e.g., slang, jargon).
+- **Ethical Considerations**: Users should be cautious of potential biases in predictions and should not use the model for critical decisions without human oversight.
+## License
+The models are distributed under the same license as the original `bert-base-uncased` model ([Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)).
+## Acknowledgements
+- Thanks to the Hugging Face team for providing the Transformers library and model hosting.
+- The IMDB dataset is made available by [Maas et al.](https://ai.stanford.edu/~amaas/data/sentiment/) under a [Creative Commons Attribution-NonCommercial 3.0 Unported License](https://creativecommons.org/licenses/by-nc/3.0/).
+---
+**Disclaimer**: The models are provided "as is" without warranty of any kind. The author is not responsible for any outcomes resulting from the use of these models.