SmartReview: DistilRoBERTa for Smartphone Review Sentiment Analysis

Model Description

SmartReview is a domain-adapted DistilRoBERTa model fine-tuned for sentiment analysis of smartphone and electronics reviews.

The model achieves 88.23% accuracy on 3-class sentiment classification (Positive, Neutral, Negative) and was specifically trained on 67,987 Amazon smartphone reviews.

🎯 Key Features

  • ✅ Domain-Adapted: Pretrained on 61,553 smartphone reviews via Masked Language Modeling
  • ✅ Efficient: Only 82M parameters (34% smaller than RoBERTa-base)
  • ✅ Accurate: 88.23% overall accuracy, 94.88% F1 on positive sentiment
  • ✅ Fast: ~50ms inference time per review (see the timing sketch after this list)
  • ✅ Specialized: Understands product review vocabulary and context
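
The ~50ms latency figure depends on hardware, so treat it as indicative. A minimal sketch for estimating per-review latency on your own machine; the warm-up pass and run count are illustrative choices:

import time
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "Abhishek86798/smartreview-distilroberta-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

text = "Battery life is excellent but camera quality is poor"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Warm-up pass so one-time setup cost doesn't skew the measurement
with torch.no_grad():
    model(**inputs)

# Average over repeated runs for a stable per-review estimate
n_runs = 50
start = time.perf_counter()
with torch.no_grad():
    for _ in range(n_runs):
        model(**inputs)
per_review_ms = (time.perf_counter() - start) / n_runs * 1000
print(f"Average latency: {per_review_ms:.1f} ms per review")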

πŸ—οΈ Architecture

  • Base Model: distilroberta-base (82M parameters)
  • Task: 3-class sequence classification
  • Classes (see the label-mapping sketch after this list):
    • LABEL_0: Positive
    • LABEL_1: Neutral
    • LABEL_2: Negative
  • Max Length: 512 tokens
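
The model ships with the generic LABEL_0/1/2 names, so pipeline outputs read "LABEL_0" rather than "Positive". A minimal sketch, assuming you want readable names attached locally via the model config:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "Abhishek86798/smartreview-distilroberta-sentiment"
)

# Attach readable names to the LABEL_0/1/2 indices listed above
model.config.id2label = {0: "Positive", 1: "Neutral", 2: "Negative"}
model.config.label2id = {"Positive": 0, "Neutral": 1, "Negative": 2}

print(model.config.id2label[0])  # Positive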

📊 Training Approach

Two-Phase Training (a condensed code sketch follows this list):

  1. Phase 1 - Domain Adaptation (MLM)

    • Task: Masked Language Modeling
    • Data: 61,553 smartphone reviews
    • Duration: 66 minutes
    • Result: 99.99% masked-token prediction accuracy on domain vocabulary
  2. Phase 2 - Sentiment Fine-tuning

    • Task: 3-class classification
    • Data: 39,044 training samples
    • Duration: 67 minutes
    • Optimizer: AdamW (lr=2e-5, weight_decay=0.01)
    • Hardware: NVIDIA RTX 3050 (4GB)
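
A condensed sketch of the two phases with the Hugging Face Trainer API. The names mlm_dataset and clf_dataset are illustrative and assumed to be pre-tokenized datasets; mlm_probability=0.15 is the common default, not a value reported above, and unlisted arguments are left at their defaults:

from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")

# Phase 1: domain-adaptive pretraining with masked language modeling
mlm_model = AutoModelForMaskedLM.from_pretrained("distilroberta-base")
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="mlm-adapted"),
    train_dataset=mlm_dataset,  # assumed: 61,553 tokenized smartphone reviews
    data_collator=collator,
).train()
mlm_model.save_pretrained("mlm-adapted")
tokenizer.save_pretrained("mlm-adapted")

# Phase 2: sentiment fine-tuning on top of the domain-adapted encoder
clf_model = AutoModelForSequenceClassification.from_pretrained("mlm-adapted", num_labels=3)
Trainer(
    model=clf_model,
    args=TrainingArguments(
        output_dir="sentiment",
        learning_rate=2e-5,
        weight_decay=0.01,
        num_train_epochs=5,
    ),
    train_dataset=clf_dataset,  # assumed: 39,044 labeled training samples
).train()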

📈 Performance

Overall Metrics (Test Set: 8,367 reviews)

Metric               Score
Accuracy             88.23%
Precision (Macro)    72.38%
Recall (Macro)       72.39%
F1 (Macro)           72.35%
F1 (Weighted)        88.13%

Per-Class Performance

Class      Precision   Recall    F1-Score     Support
Positive   95.39%      94.38%    94.88% ✅    5,481
Neutral    37.79%      35.02%    36.35% ⚠️    614
Negative   83.96%      87.76%    85.82% ✅    2,272

Note: Neutral class F1 is lower due to severe class imbalance (only 7.4% of training data). This is expected in product reviews where opinions are rarely truly neutral.
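
One common mitigation, not applied in this release, is weighting the loss by inverse class frequency during fine-tuning. A minimal sketch with a custom Trainer subclass, using weights derived from the training distribution reported in the Dataset section:

import torch
from transformers import Trainer

# Inverse-frequency class weights from the training distribution
# (Positive 57.5%, Neutral 7.4%, Negative 27.4% -- see the Dataset section)
class_weights = torch.tensor([1.0 / 0.575, 1.0 / 0.074, 1.0 / 0.274])

class WeightedTrainer(Trainer):
    """Trainer variant that up-weights rare classes in the cross-entropy loss."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        weights = class_weights.to(outputs.logits.device)
        loss = torch.nn.functional.cross_entropy(outputs.logits, labels, weight=weights)
        return (loss, outputs) if return_outputs else loss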

Confusion Matrix

                PREDICTED
           Pos    Neu    Neg
ACTUAL
Pos      5,173   175    133    (94.4% correct)
Neu        151   215    248    (35.0% correct)
Neg         99   179  1,994    (87.8% correct)
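
A sketch for reproducing these numbers on your own labeled data with scikit-learn; y_true and y_pred are assumed to be lists of class indices collected from the model:

from sklearn.metrics import classification_report, confusion_matrix

labels = ["Positive", "Neutral", "Negative"]

# y_true / y_pred: assumed lists of class indices (0=Positive, 1=Neutral, 2=Negative)
print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=labels, digits=4))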

🚀 Usage

Quick Start

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "Abhishek86798/smartreview-distilroberta-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example review
text = "Battery life is excellent but camera quality is poor"

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    probabilities = torch.softmax(logits, dim=-1)
    prediction = logits.argmax(-1).item()

# Map to labels
labels = ["Positive", "Neutral", "Negative"]
sentiment = labels[prediction]
confidence = probabilities[0][prediction].item()

print(f"Sentiment: {sentiment}")
print(f"Confidence: {confidence:.2%}")

Output:

Sentiment: Positive
Confidence: 85.34%

Using Pipeline

from transformers import pipeline

# Create sentiment analysis pipeline
classifier = pipeline(
    "sentiment-analysis",
    model="Abhishek86798/smartreview-distilroberta-sentiment",
    tokenizer="Abhishek86798/smartreview-distilroberta-sentiment"
)

# Single prediction
result = classifier("Amazing phone! Battery lasts all day.")
print(result)
# [{'label': 'LABEL_0', 'score': 0.9876}]  # LABEL_0 = Positive

# Batch prediction
reviews = [
    "Amazing phone! Battery lasts all day.",
    "Terrible. Phone broke after one week.",
    "It's okay, nothing special."
]

results = classifier(reviews)
for review, result in zip(reviews, results):
    print(f"{review} β†’ {result['label']} ({result['score']:.2%})")

Detailed Prediction Function

def predict_sentiment_detailed(text, model, tokenizer):
    """Return a detailed sentiment prediction with all class probabilities.

    Args:
        text (str): review text to classify.
        model: fine-tuned AutoModelForSequenceClassification.
        tokenizer: matching AutoTokenizer.

    Returns:
        dict with text, sentiment, confidence, and per-class probabilities.
    """
    # Tokenize
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
        padding=True
    )
    
    # Predict
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probabilities = torch.softmax(logits, dim=-1)[0]
    
    # Get results
    labels = ["Positive", "Neutral", "Negative"]
    prediction_idx = logits.argmax(-1).item()
    
    return {
        "text": text,
        "sentiment": labels[prediction_idx],
        "confidence": probabilities[prediction_idx].item(),
        "probabilities": {
            "positive": probabilities[0].item(),
            "neutral": probabilities[1].item(),
            "negative": probabilities[2].item()
        }
    }

# Example
result = predict_sentiment_detailed(
    "Screen is bright and clear, love the display!",
    model,
    tokenizer
)

print(f"Sentiment: {result['sentiment']}")
print(f"Confidence: {result['confidence']:.2%}")
print(f"Probabilities:")
for sentiment, prob in result['probabilities'].items():
    print(f"  {sentiment.capitalize()}: {prob:.2%}")

📊 Dataset

Training Data

  • Source: Amazon Cell Phones & Accessories Reviews (Kaggle)
  • Time Period: 2015-2019
  • Total Reviews: 67,987
  • Products: 721 smartphone models

Split Distribution

Split        Reviews   Percentage
Training     39,044    57.4%
Validation   8,367     12.3%
Test         8,367     12.3%

(Percentages are relative to the full 67,987-review corpus.)

Sentiment Distribution

Sentiment   Count    Percentage   Rating Mapping
Positive    32,615   57.5%        4-5 stars
Neutral     4,200    7.4%         3 stars
Negative    15,572   27.4%        1-2 stars
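
When preparing data in the same format, the star-rating mapping above reduces to a small helper; a minimal sketch:

def rating_to_sentiment(stars: int) -> int:
    """Map a 1-5 star rating to a class index (0=Positive, 1=Neutral, 2=Negative)."""
    if stars >= 4:
        return 0  # Positive: 4-5 stars
    if stars == 3:
        return 1  # Neutral: 3 stars
    return 2      # Negative: 1-2 stars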

🎯 Intended Use

✅ Recommended Use Cases

  • Sentiment analysis of smartphone/electronics reviews
  • Product feedback analysis for e-commerce platforms
  • Customer satisfaction monitoring
  • Review summarization preprocessing
  • Aspect-based sentiment analysis (as part of ABSA pipeline)

❌ Out-of-Scope Use

  • Non-English reviews (model trained on English only)
  • Non-product reviews (news articles, social media posts, etc.)
  • Offensive content detection
  • Sarcasm detection (known limitation)
  • Real-time chat/conversation analysis

⚠️ Limitations

  1. Neutral Class Performance: F1-score of 36.35% due to severe class imbalance (only 7.4% of training data). The model tends to classify neutral reviews as positive or negative (a low-confidence flagging sketch follows this list).

  2. Sarcasm Detection: Model struggles with sarcastic language. Example: "Great, another phone that breaks after a week" may be classified as positive.

  3. Domain Specificity: Trained specifically on smartphone reviews. Performance may degrade on other product categories without domain adaptation.

  4. Context-Free Predictions: Doesn't consider user expectations or product price range. "Battery lasts 4 hours" might be negative for smartphones but positive for smartwatches.

  5. Mixed Sentiments: Reviews with multiple conflicting opinions may be misclassified based on the dominant sentiment.
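
Given limitations 1 and 5, a practical safeguard is to flag low-confidence predictions for manual review. A minimal sketch built on the predict_sentiment_detailed function from the Usage section; the 0.7 threshold is an illustrative value, not one tuned for this model:

CONFIDENCE_THRESHOLD = 0.7  # illustrative; tune on your own validation data

def predict_with_fallback(text, model, tokenizer):
    """Classify a review, flagging low-confidence results for human review."""
    result = predict_sentiment_detailed(text, model, tokenizer)
    result["needs_review"] = result["confidence"] < CONFIDENCE_THRESHOLD
    return result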


🔧 Training Details

Hyperparameters

Model:
  base_model: distilroberta-base
  num_labels: 3
  max_position_embeddings: 512
  hidden_size: 768
  num_hidden_layers: 6
  num_attention_heads: 12
  dropout: 0.1

Training:
  learning_rate: 2e-5
  batch_size: 4
  gradient_accumulation_steps: 4
  effective_batch_size: 16
  epochs: 5
  warmup_steps: 500
  weight_decay: 0.01
  optimizer: AdamW
  fp16: true
  max_grad_norm: 1.0

Hardware:
  gpu: NVIDIA RTX 3050 (4GB VRAM)
  memory_usage: ~2.5 GB
  training_time: 67 minutes
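
These settings map directly onto TrainingArguments; a sketch assuming an epoch-level evaluation schedule and a compute_metrics function that reports accuracy (output_dir is an illustrative name):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="smartreview-sentiment",   # illustrative path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,        # effective batch size 16
    num_train_epochs=5,
    warmup_steps=500,
    weight_decay=0.01,
    fp16=True,                            # mixed precision to fit in 4 GB VRAM
    max_grad_norm=1.0,
    eval_strategy="epoch",                # `evaluation_strategy` in older releases
    save_strategy="epoch",
    load_best_model_at_end=True,          # retains the best checkpoint (epoch 4 here)
    metric_for_best_model="accuracy",
)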

Training Loss Progression

Epoch   Train Loss   Val Loss   Val Accuracy
1       0.3832       0.3724     87.22%
2       0.2833       0.3274     88.17%
3       0.1935       0.3740     88.22%
4       0.1661       0.4177     88.68%
5       0.1328       0.4728     88.38%

Best Model: Epoch 4 (highest validation accuracy)


🌟 Comparison with Other Models

Model                Parameters   Accuracy   Training Time   GPU Memory
SVM (TF-IDF)         -            78.4%      <5 min          <1 GB
LSTM                 2M           82.3%      ~45 min         ~1.5 GB
BERT-base            110M         85.7%      ~90 min         ~3.2 GB
SmartReview (Ours)   82M          88.23%     67 min          2.5 GB
RoBERTa-base         125M         ~89-90%    ~120 min        ~3.8 GB

Key Advantage: Achieves competitive accuracy with 34% fewer parameters and 44% faster training than RoBERTa-base.


πŸ“ Bias and Fairness

  • Model trained on Amazon reviews from 2015-2019
  • May reflect temporal biases (older smartphone features/expectations)
  • Performance may vary across different price ranges and brands
  • Dataset primarily contains English reviews from US market
  • Recommended to validate on your specific use case and domain

📚 Citation

If you use this model in your research or applications, please cite:

@misc{smartreview2025,
  author = {Abhishek},
  title = {SmartReview: Efficient Aspect-Based Sentiment Analysis using Domain-Adapted DistilRoBERTa},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/Abhishek86798/smartreview-distilroberta-sentiment}}
}

🔗 Additional Resources

  • Project Repository: GitHub - SmartReview
  • Full Technical Report: Available in repository
  • Training Notebooks: 6 complete Jupyter notebooks
  • ABSA Pipeline: Complete aspect-based sentiment analysis system
  • Contact: [Your Email]

👥 Model Card Authors

Abhishek (Abhishek86798)


📄 License

This model is released under the Apache License 2.0.


πŸ™ Acknowledgments

  • Base Model: distilroberta-base by Hugging Face
  • Dataset: Amazon Reviews dataset (Kaggle)
  • Framework: Hugging Face Transformers
  • Inspiration: Research in domain adaptation and efficient NLP models

📞 Support

For issues, questions, or feedback:

  • Open an issue on GitHub
  • Contact: [Your Email]
  • Hugging Face Discussions

Model Version: 1.0
Last Updated: November 10, 2025
Status: Production-Ready ✅


Making advanced sentiment analysis accessible for everyone! 🚀
