SmartReview: DistilRoBERTa for Smartphone Review Sentiment Analysis

Model Description

SmartReview is a domain-adapted DistilRoBERTa model fine-tuned for sentiment analysis of smartphone and electronics reviews.

The model achieves 88.23% accuracy on 3-class sentiment classification (Positive, Neutral, Negative) and was specifically trained on 67,987 Amazon smartphone reviews.

🎯 Key Features

  • ✅ Domain-Adapted: Pretrained on 61,553 smartphone reviews via Masked Language Modeling
  • ✅ Efficient: Only 82M parameters (34% smaller than RoBERTa-base)
  • ✅ Accurate: 88.23% overall accuracy, 94.88% F1 on positive sentiment
  • ✅ Fast: ~50ms inference time per review (see the timing sketch after this list)
  • ✅ Specialized: Understands product review vocabulary and context
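
The ~50ms latency figure depends on hardware, so treat it as indicative. A minimal sketch for estimating per-review latency on your own machine; the warm-up pass and run count are illustrative choices:

import time
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "Abhishek86798/smartreview-distilroberta-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

text = "Battery life is excellent but camera quality is poor"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Warm-up pass so one-time setup cost doesn't skew the measurement
with torch.no_grad():
    model(**inputs)

# Average over repeated runs for a stable per-review estimate
n_runs = 50
start = time.perf_counter()
with torch.no_grad():
    for _ in range(n_runs):
        model(**inputs)
per_review_ms = (time.perf_counter() - start) / n_runs * 1000
print(f"Average latency: {per_review_ms:.1f} ms per review")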

πŸ—οΈ Architecture

  • Base Model: distilroberta-base (82M parameters)
  • Task: 3-class sequence classification
  • Classes (see the label-mapping sketch after this list):
    • LABEL_0: Positive
    • LABEL_1: Neutral
    • LABEL_2: Negative
  • Max Length: 512 tokens
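
The model ships with the generic LABEL_0/1/2 names, so pipeline outputs read "LABEL_0" rather than "Positive". A minimal sketch, assuming you want readable names attached locally via the model config:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "Abhishek86798/smartreview-distilroberta-sentiment"
)

# Attach readable names to the LABEL_0/1/2 indices listed above
model.config.id2label = {0: "Positive", 1: "Neutral", 2: "Negative"}
model.config.label2id = {"Positive": 0, "Neutral": 1, "Negative": 2}

print(model.config.id2label[0])  # Positive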

📊 Training Approach

Two-Phase Training (a condensed code sketch follows this list):

  1. Phase 1 - Domain Adaptation (MLM)

    • Task: Masked Language Modeling
    • Data: 61,553 smartphone reviews
    • Duration: 66 minutes
    • Result: 99.99% masked-token prediction accuracy on domain vocabulary
  2. Phase 2 - Sentiment Fine-tuning

    • Task: 3-class classification
    • Data: 39,044 training samples
    • Duration: 67 minutes
    • Optimizer: AdamW (lr=2e-5, weight_decay=0.01)
    • Hardware: NVIDIA RTX 3050 (4GB)
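
A condensed sketch of the two phases with the Hugging Face Trainer API. The names mlm_dataset and clf_dataset are illustrative and assumed to be pre-tokenized datasets; mlm_probability=0.15 is the common default, not a value reported above, and unlisted arguments are left at their defaults:

from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("distilroberta-base")

# Phase 1: domain-adaptive pretraining with masked language modeling
mlm_model = AutoModelForMaskedLM.from_pretrained("distilroberta-base")
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True, mlm_probability=0.15)
Trainer(
    model=mlm_model,
    args=TrainingArguments(output_dir="mlm-adapted"),
    train_dataset=mlm_dataset,  # assumed: 61,553 tokenized smartphone reviews
    data_collator=collator,
).train()
mlm_model.save_pretrained("mlm-adapted")
tokenizer.save_pretrained("mlm-adapted")

# Phase 2: sentiment fine-tuning on top of the domain-adapted encoder
clf_model = AutoModelForSequenceClassification.from_pretrained("mlm-adapted", num_labels=3)
Trainer(
    model=clf_model,
    args=TrainingArguments(
        output_dir="sentiment",
        learning_rate=2e-5,
        weight_decay=0.01,
        num_train_epochs=5,
    ),
    train_dataset=clf_dataset,  # assumed: 39,044 labeled training samples
).train()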

📈 Performance

Overall Metrics (Test Set: 8,367 reviews)

Metric               Score
Accuracy             88.23%
Precision (Macro)    72.38%
Recall (Macro)       72.39%
F1 (Macro)           72.35%
F1 (Weighted)        88.13%

Per-Class Performance

Class      Precision   Recall    F1-Score     Support
Positive   95.39%      94.38%    94.88% ✅    5,481
Neutral    37.79%      35.02%    36.35% ⚠️    614
Negative   83.96%      87.76%    85.82% ✅    2,272

Note: Neutral class F1 is lower due to severe class imbalance (only 7.4% of training data). This is expected in product reviews where opinions are rarely truly neutral.
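
One common mitigation, not applied in this release, is weighting the loss by inverse class frequency during fine-tuning. A minimal sketch with a custom Trainer subclass, using weights derived from the training distribution reported in the Dataset section:

import torch
from transformers import Trainer

# Inverse-frequency class weights from the training distribution
# (Positive 57.5%, Neutral 7.4%, Negative 27.4% -- see the Dataset section)
class_weights = torch.tensor([1.0 / 0.575, 1.0 / 0.074, 1.0 / 0.274])

class WeightedTrainer(Trainer):
    """Trainer variant that up-weights rare classes in the cross-entropy loss."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        weights = class_weights.to(outputs.logits.device)
        loss = torch.nn.functional.cross_entropy(outputs.logits, labels, weight=weights)
        return (loss, outputs) if return_outputs else loss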

Confusion Matrix

                PREDICTED
           Pos    Neu    Neg
ACTUAL
Pos      5,173   175    133    (94.4% correct)
Neu        151   215    248    (35.0% correct)
Neg         99   179  1,994    (87.8% correct)
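
A sketch for reproducing these numbers on your own labeled data with scikit-learn; y_true and y_pred are assumed to be lists of class indices collected from the model:

from sklearn.metrics import classification_report, confusion_matrix

labels = ["Positive", "Neutral", "Negative"]

# y_true / y_pred: assumed lists of class indices (0=Positive, 1=Neutral, 2=Negative)
print(confusion_matrix(y_true, y_pred))
print(classification_report(y_true, y_pred, target_names=labels, digits=4))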

🚀 Usage

Quick Start

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "Abhishek86798/smartreview-distilroberta-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example review
text = "Battery life is excellent but camera quality is poor"

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    probabilities = torch.softmax(logits, dim=-1)
    prediction = logits.argmax(-1).item()

# Map to labels
labels = ["Positive", "Neutral", "Negative"]
sentiment = labels[prediction]
confidence = probabilities[0][prediction].item()

print(f"Sentiment: {sentiment}")
print(f"Confidence: {confidence:.2%}")

Output:

Sentiment: Positive
Confidence: 85.34%

Using Pipeline

from transformers import pipeline

# Create sentiment analysis pipeline
classifier = pipeline(
    "sentiment-analysis",
    model="Abhishek86798/smartreview-distilroberta-sentiment",
    tokenizer="Abhishek86798/smartreview-distilroberta-sentiment"
)

# Single prediction
result = classifier("Amazing phone! Battery lasts all day.")
print(result)
# [{'label': 'LABEL_0', 'score': 0.9876}]  # LABEL_0 = Positive

# Batch prediction
reviews = [
    "Amazing phone! Battery lasts all day.",
    "Terrible. Phone broke after one week.",
    "It's okay, nothing special."
]

results = classifier(reviews)
for review, result in zip(reviews, results):
    print(f"{review} β†’ {result['label']} ({result['score']:.2%})")

Detailed Prediction Function

def predict_sentiment_detailed(text, model, tokenizer):
    """Return a detailed sentiment prediction with all class probabilities.

    Args:
        text (str): review text to classify.
        model: fine-tuned AutoModelForSequenceClassification.
        tokenizer: matching AutoTokenizer.

    Returns:
        dict with text, sentiment, confidence, and per-class probabilities.
    """
    # Tokenize
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=512,
        padding=True
    )
    
    # Predict
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probabilities = torch.softmax(logits, dim=-1)[0]
    
    # Get results
    labels = ["Positive", "Neutral", "Negative"]
    prediction_idx = logits.argmax(-1).item()
    
    return {
        "text": text,
        "sentiment": labels[prediction_idx],
        "confidence": probabilities[prediction_idx].item(),
        "probabilities": {
            "positive": probabilities[0].item(),
            "neutral": probabilities[1].item(),
            "negative": probabilities[2].item()
        }
    }

# Example
result = predict_sentiment_detailed(
    "Screen is bright and clear, love the display!",
    model,
    tokenizer
)

print(f"Sentiment: {result['sentiment']}")
print(f"Confidence: {result['confidence']:.2%}")
print(f"Probabilities:")
for sentiment, prob in result['probabilities'].items():
    print(f"  {sentiment.capitalize()}: {prob:.2%}")

📊 Dataset

Training Data

  • Source: Amazon Cell Phones & Accessories Reviews (Kaggle)
  • Time Period: 2015-2019
  • Total Reviews: 67,987
  • Products: 721 smartphone models

Split Distribution

Split        Reviews   Percentage
Training     39,044    57.4%
Validation   8,367     12.3%
Test         8,367     12.3%

(Percentages are relative to the full 67,987-review corpus.)

Sentiment Distribution

Sentiment   Count    Percentage   Rating Mapping
Positive    32,615   57.5%        4-5 stars
Neutral     4,200    7.4%         3 stars
Negative    15,572   27.4%        1-2 stars
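
When preparing data in the same format, the star-rating mapping above reduces to a small helper; a minimal sketch:

def rating_to_sentiment(stars: int) -> int:
    """Map a 1-5 star rating to a class index (0=Positive, 1=Neutral, 2=Negative)."""
    if stars >= 4:
        return 0  # Positive: 4-5 stars
    if stars == 3:
        return 1  # Neutral: 3 stars
    return 2      # Negative: 1-2 stars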

🎯 Intended Use

✅ Recommended Use Cases

  • Sentiment analysis of smartphone/electronics reviews
  • Product feedback analysis for e-commerce platforms
  • Customer satisfaction monitoring
  • Review summarization preprocessing
  • Aspect-based sentiment analysis (as part of ABSA pipeline)

❌ Out-of-Scope Use

  • Non-English reviews (model trained on English only)
  • Non-product reviews (news articles, social media posts, etc.)
  • Offensive content detection
  • Sarcasm detection (known limitation)
  • Real-time chat/conversation analysis

⚠️ Limitations

  1. Neutral Class Performance: F1-score of 36.35% due to severe class imbalance (only 7.4% of training data). The model tends to classify neutral reviews as positive or negative (a low-confidence flagging sketch follows this list).

  2. Sarcasm Detection: Model struggles with sarcastic language. Example: "Great, another phone that breaks after a week" may be classified as positive.

  3. Domain Specificity: Trained specifically on smartphone reviews. Performance may degrade on other product categories without domain adaptation.

  4. Context-Free Predictions: Doesn't consider user expectations or product price range. "Battery lasts 4 hours" might be negative for smartphones but positive for smartwatches.

  5. Mixed Sentiments: Reviews with multiple conflicting opinions may be misclassified based on the dominant sentiment.
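
Given limitations 1 and 5, a practical safeguard is to flag low-confidence predictions for manual review. A minimal sketch built on the predict_sentiment_detailed function from the Usage section; the 0.7 threshold is an illustrative value, not one tuned for this model:

CONFIDENCE_THRESHOLD = 0.7  # illustrative; tune on your own validation data

def predict_with_fallback(text, model, tokenizer):
    """Classify a review, flagging low-confidence results for human review."""
    result = predict_sentiment_detailed(text, model, tokenizer)
    result["needs_review"] = result["confidence"] < CONFIDENCE_THRESHOLD
    return result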


🔧 Training Details

Hyperparameters

Model:
  base_model: distilroberta-base
  num_labels: 3
  max_position_embeddings: 512
  hidden_size: 768
  num_hidden_layers: 6
  num_attention_heads: 12
  dropout: 0.1

Training:
  learning_rate: 2e-5
  batch_size: 4
  gradient_accumulation_steps: 4
  effective_batch_size: 16
  epochs: 5
  warmup_steps: 500
  weight_decay: 0.01
  optimizer: AdamW
  fp16: true
  max_grad_norm: 1.0

Hardware:
  gpu: NVIDIA RTX 3050 (4GB VRAM)
  memory_usage: ~2.5 GB
  training_time: 67 minutes
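
These settings map directly onto TrainingArguments; a sketch assuming an epoch-level evaluation schedule and a compute_metrics function that reports accuracy (output_dir is an illustrative name):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="smartreview-sentiment",   # illustrative path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,        # effective batch size 16
    num_train_epochs=5,
    warmup_steps=500,
    weight_decay=0.01,
    fp16=True,                            # mixed precision to fit in 4 GB VRAM
    max_grad_norm=1.0,
    eval_strategy="epoch",                # `evaluation_strategy` in older releases
    save_strategy="epoch",
    load_best_model_at_end=True,          # retains the best checkpoint (epoch 4 here)
    metric_for_best_model="accuracy",
)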

Training Loss Progression

Epoch   Train Loss   Val Loss   Val Accuracy
1       0.3832       0.3724     87.22%
2       0.2833       0.3274     88.17%
3       0.1935       0.3740     88.22%
4       0.1661       0.4177     88.68%
5       0.1328       0.4728     88.38%

Best Model: Epoch 4 (highest validation accuracy)


🌟 Comparison with Other Models

Model                Parameters   Accuracy   Training Time   GPU Memory
SVM (TF-IDF)         -            78.4%      <5 min          <1 GB
LSTM                 2M           82.3%      ~45 min         ~1.5 GB
BERT-base            110M         85.7%      ~90 min         ~3.2 GB
SmartReview (Ours)   82M          88.23%     67 min          2.5 GB
RoBERTa-base         125M         ~89-90%    ~120 min        ~3.8 GB

Key Advantage: Achieves competitive accuracy with 34% fewer parameters and 44% faster training than RoBERTa-base.


πŸ“ Bias and Fairness

  • Model trained on Amazon reviews from 2015-2019
  • May reflect temporal biases (older smartphone features/expectations)
  • Performance may vary across different price ranges and brands
  • Dataset primarily contains English reviews from US market
  • Recommended to validate on your specific use case and domain

📚 Citation

If you use this model in your research or applications, please cite:

@misc{smartreview2025,
  author = {Abhishek},
  title = {SmartReview: Efficient Aspect-Based Sentiment Analysis using Domain-Adapted DistilRoBERTa},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/Abhishek86798/smartreview-distilroberta-sentiment}}
}

🔗 Additional Resources

  • Project Repository: GitHub - SmartReview
  • Full Technical Report: Available in repository
  • Training Notebooks: 6 complete Jupyter notebooks
  • ABSA Pipeline: Complete aspect-based sentiment analysis system
  • Contact: [Your Email]

👥 Model Card Authors

Abhishek (Abhishek86798)


📄 License

This model is released under the Apache License 2.0.


πŸ™ Acknowledgments

  • Base Model: distilroberta-base by Hugging Face
  • Dataset: Amazon Reviews dataset (Kaggle)
  • Framework: Hugging Face Transformers
  • Inspiration: Research in domain adaptation and efficient NLP models

📞 Support

For issues, questions, or feedback:

  • Open an issue on GitHub
  • Contact: [Your Email]
  • Hugging Face Discussions

Model Version: 1.0
Last Updated: November 10, 2025
Status: Production-Ready ✅


Making advanced sentiment analysis accessible for everyone! 🚀
