CS410 Final Project -- Amazon Review Summarization and Sentiment Analysis

Dataset

The dataset is available here. We use four categories: All_Beauty, Digital_Music, Handmade_Product, and Health_and_Personal_Care.

Workflow

Data preprocessing -> Sentiment classification (group positive and negative reviews to proceed) -> Fine-tuning summarization model on training data -> Evaluate summarization model on test data.
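The grouping step in this workflow can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `classify` is a hypothetical stand-in for the sentiment classifier, `toy_classify` is a keyword rule used only to make the example runnable, and the label names "positive" and "negative" are assumed.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def group_by_sentiment(
    reviews: Iterable[str],
    classify: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Split reviews into groups keyed by the label that `classify` returns."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for text in reviews:
        groups[classify(text)].append(text)
    return dict(groups)


def toy_classify(text: str) -> str:
    # Hypothetical keyword rule standing in for a real sentiment model.
    return "negative" if "bad" in text.lower() else "positive"


reviews = ["Great lotion, smells nice", "Bad packaging, arrived broken"]
groups = group_by_sentiment(reviews, toy_classify)
```

Passing the classifier in as a callable keeps the grouping logic independent of the model, so each group can then be handed to the summarization stage separately.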

Models

  • Sentiment classification uses pre-trained DistilBERT.
  • Review summarization uses facebook/bart-large-cnn, fine-tuned separately on each category of the review dataset.

Layout

  • checkpoints folder contains the fine-tuned model for each dataset category.
  • src folder contains source code.
  • docs folder records experiment results.

Usage

Run sentiment classification

python src/classification.py [category]

Run fine-tuning

python src/finetune.py [category]
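Both commands above take a category argument. How the scripts actually parse it is not shown here; the following is a hypothetical sketch of validating that argument against the four categories listed under Dataset. The names `CATEGORIES` and `parse_category` are assumptions made for illustration.

```python
import argparse
from typing import List, Optional

# The four categories named in the Dataset section.
CATEGORIES = [
    "All_Beauty",
    "Digital_Music",
    "Handmade_Product",
    "Health_and_Personal_Care",
]


def parse_category(argv: Optional[List[str]] = None) -> str:
    """Return the category given on the command line, rejecting unknown names."""
    parser = argparse.ArgumentParser(description="Select a review category.")
    parser.add_argument("category", choices=CATEGORIES, help="dataset category")
    return parser.parse_args(argv).category
```

With `choices` set, argparse exits with a usage message when the category is misspelled, rather than failing later when the data file is not found.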

To run summarization, first obtain an Anthropic Claude API key and export it:

export ANTHROPIC_API_KEY='your-api-key-here'

then

python src/summarization.py
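Since summarization depends on the exported key, a script can check for it before doing any work. The sketch below is an assumption about how such a check might look, not the code in src/summarization.py; `get_api_key` is a hypothetical helper.

```python
import os
from typing import Mapping


def get_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Read ANTHROPIC_API_KEY from the environment, failing early if it is missing."""
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; export it before running summarization."
        )
    return key
```

Failing at startup gives a clear message instead of an authentication error partway through a run.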