---
library_name: transformers
tags:
- finance
license: mit
datasets:
- Recompense/amazon-appliances-lite-data
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
---

# Model Card for Midas-pricer

Predicts prices based on a product description.

## Model Details

### Model Description

This model predicts the prices of Amazon appliance listings from a product description.

- **Developed by:** https://huggingface.co/Recompense
- **Model type:** Transformer (causal, autoregressive)
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** meta-llama/Llama-3.1-8B-Instruct

### Model Sources

- **Repository:** [Recompense/Midas-pricer](https://huggingface.co/Recompense/Midas-pricer)

## Uses

- Primary use case: generating estimated retail prices for household appliances from textual descriptions.
- Example applications:
  - Assisting e-commerce teams in setting competitive price points
  - Supporting market-analysis dashboards with on-the-fly price estimates
- Not intended for: financial advice or investment decisions

### Out-of-Scope Use

- Predicting prices outside the appliances domain (e.g., electronics, furniture, vehicles) will likely yield unreliable results.
- Using this model for any price-sensitive or regulatory decision without human oversight is discouraged.

## Bias, Risks, and Limitations

- Data biases: the training dataset is drawn exclusively from Amazon appliance listings. Price distributions are skewed toward mid-range consumer appliances; extreme low-end and high-end appliances are underrepresented.
- Input sensitivity: minor changes in phrasing or additional noisy tokens can shift predictions noticeably.
- Generalization: the model does not understand supply-chain disruptions, seasonality, or promotions; it only captures patterns seen in historical listing data.

### Recommendations

- Always validate model outputs against a small set of ground-truth prices before production deployment (see the sketch below).
- Use this model as an assistant, not an oracle: incorporate downstream business rules and domain expertise.
- Regularly retrain or fine-tune on updated listing data to capture shifting market trends.
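As a starting point for the first recommendation, here is a minimal spot-check sketch using the two metrics reported in the Evaluation section below (RMSLE and Hit@$40). The helper functions and the sample numbers are illustrative placeholders, not part of this repository; substitute prices parsed from real model generations and ground truth from your own catalog.

```python
import math

def rmsle(preds, targets):
    """Root Mean Squared Logarithmic Error over paired price lists."""
    assert len(preds) == len(targets) and len(preds) > 0
    se = [(math.log1p(p) - math.log1p(t)) ** 2 for p, t in zip(preds, targets)]
    return math.sqrt(sum(se) / len(se))

def hit_at_40(preds, targets):
    """Fraction of predictions within +/- $40 of the true price."""
    hits = [abs(p - t) <= 40 for p, t in zip(preds, targets)]
    return sum(hits) / len(hits)

# Hypothetical spot-check data: replace with real model outputs and listings.
predicted = [129.0, 45.0, 310.0]  # e.g. parsed from model generations
actual = [119.0, 52.0, 280.0]     # e.g. ground-truth prices

print(f"RMSLE:   {rmsle(predicted, actual):.2f}")
print(f"Hit@$40: {hit_at_40(predicted, actual):.1%}")
```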
## How to Get Started with the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("Recompense/Midas-pricer")
model = AutoModelForCausalLM.from_pretrained(
    "Recompense/Midas-pricer",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare prompt
product_desc = "How much does this cost to the nearest dollar?\n\nSamsung 7kg top-load washing machine with digital inverter motor"
prompt = f"{product_desc}\n\nPrice is $"

# Tokenize on the model's device and generate a short numeric completion
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generated = model.generate(**inputs, max_new_tokens=3, num_return_sequences=1)

# Decode only the newly generated tokens (the price digits)
new_tokens = generated[0][inputs["input_ids"].shape[1]:]
price_text = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(f"Estimated price: ${price_text}")
```

## Training Details

### Training Data

- Dataset: Recompense/amazon-appliances-lite-data
- Train/validation/test split: 80/10/10

### Training Procedure

#### Training Hyperparameters

- Fine-tuning framework: PyTorch + Hugging Face Accelerate
- Precision: bf16 mixed precision
- Batch size: 1 sequence
- Learning rate: 1e-5 with linear warmup (10% of total steps)
- Optimizer: AdamW

## Evaluation

### Testing Data, Factors & Metrics

- Test set: held-out 10% of listings (≈5,000 examples)
- Metrics:
  - RMSLE (Root Mean Squared Logarithmic Error)
  - Hit@$40: percentage of predictions within ±$40 of the true price

| Metric   | Value |
| -------- | ----- |
| RMSLE    | 0.61  |
| Hit@\$40 | 85.2% |

#### Summary

The model achieves an RMSLE of 0.61, indicating good alignment between predicted and actual prices on a log scale, and estimates the price to within $40 of the true value in over 85% of test cases. This performance is competitive for rapid prototyping in price-sensitive applications.

## Environmental Impact

Approximate fine-tuning emissions, estimated with the ML CO₂ Impact calculator:

- Hardware: Tesla T4
- Duration: 2 hours (0.06 epoch)
- Cloud provider: Google Cloud, region US-Central
- Estimated CO₂ emitted: 6 kg CO₂e

## Technical Specifications

### Model Architecture

- Base model: Llama-3.1-8B-Instruct (8 billion parameters)
- Objective: autoregressive language modeling with instruction tuning

### Compute Infrastructure

- Hardware: 4× Tesla T4 GPUs
- Software:
  - PyTorch 2.x
  - transformers 5.x
  - accelerate 1.x
  - bitsandbytes (optional, for 8-bit quantized inference)

## Glossary

- RMSLE (Root Mean Squared Logarithmic Error): the square root of the mean squared difference between log-transformed predictions and targets; less sensitive to large absolute errors on expensive items.
- Hit@$40: fraction of predictions whose absolute error is ≤ $40.

## Model Card Authors

Damola Jimoh (Recompense)
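## Optional: 8-bit Quantized Inference

The compute stack above lists bitsandbytes for optional 8-bit inference. Below is a minimal loading sketch, not a tested configuration: it assumes `bitsandbytes` and `accelerate` are installed and a CUDA GPU is available; memory savings and any accuracy impact depend on your hardware.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 8-bit quantization config (requires bitsandbytes and a CUDA GPU)
quant_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained("Recompense/Midas-pricer")
model = AutoModelForCausalLM.from_pretrained(
    "Recompense/Midas-pricer",
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers across available devices
)
```

Generation then proceeds exactly as in the bf16 example above; only the loading call changes.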