---
license: mit
library_name: pytorch
tags:
- time-series-forecasting
- supply-chain
- demand-forecasting
- timellm
- llama
- aws-sagemaker
- time-series
base_model: meta-llama/Llama-3.2-3B
pipeline_tag: time-series-forecasting
datasets:
- supply-chain-data
language:
- en
metrics:
- mse
- mae
---

# TimeLLM Supply Chain Demand Forecasting Model


![](./media/projectnexus-arch.png)

This model is a fine-tuned [TimeLLM](https://github.com/KimMeen/Time-LLM) (Time Series Large Language Model) for supply chain demand forecasting, trained on AWS SageMaker. TimeLLM is a reprogramming framework that repurposes LLMs for general time series forecasting while keeping the backbone language models intact.

**🏆 Built for the [GenAI Hackathon by Impetus & AWS](https://impetusawsgenaihackathon.devpost.com/) (TimeLLM Supply Chain Optimization category)**

## Model Details

- **Model Type**: Time Series Forecasting
- **Base Model**: Meta LLaMA 3.2-3B
- **Architecture**: TimeLLM with transformer encoder-decoder
- **Training Platform**: AWS SageMaker
- **Training Hardware**: `ml.g5.12xlarge` (4x NVIDIA A10G GPUs, 48 vCPUs, 192 GB RAM)
- **Inference Hardware**: `ml.g5.xlarge` (1x NVIDIA A10G GPU, 4 vCPUs, 16 GB RAM)
- **Training Duration**: 1114 seconds (~18.5 minutes)
- **Training Status**: Completed Successfully
- **Framework**: PyTorch 2.0.0
- **Model Size**: 2.2 GB

## Training Configuration

| Parameter | Value |
|-----------|-------|
| **Sequence Length** | 96 timesteps |
| **Prediction Length** | 96 timesteps |
| **Label Length** | 48 timesteps |
| **Features** | 14 supply chain features |
| **Model Dimensions** | d_model=16, d_ff=32, n_heads=8 |
| **Architecture** | e_layers=2, d_layers=1, factor=3 |
| **Patch Configuration** | patch_len=16, stride=8 |
| **Epochs** | 10 (with early stopping) |
| **Batch Size** | 32 |
| **Learning Rate** | 0.0001 |
| **Optimization** | DeepSpeed ZeRO Stage 2, Mixed Precision |
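
For reference, the table above maps onto a plain Python configuration. This is a minimal sketch: the argument names follow the public TimeLLM codebase (`seq_len`, `pred_len`, `d_model`, and so on) and are assumptions; check `train_supply_chain_complete.py` for the exact flags it accepts.

```python
# Sketch of the training configuration above. Argument names mirror the public
# TimeLLM codebase and may differ from the actual training script; treat this
# as illustrative rather than authoritative.
training_config = {
    "seq_len": 96,         # input window (timesteps)
    "label_len": 48,       # decoder warm-up length
    "pred_len": 96,        # forecast horizon (timesteps)
    "enc_in": 14,          # 14 supply chain features
    "d_model": 16,
    "d_ff": 32,
    "n_heads": 8,
    "e_layers": 2,
    "d_layers": 1,
    "factor": 3,
    "patch_len": 16,
    "stride": 8,
    "train_epochs": 10,    # with early stopping
    "batch_size": 32,
    "learning_rate": 1e-4,
    "llm_model": "LLAMA",  # Meta LLaMA 3.2-3B backbone
    "use_amp": True,       # mixed precision (FP16)
}
```
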
## Supply Chain Features

The model forecasts demand using 14 key supply chain features:

| Feature Category | Features |
|------------------|----------|
| **Sales Metrics** | Quantity, Line Total, Unit Price |
| **Promotions** | Discount Percentage, Promotion Indicators, Promo Discount |
| **Returns** | Return Quantity, Return Rate |
| **Inventory** | Stock Status (Stockout, Low Stock), Stock Coverage |
| **Temporal** | Day of Week, Month, Quarter |

## Use Cases

- **🎯 Demand Forecasting**: Predict future product demand patterns
- **📦 Inventory Planning**: Optimize stock levels and procurement
- **📈 Sales Prediction**: Forecast sales across multiple time horizons
- **🔄 Supply Chain Optimization**: Handle complex temporal dependencies

## Quick Start

### Prerequisites

**⚠️ Important**: This model requires access to Meta LLaMA 3.2-3B, which is a gated model.

1. **Request Access**: Visit [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) and request access
2. **Generate Token**: Create a HuggingFace token with "Read" permissions
3. **Set Environment**: `export HF_TOKEN="hf_your_token_here"`

### Installation

```bash
# Clone the repository
git clone https://github.com/youneslaaroussi/project-nexus
cd project-nexus/ml

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r TimeLLM/requirements.txt
```

### Using the Model

```python
from modeling_timellm import TimeLLMForecaster
import numpy as np

# Initialize the forecaster
forecaster = TimeLLMForecaster(
    model_path="model.pth",
    config_path="config.json"
)

# Prepare your data (96 timesteps, 14 features)
historical_data = np.random.randn(96, 14)  # Replace with your actual data
time_features = np.random.randn(96, 3)     # month, day, weekday

# Generate forecast
forecast = forecaster.forecast(historical_data, time_features)
print(f"Forecast shape: {forecast.shape}")  # (96, 14)
```
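
The random arrays above are placeholders. The sketch below shows one way to build the `(96, 14)` and `(96, 3)` inputs from a pandas DataFrame; the `date` column and feature column names are assumptions based on the feature order in the Data Format section further down, so adjust them to match your dataset.

```python
import numpy as np
import pandas as pd

# Assumed column names, following the feature order in the Data Format section.
FEATURE_COLUMNS = [
    "quantity", "line_total", "unit_price", "discount_percent",
    "is_promotion", "promo_discount", "return_quantity", "return_rate",
    "is_stockout", "is_low_stock", "stock_coverage",
    "day_of_week", "month", "quarter",
]

def build_model_inputs(df: pd.DataFrame, seq_len: int = 96):
    """Take the most recent `seq_len` rows and return (x_enc, x_mark_enc)."""
    window = df.sort_values("date").tail(seq_len)
    if len(window) < seq_len:
        raise ValueError(f"Need at least {seq_len} rows, got {len(window)}")

    # (96, 14) feature matrix in the order expected by the model
    x_enc = window[FEATURE_COLUMNS].to_numpy(dtype=np.float32)

    # (96, 3) calendar marks: month, day, weekday
    dates = pd.to_datetime(window["date"])
    x_mark_enc = np.stack(
        [dates.dt.month, dates.dt.day, dates.dt.weekday], axis=1
    ).astype(np.float32)
    return x_enc, x_mark_enc
```

The resulting arrays can then be passed to `forecaster.forecast(x_enc, x_mark_enc)` exactly as in the snippet above.
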
## Training from Scratch

### 1. Data Preparation

```bash
# Generate synthetic ERP data
cd data_schema
python generate_data.py

# Transform to time series format
cd ../data_preprocessing
python erp_to_timeseries.py
```

### 2. Configure AWS Environment

```bash
# Configure AWS credentials
aws configure

# Set environment variables
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_DEFAULT_REGION=us-east-1
export HF_TOKEN="hf_your_huggingface_token"
```

### 3. Launch Training on SageMaker

```bash
cd sagemaker_deployment

# Train the model (uses ml.g5.12xlarge)
python launch_sagemaker_training.py --model-name Demand_Forecasting

# Monitor training progress
aws sagemaker describe-training-job --training-job-name TimeLLM-training-Demand-Forecasting-YYYY-MM-DD-HH-MM-SS
```

### 4. Deploy for Inference

```bash
# Deploy endpoint (uses ml.g5.xlarge)
python deploy_endpoint.py

# Test the endpoint
python test_inference.py
```

## Docker Deployment

### Build Container

```bash
cd sagemaker_deployment

# Build the inference container
docker build -t timellm-inference:latest --build-arg HF_TOKEN=hf_your_token .

# Tag for ECR
docker tag timellm-inference:latest {account-id}.dkr.ecr.us-east-1.amazonaws.com/timellm-inference:latest

# Push to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin {account-id}.dkr.ecr.us-east-1.amazonaws.com
docker push {account-id}.dkr.ecr.us-east-1.amazonaws.com/timellm-inference:latest
```

### Dockerfile Structure

```dockerfile
FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:2.0.0-gpu-py310

# Install dependencies
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt

# Copy model artifacts
COPY model.tar.gz /opt/ml/model/
COPY llm_weights /opt/llm_weights

# Set up inference handler
COPY inference.py /opt/ml/model/code/
```

## Performance Optimization

### Hardware Specifications

| **Component** | **Training (ml.g5.12xlarge)** | **Inference (ml.g5.xlarge)** |
|---------------|-------------------------------|------------------------------|
| **GPUs** | 4x NVIDIA A10G (24 GB each) | 1x NVIDIA A10G (24 GB) |
| **vCPUs** | 48 | 4 |
| **Memory** | 192 GB | 16 GB |
| **Network** | 40 Gbps | Up to 10 Gbps |
| **Cost** | ~$7.09/hour | ~$0.526/hour |

### Optimization Techniques

- **🚀 DeepSpeed ZeRO Stage 2**: Reduces memory usage by 50-70%
- **⚡ Mixed Precision (FP16)**: Faster training with maintained accuracy
- **🔄 Gradient Accumulation**: Simulates larger batch sizes
- **📊 Distributed Training**: Multi-GPU acceleration with HuggingFace Accelerate

### Cost Analysis

| **Operation** | **Cost** | **Duration** |
|---------------|----------|--------------|
| **Training** | ~$2.13 | ~18.5 minutes |
| **Inference** | ~$0.526/hour | Continuous |
| **Storage (S3)** | ~$0.023/GB/month | Model artifacts |

## Data Format

### Input Format

```python
{
    "x_enc": [
        [               # Timestep 1
            100,        # quantity
            1000.0,     # line_total
            10.0,       # unit_price
            0.05,       # discount_percent
            0,          # is_promotion
            0.0,        # promo_discount
            2,          # return_quantity
            0.02,       # return_rate
            0,          # is_stockout
            0,          # is_low_stock
            30,         # stock_coverage
            0,          # day_of_week
            1,          # month
            1           # quarter
        ],
        # ... 95 more timesteps
    ],
    "x_mark_enc": [
        [1, 1, 0],      # month, day, weekday for timestep 1
        # ... 95 more timesteps
    ]
}
```

### Output Format

```python
{
    "predictions": [
        [               # Predicted timestep 1
            105,        # forecasted quantity
            1050.0,     # forecasted line_total
            # ... 12 more forecasted features
        ],
        # ... 95 more predicted timesteps
    ]
}
```
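
Once the endpoint from step 4 is live, a payload in the input format above can be sent with `boto3`. This is a minimal sketch assuming the endpoint name used in the Endpoint Configuration section below and a JSON request/response contract matching the formats above; the actual `inference.py` handler may expect slightly different keys.

```python
import json

import boto3
import numpy as np

# Assumes the endpoint name from the Endpoint Configuration section below;
# adjust the name and region if you deployed differently.
ENDPOINT_NAME = "timellm-demand-forecast-endpoint"

runtime = boto3.client("sagemaker-runtime", region_name="us-east-1")

# Build a request matching the Input Format above (96 timesteps, 14 features).
x_enc = np.random.randn(96, 14).tolist()       # replace with real history
x_mark_enc = np.random.randn(96, 3).tolist()   # month, day, weekday marks

response = runtime.invoke_endpoint(
    EndpointName=ENDPOINT_NAME,
    ContentType="application/json",
    Body=json.dumps({"x_enc": x_enc, "x_mark_enc": x_mark_enc}),
)

result = json.loads(response["Body"].read())
predictions = np.array(result["predictions"])  # expected shape: (96, 14)
print(predictions.shape)
```
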
## AWS SageMaker Integration

### Training Job Configuration

```python
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point='train_supply_chain_complete.py',
    source_dir='../TimeLLM',
    role=sagemaker_role,
    instance_type='ml.g5.12xlarge',
    instance_count=1,
    framework_version='2.0.0',
    py_version='py310',
    hyperparameters={
        'model_name': 'Demand_Forecasting',
        'root_path': '/opt/ml/input/data/training'
    }
)
```

### Endpoint Configuration

```python
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data=model_artifacts_uri,
    role=sagemaker_role,
    entry_point='inference.py',
    framework_version='2.0.0',
    py_version='py310'
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.g5.xlarge',
    endpoint_name='timellm-demand-forecast-endpoint'
)
```

## Monitoring and Logging

### CloudWatch Integration

- **Training Logs**: `/aws/sagemaker/TrainingJobs/{job-name}`
- **Endpoint Logs**: `/aws/sagemaker/Endpoints/{endpoint-name}`
- **Custom Metrics**: Model performance, latency, error rates

### Performance Metrics

| **Metric** | **Description** |
|------------|-----------------|
| **MAE** | Mean Absolute Error |
| **MSE** | Mean Squared Error |
| **MAPE** | Mean Absolute Percentage Error |
| **Latency** | Inference response time (~2-3 seconds) |
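
The error metrics above follow their standard definitions. A minimal NumPy sketch for computing them against held-out data is shown below; the small epsilon in MAPE, which avoids division by zero on zero-demand timesteps, is an implementation choice here and not part of the original training code.

```python
import numpy as np

def forecast_metrics(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-8) -> dict:
    """Compute MAE, MSE and MAPE over arrays of shape (timesteps, features)."""
    err = y_pred - y_true
    return {
        "mae": float(np.mean(np.abs(err))),
        "mse": float(np.mean(err ** 2)),
        # eps guards against division by zero when actual demand is 0
        "mape": float(np.mean(np.abs(err) / (np.abs(y_true) + eps)) * 100.0),
    }

# Example with the (96, 14) shapes used by this model
y_true = np.random.rand(96, 14)
y_pred = y_true + np.random.normal(scale=0.05, size=(96, 14))
print(forecast_metrics(y_true, y_pred))
```
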
## TimeLLM Framework

This implementation is based on the TimeLLM framework, which introduces:

1. **🔄 Reprogramming**: Converts time series into text prototype representations
2. **💬 Prompt Augmentation**: Uses declarative prompts to supply domain knowledge
3. **🦙 LLM Backbone**: Leverages pre-trained language models for forecasting

### Key Modifications

- **Supply Chain Prompts**: Domain-specific prompts for demand forecasting
- **HuggingFace Integration**: Seamless model loading and tokenization
- **AWS Optimization**: SageMaker-specific inference handlers
- **Performance Tuning**: DeepSpeed and mixed precision support

## Model Variants

| **Model** | **Purpose** | **Use Case** |
|-----------|-------------|--------------|
| **Demand Forecasting** | Predict future product demand | Inventory planning, procurement |
| **Product Forecasting** | Product-specific metrics | Product lifecycle management |
| **Category Forecasting** | Electronics category sales | Category management, marketing |
| **KPI Forecasting** | Key performance indicators | Executive dashboards, strategic planning |

## Troubleshooting

### Common Issues

1. **HuggingFace Access Denied**

   ```bash
   # Verify token access
   python -c "from transformers import AutoTokenizer; AutoTokenizer.from_pretrained('meta-llama/Llama-3.2-3B')"
   ```

2. **Training Job Fails**

   ```bash
   # Check CloudWatch logs
   aws logs describe-log-groups --log-group-name-prefix "/aws/sagemaker/TrainingJobs"
   ```

3. **Endpoint Timeout**

   ```bash
   # Check endpoint status
   aws sagemaker describe-endpoint --endpoint-name timellm-demand-forecast-endpoint
   ```

## Citations

### This Model

```bibtex
@misc{projectnexus-timellm-2025,
  title={TimeLLM Supply Chain Demand Forecasting},
  author={Younes Laaroussi},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/youneslaaroussi/projectnexus},
  note={Trained on AWS SageMaker ml.g5.12xlarge using TimeLLM framework}
}
```

### TimeLLM Framework

```bibtex
@inproceedings{jin2023time,
  title={{Time-LLM}: Time series forecasting by reprogramming large language models},
  author={Jin, Ming and Wang, Shiyu and Ma, Lintao and Chu, Zhixuan and Zhang, James Y and Shi, Xiaoming and Chen, Pin-Yu and Liang, Yuxuan and Li, Yuan-Fang and Pan, Shirui and Wen, Qingsong},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2024}
}
```

## License

This model is released under the MIT License, consistent with the TimeLLM framework.

## Acknowledgments

- **[TimeLLM](https://github.com/KimMeen/Time-LLM)** for the foundational framework
- **[AWS SageMaker](https://aws.amazon.com/sagemaker/)** for the training infrastructure
- **[Meta LLaMA](https://huggingface.co/meta-llama/Llama-3.2-3B)** for the base model
- **[HuggingFace](https://huggingface.co/)** for model hosting and the transformers library
- **[DeepSpeed](https://github.com/microsoft/DeepSpeed)** for optimization techniques