# Deployment Guide

This guide covers deploying the AI API Service to various platforms.

## Table of Contents

- [Local Development](#local-development)
- [Docker Deployment](#docker-deployment)
- [Encore Cloud](#encore-cloud)
- [Hugging Face Spaces](#hugging-face-spaces)
- [AWS Deployment](#aws-deployment)
- [Google Cloud Platform](#google-cloud-platform)
- [Azure Deployment](#azure-deployment)
- [Environment Variables](#environment-variables)

## Local Development

### Prerequisites

- Node.js 18+
- npm or yarn
- Encore CLI

### Steps

1. **Install Encore CLI**

   ```bash
   npm install -g encore
   ```

2. **Install dependencies**

   ```bash
   npm install
   ```

3. **Configure environment**

   ```bash
   cp .env.example .env
   # Edit .env with your API keys
   ```

4. **Run the development server**

   ```bash
   encore run
   ```

The API will be available at `http://localhost:8000`.

## Docker Deployment

### Build and Run Locally

```bash
docker-compose up -d
```

This starts:

- The API service on port 8000
- Redis for caching (optional)

### Build Production Image

```bash
docker build -t ai-api-service:latest .
```

### Run Production Container

```bash
docker run -d \
  -p 8000:8000 \
  -e OPENAI_API_KEY=your_key \
  -e API_KEYS=your_api_keys \
  --name ai-api \
  ai-api-service:latest
```

## Encore Cloud

Encore Cloud provides the easiest deployment experience, with automatic infrastructure provisioning.

### Steps

1. **Install Encore CLI**

   ```bash
   npm install -g encore
   ```

2. **Log in to Encore**

   ```bash
   encore auth login
   ```

3. **Create the app (first time only)**

   ```bash
   encore app create ai-api-service
   ```

4. **Set secrets**

   ```bash
   encore secret set OPENAI_API_KEY
   encore secret set HUGGINGFACE_API_KEY
   encore secret set PINECONE_API_KEY
   ```

5. **Deploy**

   ```bash
   encore deploy
   ```

Your API will be deployed with:

- Auto-scaling
- Load balancing
- SSL/TLS certificates
- Monitoring and logs
- Database backups

## Hugging Face Spaces

Deploy as a Docker Space on Hugging Face for easy sharing.

### Steps

1. **Create a new Space**

   - Go to https://huggingface.co/new-space
   - Select "Docker" as the SDK
   - Choose a hardware tier (CPU or GPU)

2. **Clone the Space repository**

   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE
   cd YOUR_SPACE
   ```

3. **Copy project files**

   ```bash
   cp -r /path/to/ai-api-service/* .
   ```

4. **Create a Dockerfile for HF Spaces**

   ```dockerfile
   FROM node:18-alpine
   WORKDIR /app
   COPY package*.json ./
   RUN npm ci --omit=dev
   COPY . .
   ENV PORT=7860
   EXPOSE 7860
   CMD ["npm", "start"]
   ```

5. **Configure secrets in the Space settings**

   - `OPENAI_API_KEY`
   - `HUGGINGFACE_API_KEY`
   - `API_KEYS`

6. **Push to the Space**

   ```bash
   git add .
   git commit -m "Initial deployment"
   git push
   ```

## AWS Deployment

### Using AWS ECS (Elastic Container Service)

1. **Push the image to ECR**

   ```bash
   aws ecr create-repository --repository-name ai-api-service
   docker build -t ai-api-service .
   aws ecr get-login-password --region us-east-1 | \
     docker login --username AWS --password-stdin \
     YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com
   docker tag ai-api-service:latest \
     YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/ai-api-service:latest
   docker push YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/ai-api-service:latest
   ```

2. **Create an ECS task definition**

   ```json
   {
     "family": "ai-api-service",
     "networkMode": "awsvpc",
     "requiresCompatibilities": ["FARGATE"],
     "cpu": "1024",
     "memory": "2048",
     "executionRoleArn": "arn:aws:iam::ACCOUNT:role/ecsTaskExecutionRole",
     "containerDefinitions": [{
       "name": "ai-api",
       "image": "YOUR_ACCOUNT.dkr.ecr.us-east-1.amazonaws.com/ai-api-service:latest",
       "portMappings": [{
         "containerPort": 8000,
         "protocol": "tcp"
       }],
       "environment": [],
       "secrets": [{
         "name": "OPENAI_API_KEY",
         "valueFrom": "arn:aws:secretsmanager:us-east-1:ACCOUNT:secret:openai-api-key"
       }]
     }]
   }
   ```

   Note: Fargate tasks that pull from ECR or read Secrets Manager require an `executionRoleArn`; `ecsTaskExecutionRole` is the conventional name for that role.

3. **Create an ECS service with an ALB**

   - Configure an Application Load Balancer
   - Set up a target group (port 8000)
   - Configure auto-scaling
   - Add health checks

### Using AWS Lambda (API Gateway)

For serverless deployment, wrap the endpoints with AWS Lambda handlers.
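As a minimal sketch of that wrapping (the event and result shapes follow API Gateway's proxy-integration contract; the routing shown here is hypothetical):

```typescript
// Trimmed-down API Gateway proxy types; a real project would import
// APIGatewayProxyEvent/APIGatewayProxyResult from @types/aws-lambda.
interface ProxyEvent {
  httpMethod: string;
  path: string;
}

interface ProxyResult {
  statusCode: number;
  headers: Record<string, string>;
  body: string;
}

// Lambda entry point: route the proxied request to the service logic.
export async function handler(event: ProxyEvent): Promise<ProxyResult> {
  if (event.httpMethod === "GET" && event.path === "/health") {
    return {
      statusCode: 200,
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ status: "healthy", version: "1.0.0" }),
    };
  }
  return {
    statusCode: 404,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ error: "Not found" }),
  };
}
```

Rather than writing one handler per endpoint, an adapter such as serverless-http can wrap an existing Node HTTP app behind a single Lambda entry point.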
## Google Cloud Platform

### Using Cloud Run

1. **Build and deploy**

   ```bash
   gcloud builds submit --tag gcr.io/PROJECT_ID/ai-api-service
   gcloud run deploy ai-api-service \
     --image gcr.io/PROJECT_ID/ai-api-service \
     --platform managed \
     --region us-central1 \
     --allow-unauthenticated \
     --set-env-vars OPENAI_API_KEY=your_key
   ```

2. **Configure secrets**

   ```bash
   echo -n "your_openai_key" | \
     gcloud secrets create openai-api-key --data-file=-
   gcloud run services update ai-api-service \
     --update-secrets OPENAI_API_KEY=openai-api-key:latest
   ```

### Using GKE (Kubernetes)

1. **Create a cluster**

   ```bash
   gcloud container clusters create ai-api-cluster \
     --num-nodes=3 \
     --machine-type=n1-standard-2
   ```

2. **Deploy the application**

   ```bash
   kubectl apply -f k8s/deployment.yaml
   kubectl apply -f k8s/service.yaml
   kubectl apply -f k8s/ingress.yaml
   ```

## Azure Deployment

### Using Azure Container Instances

```bash
az container create \
  --resource-group ai-api-rg \
  --name ai-api-service \
  --image your-registry.azurecr.io/ai-api-service:latest \
  --cpu 2 \
  --memory 4 \
  --ports 8000 \
  --environment-variables \
    PORT=8000 \
  --secure-environment-variables \
    OPENAI_API_KEY=your_key \
    API_KEYS=demo-key-1
```

### Using Azure App Service

1. **Create an App Service plan**

   ```bash
   az appservice plan create \
     --name ai-api-plan \
     --resource-group ai-api-rg \
     --is-linux \
     --sku B1
   ```

2. **Create the Web App**

   ```bash
   az webapp create \
     --resource-group ai-api-rg \
     --plan ai-api-plan \
     --name ai-api-service \
     --deployment-container-image-name your-registry.azurecr.io/ai-api-service:latest
   ```

3. **Configure settings**

   ```bash
   az webapp config appsettings set \
     --resource-group ai-api-rg \
     --name ai-api-service \
     --settings \
       OPENAI_API_KEY=@Microsoft.KeyVault(SecretUri=...)
   ```
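Key Vault references like the one above only resolve if the Web App has a managed identity with read access to the vault. A sketch, with a placeholder vault name:

```bash
# Assign a system-managed identity to the Web App
az webapp identity assign \
  --resource-group ai-api-rg \
  --name ai-api-service

# Grant that identity permission to read secrets from the vault
az keyvault set-policy \
  --name YOUR_VAULT_NAME \
  --object-id "$(az webapp identity show \
      --resource-group ai-api-rg \
      --name ai-api-service \
      --query principalId -o tsv)" \
  --secret-permissions get
```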
## Environment Variables

### Required Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `API_KEYS` | Comma-separated API keys | `key1,key2,key3` |
| `OPENAI_API_KEY` | OpenAI API key (or an alternative provider key) | `sk-...` |

### Optional Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `HUGGINGFACE_API_KEY` | Hugging Face API key | - |
| `ANTHROPIC_API_KEY` | Anthropic API key | - |
| `PINECONE_API_KEY` | Pinecone vector DB key | - |
| `RATE_LIMIT_DEFAULT` | Requests/min for the default tier | `60` |
| `RATE_LIMIT_ADMIN` | Requests/min for the admin tier | `1000` |
| `LOG_LEVEL` | Logging level | `info` |
| `MAX_FILE_SIZE_MB` | Max upload size in MB | `10` |

### Setting Secrets

**Encore Cloud:**

```bash
encore secret set OPENAI_API_KEY
```

**Docker:**

```bash
docker run -e OPENAI_API_KEY=your_key ...
```

**Kubernetes:**

```bash
kubectl create secret generic api-secrets \
  --from-literal=OPENAI_API_KEY=your_key
```

**AWS Secrets Manager:**

```bash
aws secretsmanager create-secret \
  --name openai-api-key \
  --secret-string your_key
```

## Monitoring

### Health Checks

Configure the health check endpoint:

```
GET /health
```

Expected response:

```json
{
  "status": "healthy",
  "version": "1.0.0",
  "services": [...]
}
```

### Metrics

Access metrics at:

```
GET /metrics
```

### Logging

Logs are output as structured JSON:

```json
{
  "timestamp": "2025-10-01T12:00:00Z",
  "level": "info",
  "message": "Request processed",
  "duration_ms": 245
}
```

## Scaling Recommendations

### Horizontal Scaling

- Start with 2-3 replicas
- Auto-scale based on CPU (70% threshold)
- Use a load balancer for distribution

### Vertical Scaling

- Minimum: 1 CPU, 2 GB RAM
- Recommended: 2 CPU, 4 GB RAM
- High traffic: 4 CPU, 8 GB RAM

### Database Scaling

- Use Pinecone for production vector storage
- Implement Redis for caching
- Consider read replicas for high traffic

## Troubleshooting

### Common Issues

**"No LLM adapter available"**

- Check that at least one provider API key is set (OpenAI, Hugging Face, or Anthropic)

**"Rate limit exceeded"**

- Increase the rate limits via the environment variables
- Use an admin API key for testing

**"Vector DB connection failed"**

- The service falls back to in-memory storage
- Check your Pinecone credentials

**High latency**

- Enable caching (Redis)
- Use a region closer to the upstream APIs
- Optimize model selection

## Support

For deployment assistance:

- GitHub Issues
- Documentation at `docs/`
- Community Discord
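The horizontal-scaling guidance above (start at 2-3 replicas, scale on 70% CPU) can be sketched as a Kubernetes HorizontalPodAutoscaler for the GKE setup; the deployment name and `maxReplicas` value below are assumptions:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-api-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-api-service   # assumed name from k8s/deployment.yaml
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```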