# Quick Start Guide

Get your AI API Service up and running in 5 minutes!

## Prerequisites

- Node.js 18+
- npm or yarn
- At least one LLM API key (OpenAI, HuggingFace, or Anthropic)

## 5-Minute Setup

### 1. Install Dependencies

```bash
npm install
```

### 2. Configure Environment

```bash
cp .env.example .env
```

Edit `.env` and add your API keys:

```env
OPENAI_API_KEY=sk-your-openai-key
API_KEYS=demo-key-1,my-secret-key
```

### 3. Start the Server

```bash
npm run dev
```

The API will be available at `http://localhost:8000`.

### 4. Test the API

```bash
curl http://localhost:8000/health
```

Expected response:
```json
{
  "status": "healthy",
  "version": "1.0.0",
  "services": [...],
  "uptime_seconds": 5
}
```

### 5. Make Your First Request

```bash
curl -X POST http://localhost:8000/ai/chat \
  -H "Authorization: Bearer demo-key-1" \
  -H "Content-Type: application/json" \
  -d '{
    "conversation": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

## Example Requests

### Chat
```bash
curl -X POST http://localhost:8000/ai/chat \
  -H "Authorization: Bearer demo-key-1" \
  -H "Content-Type: application/json" \
  -d '{"conversation": [{"role": "user", "content": "What is AI?"}]}'
```

### RAG Query
```bash
curl -X POST http://localhost:8000/rag/query \
  -H "Authorization: Bearer demo-key-1" \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the key features?", "top_k": 5}'
```

### Image Generation
```bash
curl -X POST http://localhost:8000/image/generate \
  -H "Authorization: Bearer demo-key-1" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A sunset over mountains", "size": "1024x1024"}'
```

## What Each Component Does

### πŸ” **Authentication (`/backend/utils/auth.ts`)**
- Validates API keys from the Authorization header
- Implements role-based access (default, premium, admin)
- Used by all protected endpoints
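
The validation described above can be sketched roughly as follows. This is a minimal illustration, not the actual `auth.ts` implementation; the function names and the mapping of `API_KEYS`/`ADMIN_API_KEYS` to roles are assumptions based on the environment variables documented below.

```typescript
// Hypothetical sketch: build a key→role map from comma-separated env lists,
// then validate the Authorization header against it.
type Role = "default" | "premium" | "admin";

function buildKeyStore(apiKeys: string, adminKeys: string): Map<string, Role> {
  const store = new Map<string, Role>();
  for (const k of apiKeys.split(",").map((s) => s.trim()).filter(Boolean)) {
    store.set(k, "default");
  }
  for (const k of adminKeys.split(",").map((s) => s.trim()).filter(Boolean)) {
    store.set(k, "admin"); // admin keys override the default role
  }
  return store;
}

function authenticate(
  header: string | undefined,
  store: Map<string, Role>,
): Role | null {
  if (!header?.startsWith("Bearer ")) return null; // wrong or missing scheme
  const key = header.slice("Bearer ".length).trim();
  return store.get(key) ?? null; // null = unknown key
}
```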

### ⚑ **Rate Limiting (`/backend/utils/rate_limit.ts`)**
- Token bucket algorithm
- Configurable limits per tier (60/300/1000 requests/min)
- Automatic reset after 1 minute
- Prevents abuse and cost overruns
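
A minimal sketch of the limiter, assuming (per the "automatic reset after 1 minute" note above) that each key's bucket refills fully at the start of each window rather than continuously; class and method names are illustrative, not the real `rate_limit.ts` API.

```typescript
// Per-key bucket: `limit` tokens per 60-second window, refilled on window rollover.
interface Bucket {
  tokens: number;
  windowStart: number;
}

class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();

  constructor(private limit: number, private windowMs = 60_000) {}

  /** Returns true if the request is allowed, false if the key is rate-limited. */
  tryConsume(key: string, now = Date.now()): boolean {
    let b = this.buckets.get(key);
    if (!b || now - b.windowStart >= this.windowMs) {
      b = { tokens: this.limit, windowStart: now }; // reset after 1 minute
      this.buckets.set(key, b);
    }
    if (b.tokens <= 0) return false;
    b.tokens -= 1;
    return true;
  }
}
```

In this scheme the tier limits (60/300/1000) would simply be different `limit` values chosen per key's role.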

### πŸ€– **AI Service (`/backend/services/ai_service.ts`)**
- Multi-provider LLM routing (OpenAI, HuggingFace, Anthropic)
- Automatic model selection and fallback
- Chat completions with context management
- Embedding generation for RAG

### πŸ“š **RAG Service (`/backend/services/rag_service.ts`)**
- Vector-based document retrieval
- Automatic context injection into prompts
- Supports Pinecone or in-memory vector DB
- Returns sources with similarity scores
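
"Context injection" here typically means prepending the retrieved chunks to the prompt before calling the LLM. A hedged sketch of that step (the chunk shape and prompt template are assumptions, not the service's actual format):

```typescript
// Hypothetical retrieved-chunk shape; real metadata may differ.
interface RetrievedChunk {
  text: string;
  source: string;
  score: number; // similarity score, higher = more relevant
}

// Keep the top_k highest-scoring chunks and inject them ahead of the question.
function buildRagPrompt(query: string, chunks: RetrievedChunk[], topK = 5): string {
  const context = chunks
    .slice()
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`)
    .join("\n");
  return `Answer using only the context below.\n\nContext:\n${context}\n\nQuestion: ${query}`;
}
```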

### πŸ–ΌοΈ **Image Service (`/backend/services/image_service.ts`)**
- Text-to-image generation
- Supports DALL-E and Stable Diffusion
- Configurable sizes and quality
- Returns base64 or URLs

### πŸŽ™οΈ **Voice Service (`/backend/services/voice_service.ts`)**
- Text-to-speech synthesis (TTS)
- Speech-to-text transcription (STT)
- Multiple voice options
- Various audio formats (mp3, opus, etc.)

### πŸ“„ **Document Service (`/backend/services/document_service.ts`)**
- Upload PDF, DOCX, TXT files
- Automatic text extraction
- Chunking with overlap for better retrieval
- Background processing with workers
- Stores chunks in vector DB
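
The chunking-with-overlap step can be sketched as below, assuming character-based chunks of `CHUNK_SIZE` with a fixed overlapping tail (the overlap size is an illustrative parameter; the service may chunk by tokens instead).

```typescript
// Split text into fixed-size chunks where each chunk repeats the last
// `overlap` characters of the previous one, so retrieval doesn't lose
// sentences that straddle a chunk boundary.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```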

### πŸ”Œ **Adapters**

#### **OpenAI Adapter (`/backend/adapters/openai_adapter.ts`)**
- Chat completions (GPT-4, GPT-3.5)
- Embeddings (text-embedding-ada-002)
- Image generation (DALL-E)
- Voice synthesis and transcription
- Implements LLMAdapter, ImageAdapter, VoiceAdapter interfaces

#### **HuggingFace Adapter (`/backend/adapters/huggingface_adapter.ts`)**
- Open-source models (Mistral, Llama, etc.)
- Stable Diffusion for images
- Sentence transformers for embeddings
- Free tier available

#### **Anthropic Adapter (`/backend/adapters/anthropic_adapter.ts`)**
- Claude models (Sonnet, Opus)
- Advanced reasoning capabilities
- Long context windows

#### **Vector DB Adapters (`/backend/adapters/vector_db_adapter.ts`)**
- **PineconeAdapter**: Production vector storage with managed scaling
- **InMemoryVectorDB**: Development fallback with cosine similarity
- Supports metadata filtering and batch operations
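
The in-memory fallback mentioned above amounts to a brute-force cosine-similarity scan. A self-contained sketch (class and method names are illustrative, not the adapter's real interface):

```typescript
interface VectorRecord {
  id: string;
  vector: number[];
  metadata?: Record<string, unknown>;
}

// Cosine similarity: dot product of the vectors divided by the product
// of their magnitudes; 1 = same direction, 0 = orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB) || 1); // guard zero vectors
}

class InMemoryVectorStore {
  private records: VectorRecord[] = [];

  upsert(record: VectorRecord): void {
    this.records = this.records.filter((r) => r.id !== record.id);
    this.records.push(record);
  }

  // Brute-force scan: score every record, return the topK best matches.
  query(vector: number[], topK: number): Array<{ id: string; score: number }> {
    return this.records
      .map((r) => ({ id: r.id, score: cosineSimilarity(vector, r.vector) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, topK);
  }
}
```

This is fine for development; a managed index like Pinecone replaces the linear scan with approximate nearest-neighbor search at scale.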

### πŸ“Š **Observability**

#### **Logger (`/backend/utils/logger.ts`)**
- Structured JSON logging
- Configurable log levels (debug, info, warn, error)
- Automatic timestamping
- Production-ready format
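
A structured JSON log line with level filtering might look like the following sketch (the field names and function signature are assumptions, not the real `logger.ts` API):

```typescript
// Numeric ranks let us filter: anything below the configured minimum is dropped.
const LEVELS = { debug: 0, info: 1, warn: 2, error: 3 } as const;
type Level = keyof typeof LEVELS;

// Returns a single JSON log line, or null if the message is below minLevel.
function formatLogLine(
  minLevel: Level,
  level: Level,
  message: string,
  fields: Record<string, unknown> = {},
  now = new Date(),
): string | null {
  if (LEVELS[level] < LEVELS[minLevel]) return null;
  return JSON.stringify({ timestamp: now.toISOString(), level, message, ...fields });
}
```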

#### **Metrics (`/backend/utils/metrics.ts`)**
- Request counting by endpoint
- Error tracking
- Response time measurement
- Model usage statistics
- Vector DB query counts
- Document processing stats
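
Request counting and response-time measurement reduce to a pair of per-endpoint counters; a hedged sketch (not the real `metrics.ts` interface):

```typescript
// Per-endpoint request count plus a running total of response times,
// from which an average latency can be derived.
class Metrics {
  private counts = new Map<string, number>();
  private totalMs = new Map<string, number>();

  record(endpoint: string, durationMs: number): void {
    this.counts.set(endpoint, (this.counts.get(endpoint) ?? 0) + 1);
    this.totalMs.set(endpoint, (this.totalMs.get(endpoint) ?? 0) + durationMs);
  }

  count(endpoint: string): number {
    return this.counts.get(endpoint) ?? 0;
  }

  avgMs(endpoint: string): number {
    const n = this.count(endpoint);
    return n === 0 ? 0 : (this.totalMs.get(endpoint) ?? 0) / n;
  }
}
```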

### πŸ”„ **Background Workers (`/backend/workers/ingestion_worker.ts`)**
- Async document processing
- Configurable concurrency
- Job status tracking
- Webhook notifications on completion
- Automatic retries on failure
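
The retry behavior can be sketched as a bounded-attempts wrapper (shown synchronously for clarity; the real worker is async and the function name is hypothetical):

```typescript
// Run `fn` up to maxAttempts times; rethrow the last error if all attempts fail.
function withRetries<T>(fn: () => T, maxAttempts = 3): T {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return fn();
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError;
}
```

A production worker would typically add a backoff delay between attempts and record the job status after each failure.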

### 🌐 **API Endpoints**

All endpoints are in `/backend/api/`:

#### **Health & Metrics (`health.ts`)**
- `GET /health` - Service health with component status
- `GET /metrics` - Usage metrics and statistics

#### **Authentication (`auth.ts`)**
- `POST /auth/verify` - Validate API key

#### **Chat (`chat.ts`)**
- `POST /ai/chat` - Multi-turn conversation
- `GET /ai/query` - Simple Q&A

#### **RAG (`rag.ts`)**
- `POST /rag/query` - Query with retrieval
- `GET /rag/models` - List available models

#### **Images (`image.ts`)**
- `POST /image/generate` - Generate images

#### **Voice (`voice.ts`)**
- `POST /voice/synthesize` - Text to speech
- `POST /voice/transcribe` - Speech to text

#### **Documents (`documents.ts`)**
- `POST /upload` - Upload document
- `GET /docs/:id/sources` - Get document chunks
- `POST /webhook/events` - Processing webhooks

## Architecture Flow

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Client  β”‚
β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜
     β”‚
     β”œβ”€ Authorization Header (Bearer token)
     ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Auth Middleware β”‚ ← Validates API key
β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
     β”œβ”€ Checks rate limit
     ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ API Endpoint β”‚ ← Routes request
β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
     β”œβ”€ POST /ai/chat β†’ AI Service
     β”œβ”€ POST /rag/query β†’ RAG Service β†’ Vector DB β†’ AI Service
     β”œβ”€ POST /image/generate β†’ Image Service
     β”œβ”€ POST /voice/synthesize β†’ Voice Service
     β”œβ”€ POST /upload β†’ Document Service β†’ Worker β†’ Vector DB
     ↓
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Response  β”‚ ← JSON with data + metadata
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

## Configuration

### Environment Variables

| Variable | What It Does | Example |
|----------|-------------|---------|
| `OPENAI_API_KEY` | OpenAI access for GPT models | `sk-...` |
| `HUGGINGFACE_API_KEY` | HuggingFace models access | `hf_...` |
| `API_KEYS` | Valid API keys (comma-separated) | `key1,key2` |
| `RATE_LIMIT_DEFAULT` | Requests/min for basic users | `60` |
| `RATE_LIMIT_ADMIN` | Requests/min for admins | `1000` |
| `MAX_FILE_SIZE_MB` | Max document upload size | `10` |
| `CHUNK_SIZE` | Text chunk size for RAG | `1000` |
| `LOG_LEVEL` | Logging verbosity | `info` |

### Tier System

- **Default**: 60 requests/min
- **Premium**: 300 requests/min (add to config)
- **Admin**: 1000 requests/min (via `ADMIN_API_KEYS`)

## Testing

Run tests:
```bash
npm test
```

Run with coverage:
```bash
npm run test:coverage
```

## Production Checklist

- [ ] Set strong `API_KEYS`
- [ ] Configure `ADMIN_API_KEYS` separately
- [ ] Set up Pinecone for vector storage
- [ ] Increase rate limits based on needs
- [ ] Enable background workers
- [ ] Set `LOG_LEVEL=info` or `warn`
- [ ] Configure CORS origins
- [ ] Set up monitoring/alerting
- [ ] Review cost limits on LLM providers

## Troubleshooting

**"No LLM adapter available"**
β†’ Add at least one API key (OPENAI_API_KEY, HUGGINGFACE_API_KEY, or ANTHROPIC_API_KEY)

**"Invalid API key"**
β†’ Check Authorization header: `Bearer your-key-here`

**"Rate limit exceeded"**
β†’ Wait 60 seconds or use admin key

**Vector DB queries fail**
β†’ Service falls back to in-memory storage automatically

## Next Steps

1. **Read the full README**: `README.md`
2. **Check deployment guide**: `DEPLOYMENT.md`
3. **Review examples**: `examples/js_client.js` and `examples/curl.sh`
4. **Run tests**: `npm test`
5. **Deploy to production**: See DEPLOYMENT.md

## Support

- GitHub Issues
- Documentation in `/docs`
- Example code in `/examples`

Enjoy building with the AI API Service! πŸš€