# Quick Start Guide
Get your AI API Service up and running in 5 minutes!
## Prerequisites
- Node.js 18+
- npm or yarn
- At least one LLM API key (OpenAI, HuggingFace, or Anthropic)
## 5-Minute Setup
### 1. Install Dependencies
```bash
npm install
```
### 2. Configure Environment
```bash
cp .env.example .env
```
Edit `.env` and add your API keys:
```env
OPENAI_API_KEY=sk-your-openai-key
API_KEYS=demo-key-1,my-secret-key
```
### 3. Start the Server
```bash
npm run dev
```
The API will be available at `http://localhost:8000`.
### 4. Test the API
```bash
curl http://localhost:8000/health
```
Expected response:
```json
{
  "status": "healthy",
  "version": "1.0.0",
  "services": [...],
  "uptime_seconds": 5
}
```
### 5. Make Your First Request
```bash
curl -X POST http://localhost:8000/ai/chat \
-H "Authorization: Bearer demo-key-1" \
-H "Content-Type: application/json" \
-d '{
"conversation": [
{"role": "user", "content": "Hello!"}
]
}'
```
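The same request from TypeScript, for when you move past curl. This is a minimal sketch assuming the dev server from step 3 is running and `demo-key-1` is listed in `API_KEYS`; it uses the global `fetch` that ships with Node.js 18+:
```typescript
// Minimal sketch: the curl request above, from Node.js 18+ (global fetch).
// Requires an ESM context for top-level await.
const response = await fetch("http://localhost:8000/ai/chat", {
  method: "POST",
  headers: {
    "Authorization": "Bearer demo-key-1",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    conversation: [{ role: "user", content: "Hello!" }],
  }),
});

console.log(await response.json());
```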
## Example Requests
### Chat
```bash
curl -X POST http://localhost:8000/ai/chat \
-H "Authorization: Bearer demo-key-1" \
-H "Content-Type: application/json" \
-d '{"conversation": [{"role": "user", "content": "What is AI?"}]}'
```
### RAG Query
```bash
curl -X POST http://localhost:8000/rag/query \
-H "Authorization: Bearer demo-key-1" \
-H "Content-Type: application/json" \
-d '{"query": "What are the key features?", "top_k": 5}'
```
### Image Generation
```bash
curl -X POST http://localhost:8000/image/generate \
-H "Authorization: Bearer demo-key-1" \
-H "Content-Type: application/json" \
-d '{"prompt": "A sunset over mountains", "size": "1024x1024"}'
```
## What Each Component Does
### **Authentication (`/backend/utils/auth.ts`)**
- Validates API keys from the Authorization header
- Implements role-based access (default, premium, admin)
- Used by all protected endpoints
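The gist of the key check looks roughly like this. It is an illustrative sketch, not the actual `auth.ts` code, and it assumes `API_KEYS` and `ADMIN_API_KEYS` are comma-separated environment variables:
```typescript
// Illustrative sketch of Bearer-key validation; the real logic lives in
// /backend/utils/auth.ts and may differ.
type Tier = "default" | "premium" | "admin";

function parseKeys(value: string | undefined): string[] {
  return (value ?? "").split(",").map((k) => k.trim()).filter(Boolean);
}

export function resolveTier(authHeader: string | undefined): Tier | null {
  if (!authHeader?.startsWith("Bearer ")) return null;
  const key = authHeader.slice("Bearer ".length);
  if (parseKeys(process.env.ADMIN_API_KEYS).includes(key)) return "admin";
  if (parseKeys(process.env.API_KEYS).includes(key)) return "default";
  return null; // unknown key -> respond 401
}
```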
### **Rate Limiting (`/backend/utils/rate_limit.ts`)**
- Token bucket algorithm
- Configurable limits per tier (60/300/1000 requests/min)
- Automatic reset after 1 minute
- Prevents abuse and cost overruns
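In sketch form, a per-key bucket with a one-minute reset behaves like this (names and shape are illustrative, not the actual `rate_limit.ts`):
```typescript
// Sketch of a token bucket that refills in one-minute windows.
class TokenBucket {
  private tokens: number;
  private windowStart = Date.now();

  constructor(private readonly capacity: number) {
    this.tokens = capacity; // e.g. 60 for the default tier
  }

  tryConsume(): boolean {
    if (Date.now() - this.windowStart >= 60_000) {
      this.tokens = this.capacity; // automatic reset after 1 minute
      this.windowStart = Date.now();
    }
    if (this.tokens === 0) return false; // caller should return HTTP 429
    this.tokens -= 1;
    return true;
  }
}
```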
### **AI Service (`/backend/services/ai_service.ts`)**
- Multi-provider LLM routing (OpenAI, HuggingFace, Anthropic)
- Automatic model selection and fallback
- Chat completions with context management
- Embedding generation for RAG
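The fallback idea reduces to "try each configured provider in order". A conceptual sketch, with placeholder names rather than the real service wiring:
```typescript
// Placeholder sketch of provider fallback; the actual routing is in
// /backend/services/ai_service.ts.
interface LLMAdapter {
  chat(messages: { role: string; content: string }[]): Promise<string>;
}

async function chatWithFallback(
  adapters: LLMAdapter[],
  messages: { role: string; content: string }[],
): Promise<string> {
  let lastError: unknown;
  for (const adapter of adapters) {
    try {
      return await adapter.chat(messages); // first provider that succeeds wins
    } catch (err) {
      lastError = err; // fall through to the next provider
    }
  }
  throw new Error(`No LLM adapter available: ${String(lastError)}`);
}
```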
### **RAG Service (`/backend/services/rag_service.ts`)**
- Vector-based document retrieval
- Automatic context injection into prompts
- Supports Pinecone or in-memory vector DB
- Returns sources with similarity scores
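The retrieval flow is embed → search → inject. This sketch takes its collaborators as function parameters, since every name here is a placeholder for the real service methods:
```typescript
// Rough shape of a RAG query; all names are placeholders.
type Chunk = { text: string; score: number };

async function ragQuery(
  query: string,
  embed: (text: string) => Promise<number[]>,
  search: (vector: number[], topK: number) => Promise<Chunk[]>,
  chat: (prompt: string) => Promise<string>,
  topK = 5,
): Promise<string> {
  const vector = await embed(query);          // 1. embed the question
  const chunks = await search(vector, topK);  // 2. retrieve similar chunks
  const context = chunks.map((c) => c.text).join("\n---\n");
  // 3. inject the retrieved context into the prompt
  return chat(`Answer using this context:\n${context}\n\nQuestion: ${query}`);
}
```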
### **Image Service (`/backend/services/image_service.ts`)**
- Text-to-image generation
- Supports DALL-E and Stable Diffusion
- Configurable sizes and quality
- Returns base64 or URLs
### **Voice Service (`/backend/services/voice_service.ts`)**
- Text-to-speech synthesis (TTS)
- Speech-to-text transcription (STT)
- Multiple voice options
- Various audio formats (mp3, opus, etc.)
### **Document Service (`/backend/services/document_service.ts`)**
- Upload PDF, DOCX, TXT files
- Automatic text extraction
- Chunking with overlap for better retrieval
- Background processing with workers
- Stores chunks in vector DB
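"Chunking with overlap" means each chunk repeats the tail of the previous one, so sentences that straddle a boundary still land intact in at least one retrievable chunk. A hypothetical character-based chunker (`CHUNK_SIZE` maps to the env var in the table below; the overlap value is an assumption):
```typescript
// Hypothetical chunker with overlap; the real extraction and chunking
// pipeline lives in /backend/services/document_service.ts.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance less than a full chunk
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}
```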
### **Adapters**
#### **OpenAI Adapter (`/backend/adapters/openai_adapter.ts`)**
- Chat completions (GPT-4, GPT-3.5)
- Embeddings (text-embedding-ada-002)
- Image generation (DALL-E)
- Voice synthesis and transcription
- Implements LLMAdapter, ImageAdapter, VoiceAdapter interfaces
#### **HuggingFace Adapter (`/backend/adapters/huggingface_adapter.ts`)**
- Open-source models (Mistral, Llama, etc.)
- Stable Diffusion for images
- Sentence transformers for embeddings
- Free tier available
#### **Anthropic Adapter (`/backend/adapters/anthropic_adapter.ts`)**
- Claude models (Sonnet, Opus)
- Advanced reasoning capabilities
- Long context windows
#### **Vector DB Adapters (`/backend/adapters/vector_db_adapter.ts`)**
- **PineconeAdapter**: Production vector storage with managed scaling
- **InMemoryVectorDB**: Development fallback with cosine similarity
- Supports metadata filtering and batch operations
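The in-memory fallback ranks stored vectors by cosine similarity, which in minimal form is just a normalized dot product (illustrative, not the actual `InMemoryVectorDB` code):
```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```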
### **Observability**
#### **Logger (`/backend/utils/logger.ts`)**
- Structured JSON logging
- Configurable log levels (debug, info, warn, error)
- Automatic timestamping
- Production-ready format
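A structured JSON log line looks roughly like this; the exact field names emitted by `logger.ts` are an assumption:
```typescript
// Sketch of one structured log entry; field names are assumed.
function logInfo(message: string, fields: Record<string, unknown> = {}): void {
  console.log(JSON.stringify({
    level: "info",
    timestamp: new Date().toISOString(), // automatic timestamping
    message,
    ...fields,
  }));
}

logInfo("request completed", { endpoint: "/ai/chat", durationMs: 142 });
```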
#### **Metrics (`/backend/utils/metrics.ts`)**
- Request counting by endpoint
- Error tracking
- Response time measurement
- Model usage statistics
- Vector DB query counts
- Document processing stats
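At its simplest, per-endpoint counting is a map of counters (the shape is illustrative, not the actual `metrics.ts`):
```typescript
// Illustrative per-endpoint request counter.
const requestCounts = new Map<string, number>();

function countRequest(endpoint: string): void {
  requestCounts.set(endpoint, (requestCounts.get(endpoint) ?? 0) + 1);
}
```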
### **Background Workers (`/backend/workers/ingestion_worker.ts`)**
- Async document processing
- Configurable concurrency
- Job status tracking
- Webhook notifications on completion
- Automatic retries on failure
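"Automatic retries on failure" can be sketched as a wrapper with simple backoff; the real worker adds concurrency control, status tracking, and webhook notifications, and the attempt counts and delays here are assumptions:
```typescript
// Hedged sketch of retry-with-backoff for a single ingestion job.
async function runWithRetries<T>(job: () => Promise<T>, maxAttempts = 3): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await job();
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // give up, surface the error
      await new Promise((r) => setTimeout(r, 1000 * attempt)); // linear backoff
    }
  }
}
```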
### **API Endpoints**
All endpoints are in `/backend/api/`:
#### **Health & Metrics (`health.ts`)**
- `GET /health` - Service health with component status
- `GET /metrics` - Usage metrics and statistics
#### **Authentication (`auth.ts`)**
- `POST /auth/verify` - Validate API key
#### **Chat (`chat.ts`)**
- `POST /ai/chat` - Multi-turn conversation
- `GET /ai/query` - Simple Q&A
#### **RAG (`rag.ts`)**
- `POST /rag/query` - Query with retrieval
- `GET /rag/models` - List available models
#### **Images (`image.ts`)**
- `POST /image/generate` - Generate images
#### **Voice (`voice.ts`)**
- `POST /voice/synthesize` - Text to speech
- `POST /voice/transcribe` - Speech to text
#### **Documents (`documents.ts`)**
- `POST /upload` - Upload document
- `GET /docs/:id/sources` - Get document chunks
- `POST /webhook/events` - Processing webhooks
## Architecture Flow
```
┌──────────┐
│  Client  │
└────┬─────┘
     │
     ├─ Authorization Header (Bearer token)
     ▼
┌─────────────────┐
│ Auth Middleware │ ← Validates API key
└────┬────────────┘
     ├─ Checks rate limit
     ▼
┌──────────────┐
│ API Endpoint │ ← Routes request
└────┬─────────┘
     ├─ POST /ai/chat → AI Service
     ├─ POST /rag/query → RAG Service → Vector DB → AI Service
     ├─ POST /image/generate → Image Service
     ├─ POST /voice/synthesize → Voice Service
     ├─ POST /upload → Document Service → Worker → Vector DB
     ▼
┌──────────┐
│ Response │ ← JSON with data + metadata
└──────────┘
```
## Configuration
### Environment Variables
| Variable | What It Does | Example |
|----------|-------------|---------|
| `OPENAI_API_KEY` | OpenAI access for GPT models | `sk-...` |
| `HUGGINGFACE_API_KEY` | HuggingFace models access | `hf_...` |
| `API_KEYS` | Valid API keys (comma-separated) | `key1,key2` |
| `RATE_LIMIT_DEFAULT` | Requests/min for basic users | `60` |
| `RATE_LIMIT_ADMIN` | Requests/min for admins | `1000` |
| `MAX_FILE_SIZE_MB` | Max document upload size | `10` |
| `CHUNK_SIZE` | Text chunk size for RAG | `1000` |
| `LOG_LEVEL` | Logging verbosity | `info` |
### Tier System
- **Default**: 60 requests/min
- **Premium**: 300 requests/min (add to config)
- **Admin**: 1000 requests/min (via `ADMIN_API_KEYS`)
## Testing
Run tests:
```bash
npm test
```
Run with coverage:
```bash
npm run test:coverage
```
## Production Checklist
- [ ] Set strong `API_KEYS`
- [ ] Configure `ADMIN_API_KEYS` separately
- [ ] Set up Pinecone for vector storage
- [ ] Increase rate limits based on needs
- [ ] Enable background workers
- [ ] Set `LOG_LEVEL=info` or `warn`
- [ ] Configure CORS origins
- [ ] Set up monitoring/alerting
- [ ] Review cost limits on LLM providers
## Troubleshooting
**"No LLM adapter available"**
β Add at least one API key (OPENAI_API_KEY, HUGGINGFACE_API_KEY, or ANTHROPIC_API_KEY)
**"Invalid API key"**
β Check Authorization header: `Bearer your-key-here`
**"Rate limit exceeded"**
β Wait 60 seconds or use admin key
**Vector DB queries fail**
β Service falls back to in-memory storage automatically
## Next Steps
1. **Read the full README**: `README.md`
2. **Check deployment guide**: `DEPLOYMENT.md`
3. **Review examples**: `examples/js_client.js` and `examples/curl.sh`
4. **Run tests**: `npm test`
5. **Deploy to production**: See DEPLOYMENT.md
## Support
- GitHub Issues
- Documentation in `/docs`
- Example code in `/examples`
Enjoy building with the AI API Service!