# Quick Start Guide
Get your AI API Service up and running in 5 minutes!
## Prerequisites
- Node.js 18+
- npm or yarn
- At least one LLM API key (OpenAI, HuggingFace, or Anthropic)
## 5-Minute Setup
### 1. Install Dependencies
```bash
npm install
```
### 2. Configure Environment
```bash
cp .env.example .env
```
Edit `.env` and add your API keys:
```env
OPENAI_API_KEY=sk-your-openai-key
API_KEYS=demo-key-1,my-secret-key
```
### 3. Start the Server
```bash
npm run dev
```
The API will be available at `http://localhost:8000`.
### 4. Test the API
```bash
curl http://localhost:8000/health
```
Expected response:
```json
{
  "status": "healthy",
  "version": "1.0.0",
  "services": [...],
  "uptime_seconds": 5
}
```
### 5. Make Your First Request
```bash
curl -X POST http://localhost:8000/ai/chat \
  -H "Authorization: Bearer demo-key-1" \
  -H "Content-Type: application/json" \
  -d '{
    "conversation": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```
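Prefer calling the API from code? Here is a rough `fetch`-based equivalent of the same request (a minimal sketch; Node 18+ ships `fetch`, and `examples/js_client.js` has a fuller client):
```typescript
// Hypothetical fetch-based version of the same chat request.
const response = await fetch("http://localhost:8000/ai/chat", {
  method: "POST",
  headers: {
    Authorization: "Bearer demo-key-1",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    conversation: [{ role: "user", content: "Hello!" }],
  }),
});

console.log(await response.json());
```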
## Example Requests
### Chat
```bash
curl -X POST http://localhost:8000/ai/chat \
  -H "Authorization: Bearer demo-key-1" \
  -H "Content-Type: application/json" \
  -d '{"conversation": [{"role": "user", "content": "What is AI?"}]}'
```
### RAG Query
```bash
curl -X POST http://localhost:8000/rag/query \
  -H "Authorization: Bearer demo-key-1" \
  -H "Content-Type: application/json" \
  -d '{"query": "What are the key features?", "top_k": 5}'
```
### Image Generation
```bash
curl -X POST http://localhost:8000/image/generate \
  -H "Authorization: Bearer demo-key-1" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A sunset over mountains", "size": "1024x1024"}'
```
## What Each Component Does
### **Authentication (`/backend/utils/auth.ts`)**
- Validates API keys from the Authorization header (see the sketch below)
- Implements role-based access (default, premium, admin)
- Used by all protected endpoints
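As a rough illustration of what that check does (the real logic lives in `/backend/utils/auth.ts`; the helper and its return shape below are hypothetical):
```typescript
// Illustrative sketch only - not the actual auth.ts implementation.
type Tier = "default" | "premium" | "admin";

const adminKeys = new Set((process.env.ADMIN_API_KEYS ?? "").split(",").filter(Boolean));
const apiKeys = new Set((process.env.API_KEYS ?? "").split(",").filter(Boolean));

// Extracts the Bearer token and maps it to a tier, or returns null if invalid.
function authenticate(authorizationHeader?: string): { key: string; tier: Tier } | null {
  const match = authorizationHeader?.match(/^Bearer\s+(.+)$/);
  if (!match) return null;
  const key = match[1];
  if (adminKeys.has(key)) return { key, tier: "admin" };
  if (apiKeys.has(key)) return { key, tier: "default" };
  return null;
}
```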
### **Rate Limiting (`/backend/utils/rate_limit.ts`)**
- Token bucket algorithm (sketched below)
- Configurable limits per tier (60/300/1000 requests/min)
- Automatic reset after 1 minute
- Prevents abuse and cost overruns
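A minimal token-bucket sketch, assuming one bucket per API key sized by tier (the actual limiter lives in `/backend/utils/rate_limit.ts`):
```typescript
// Minimal token bucket: capacity tokens, refilled continuously per minute.
class TokenBucket {
  private tokens: number;
  private lastRefill = Date.now();

  constructor(private capacity: number, private refillPerMinute: number) {
    this.tokens = capacity;
  }

  // Returns true if the request is allowed, false if the caller is rate limited.
  tryConsume(): boolean {
    const elapsedMinutes = (Date.now() - this.lastRefill) / 60_000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedMinutes * this.refillPerMinute);
    this.lastRefill = Date.now();
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// One bucket per API key, sized by tier (e.g. 60, 300, or 1000 requests/min).
const buckets = new Map<string, TokenBucket>();
```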
### **AI Service (`/backend/services/ai_service.ts`)**
- Multi-provider LLM routing (OpenAI, HuggingFace, Anthropic)
- Automatic model selection and fallback (see the sketch below)
- Chat completions with context management
- Embedding generation for RAG
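Conceptually, the fallback works like this (a simplified sketch, not the actual service code; the `ChatAdapter` shape here is an assumption):
```typescript
// Try each configured provider in order until one succeeds.
interface ChatAdapter {
  name: string;
  chat(conversation: { role: string; content: string }[]): Promise<string>;
}

async function chatWithFallback(
  adapters: ChatAdapter[],
  conversation: { role: string; content: string }[],
) {
  for (const adapter of adapters) {
    try {
      return { provider: adapter.name, reply: await adapter.chat(conversation) };
    } catch (err) {
      console.warn(`${adapter.name} failed, trying next provider`, err);
    }
  }
  throw new Error("No LLM adapter available");
}
```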
### **RAG Service (`/backend/services/rag_service.ts`)**
- Vector-based document retrieval
- Automatic context injection into prompts (flow sketched below)
- Supports Pinecone or an in-memory vector DB
- Returns sources with similarity scores
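The overall query flow, sketched with placeholder functions standing in for the real embedding, vector-search, and chat calls:
```typescript
// RAG query: embed the question, retrieve nearest chunks, inject them as context, generate.
type Match = { text: string; score: number };

async function ragQuery(
  query: string,
  embed: (text: string) => Promise<number[]>,
  search: (vector: number[], topK: number) => Promise<Match[]>,
  chat: (messages: { role: string; content: string }[]) => Promise<string>,
  topK = 5,
) {
  const queryEmbedding = await embed(query);                  // 1. embed the query
  const matches = await search(queryEmbedding, topK);         // 2. retrieve nearest chunks
  const context = matches.map((m) => m.text).join("\n---\n"); // 3. build a context block
  const answer = await chat([                                 // 4. generate with context injected
    { role: "system", content: `Answer using this context:\n${context}` },
    { role: "user", content: query },
  ]);
  return { answer, sources: matches };
}
```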
### **Image Service (`/backend/services/image_service.ts`)**
- Text-to-image generation
- Supports DALL-E and Stable Diffusion
- Configurable sizes and quality
- Returns base64 or URLs
### **Voice Service (`/backend/services/voice_service.ts`)**
- Text-to-speech synthesis (TTS)
- Speech-to-text transcription (STT)
- Multiple voice options
- Various audio formats (mp3, opus, etc.); see the example request below
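A hypothetical synthesis request from TypeScript; the request fields (`text`, `format`) are assumptions, so check the full README for the exact schema:
```typescript
// Hypothetical call to the TTS endpoint; the body fields may differ from the real API.
const ttsResponse = await fetch("http://localhost:8000/voice/synthesize", {
  method: "POST",
  headers: {
    Authorization: "Bearer demo-key-1",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ text: "Hello from the voice service", format: "mp3" }),
});

console.log(ttsResponse.status);
```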
### **Document Service (`/backend/services/document_service.ts`)**
- Upload PDF, DOCX, TXT files
- Automatic text extraction
- Chunking with overlap for better retrieval (sketched below)
- Background processing with workers
- Stores chunks in the vector DB
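Chunking with overlap boils down to something like this (illustrative only; the real service reads `CHUNK_SIZE` from the environment and may split more carefully, e.g. on sentence boundaries):
```typescript
// Character-based chunking: each chunk overlaps the previous one by `overlap` characters.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += chunkSize - overlap) {
    chunks.push(text.slice(start, start + chunkSize));
  }
  return chunks;
}
```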
### **Adapters**
#### **OpenAI Adapter (`/backend/adapters/openai_adapter.ts`)**
- Chat completions (GPT-4, GPT-3.5)
- Embeddings (text-embedding-ada-002)
- Image generation (DALL-E)
- Voice synthesis and transcription
- Implements the LLMAdapter, ImageAdapter, and VoiceAdapter interfaces (rough shape sketched below)
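Roughly, the adapter contracts look like this (the real definitions live in `/backend/adapters/`; the method names and signatures here are assumptions for illustration):
```typescript
// Approximate shape of the adapter interfaces each provider implements.
interface LLMAdapter {
  chat(conversation: { role: string; content: string }[]): Promise<string>;
  embed(text: string): Promise<number[]>;
}

interface ImageAdapter {
  generateImage(prompt: string, size: string): Promise<string>; // URL or base64
}

interface VoiceAdapter {
  synthesize(text: string): Promise<Uint8Array>;
  transcribe(audio: Uint8Array): Promise<string>;
}
```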
#### **HuggingFace Adapter (`/backend/adapters/huggingface_adapter.ts`)**
- Open-source models (Mistral, Llama, etc.)
- Stable Diffusion for images
- Sentence transformers for embeddings
- Free tier available
#### **Anthropic Adapter (`/backend/adapters/anthropic_adapter.ts`)**
- Claude models (Sonnet, Opus)
- Advanced reasoning capabilities
- Long context windows
#### **Vector DB Adapters (`/backend/adapters/vector_db_adapter.ts`)**
- **PineconeAdapter**: Production vector storage with managed scaling
- **InMemoryVectorDB**: Development fallback with cosine similarity (sketched below)
- Supports metadata filtering and batch operations
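For the in-memory fallback, ranking stored chunks by cosine similarity can be as simple as this (an illustrative sketch, not the adapter's actual code):
```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored record against the query vector and keep the top k.
function topKMatches(query: number[], records: { vector: number[]; text: string }[], k: number) {
  return records
    .map((r) => ({ text: r.text, score: cosineSimilarity(query, r.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```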
### **Observability**
#### **Logger (`/backend/utils/logger.ts`)**
- Structured JSON logging (example below)
- Configurable log levels (debug, info, warn, error)
- Automatic timestamping
- Production-ready format
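A structured JSON log line boils down to something like this (simplified; the real logger adds level filtering and configuration):
```typescript
// Emit one JSON object per log line so it can be parsed by log aggregators.
function logInfo(message: string, meta: Record<string, unknown> = {}) {
  console.log(
    JSON.stringify({ level: "info", timestamp: new Date().toISOString(), message, ...meta }),
  );
}

logInfo("request completed", { endpoint: "/ai/chat", durationMs: 412 });
```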
#### **Metrics (`/backend/utils/metrics.ts`)**
- Request counting by endpoint
- Error tracking
- Response time measurement
- Model usage statistics
- Vector DB query counts
- Document processing stats
### **Background Workers (`/backend/workers/ingestion_worker.ts`)**
- Async document processing
- Configurable concurrency
- Job status tracking
- Webhook notifications on completion
- Automatic retries on failure (retry loop sketched below)
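The retry behavior can be pictured like this (a simplified sketch; the actual worker also adds concurrency limits, job status tracking, and webhook notifications):
```typescript
// Run a job, retrying up to maxAttempts times with a simple linear backoff.
async function runWithRetries<T>(job: () => Promise<T>, maxAttempts = 3): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await job();
    } catch (err) {
      lastError = err;
      await new Promise((resolve) => setTimeout(resolve, attempt * 1000)); // back off before retrying
    }
  }
  throw lastError;
}
```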
### **API Endpoints**
All endpoints are in `/backend/api/`:
#### **Health & Metrics (`health.ts`)**
- `GET /health` - Service health with component status
- `GET /metrics` - Usage metrics and statistics
#### **Authentication (`auth.ts`)**
- `POST /auth/verify` - Validate API key
#### **Chat (`chat.ts`)**
- `POST /ai/chat` - Multi-turn conversation
- `GET /ai/query` - Simple Q&A
#### **RAG (`rag.ts`)**
- `POST /rag/query` - Query with retrieval
- `GET /rag/models` - List available models
#### **Images (`image.ts`)**
- `POST /image/generate` - Generate images
#### **Voice (`voice.ts`)**
- `POST /voice/synthesize` - Text to speech
- `POST /voice/transcribe` - Speech to text
#### **Documents (`documents.ts`)**
- `POST /upload` - Upload document (upload example below)
- `GET /docs/:id/sources` - Get document chunks
- `POST /webhook/events` - Processing webhooks
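A hypothetical upload call using multipart form data (the `file` field name is an assumption; see the full README for the exact contract):
```typescript
// Upload a local PDF to the document endpoint using Node 18's built-in FormData/Blob/fetch.
import { readFile } from "node:fs/promises";

const form = new FormData();
form.append("file", new Blob([await readFile("./docs/handbook.pdf")]), "handbook.pdf");

const uploadResponse = await fetch("http://localhost:8000/upload", {
  method: "POST",
  headers: { Authorization: "Bearer demo-key-1" },
  body: form,
});

console.log(await uploadResponse.json());
```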
## Architecture Flow
```
┌──────────┐
│  Client  │
└────┬─────┘
     │
     ├─ Authorization Header (Bearer token)
     │
┌────┴──────────────┐
│  Auth Middleware  │  ← Validates API key
└────┬──────────────┘
     ├─ Checks rate limit
     │
┌────┴───────────┐
│  API Endpoint  │  ← Routes request
└────┬───────────┘
     ├─ POST /ai/chat → AI Service
     ├─ POST /rag/query → RAG Service → Vector DB → AI Service
     ├─ POST /image/generate → Image Service
     ├─ POST /voice/synthesize → Voice Service
     ├─ POST /upload → Document Service → Worker → Vector DB
     │
┌────┴───────┐
│  Response  │  ← JSON with data + metadata
└────────────┘
```
## Configuration
### Environment Variables
| Variable | What It Does | Example |
|----------|-------------|---------|
| `OPENAI_API_KEY` | OpenAI access for GPT models | `sk-...` |
| `HUGGINGFACE_API_KEY` | Access to HuggingFace models | `hf_...` |
| `API_KEYS` | Valid API keys (comma-separated) | `key1,key2` |
| `RATE_LIMIT_DEFAULT` | Requests/min for basic users | `60` |
| `RATE_LIMIT_ADMIN` | Requests/min for admins | `1000` |
| `MAX_FILE_SIZE_MB` | Max document upload size | `10` |
| `CHUNK_SIZE` | Text chunk size for RAG | `1000` |
| `LOG_LEVEL` | Logging verbosity | `info` |
### Tier System
- **Default**: 60 requests/min
- **Premium**: 300 requests/min (add to config)
- **Admin**: 1000 requests/min (via `ADMIN_API_KEYS`)
## Testing
Run tests:
```bash
npm test
```
Run with coverage:
```bash
npm run test:coverage
```
## Production Checklist
- [ ] Set strong `API_KEYS`
- [ ] Configure `ADMIN_API_KEYS` separately
- [ ] Set up Pinecone for vector storage
- [ ] Increase rate limits based on needs
- [ ] Enable background workers
- [ ] Set `LOG_LEVEL=info` or `warn`
- [ ] Configure CORS origins
- [ ] Set up monitoring/alerting
- [ ] Review cost limits on LLM providers
## Troubleshooting
**"No LLM adapter available"**
→ Add at least one API key (`OPENAI_API_KEY`, `HUGGINGFACE_API_KEY`, or `ANTHROPIC_API_KEY`)
**"Invalid API key"**
→ Check the Authorization header: `Bearer your-key-here`
**"Rate limit exceeded"**
→ Wait 60 seconds or use an admin key
**Vector DB queries fail**
→ The service falls back to in-memory storage automatically
## Next Steps
1. **Read the full README**: `README.md`
2. **Check the deployment guide**: `DEPLOYMENT.md`
3. **Review the examples**: `examples/js_client.js` and `examples/curl.sh`
4. **Run the tests**: `npm test`
5. **Deploy to production**: See `DEPLOYMENT.md`
## Support
- GitHub Issues
- Documentation in `/docs`
- Example code in `/examples`
Enjoy building with the AI API Service!