# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Structure
This is a full-stack AI-powered crossword puzzle generator:
- **Python Backend** (`crossword-app/backend-py/`) - Primary implementation with dynamic word generation
- **React Frontend** (`crossword-app/frontend/`) - Modern React app with interactive crossword UI
- **Node.js Backend** (`backend/`) - Legacy implementation (deprecated)
Current deployment uses the Python backend with Docker containerization.
## Development Commands
### Frontend Development
```bash
cd crossword-app/frontend
npm install
npm run dev # Start development server on http://localhost:5173
npm run build # Build for production
npm run preview # Preview production build
```
### Backend Development (Python - Primary)
```bash
cd crossword-app/backend-py
# Testing
python run_tests.py # Run all tests
pytest test-unit/ -v # Run unit tests
pytest test-integration/ -v # Run integration tests
python test_integration_minimal.py # Quick test without ML deps
# Development server
python app.py # Start FastAPI server on port 7860
# Debug/development tools
python test_difficulty_softmax.py # Test difficulty selection
python test_softmax_service.py # Test word selection logic
python test_distribution_normalization.py # Test distribution normalization across topics
```
### Backend Development (Node.js - Legacy)
```bash
cd backend
npm install
npm run dev # Start Express server on http://localhost:3000
npm test # Run tests
```
### Docker Deployment
```bash
# Build and run locally
docker build -t crossword-app .
docker run -p 7860:7860 -e NODE_ENV=production crossword-app
# Test deployment
curl http://localhost:7860/api/topics
curl http://localhost:7860/health
```
### Linting and Type Checking
```bash
# Python backend
cd crossword-app/backend-py
mypy src/ # Type checking (if mypy installed)
ruff check src/ # Linting (if ruff installed)
# Frontend
cd crossword-app/frontend
npm run lint # ESLint (if configured)
```
## Architecture Overview
### Full-Stack Components
**Frontend** (`crossword-app/frontend/`)
- React 18 with hooks and functional components
- Key components: `TopicSelector.jsx`, `PuzzleGrid.jsx`, `ClueList.jsx`, `DebugTab.jsx`
- Custom hook: `useCrossword.js` manages API calls and puzzle state
- Interactive crossword grid with cell navigation and solution reveal
- Debug tab for visualizing word selection process (when enabled)
**Python Backend** (`crossword-app/backend-py/` - Primary)
- FastAPI web framework serving both API and static frontend files
- AI-powered dynamic word generation using WordFreq + sentence-transformers
- No static word files - all words generated on-demand from 100K+ vocabulary
- WordNet-based clue generation with semantic definitions
- Comprehensive caching system for models, embeddings, and vocabulary
**Node.js Backend** (`backend/` - Legacy - Deprecated)
- Express.js with static JSON word files
- Original implementation, no longer actively maintained
- Used for comparison and fallback testing only
### Core Python Backend Components
**ThematicWordService** (`src/services/thematic_word_service.py`)
- Core AI-powered word generation engine using WordFreq database (100K+ words)
- Sentence-transformers (all-mpnet-base-v2) for semantic embeddings
- 10-tier frequency classification system with percentile-based difficulty selection
- Temperature-controlled softmax for balanced word selection randomness
- 50% word overgeneration strategy for better crossword grid fitting
- **Multi-topic intersection**: `_compute_multi_topic_similarities()` with vectorized soft minimum, geometric/harmonic means
- **Adaptive beta mechanism**: Automatically relaxes the similarity threshold (0.25 → 0.175 → 0.103...) to guarantee a minimum of 15 words
- **Performance optimized**: 40x speedup from vectorized operations over the earlier loop-based approach
- Key method: `generate_thematic_words()` - Returns words with semantic similarity scores and frequency tiers
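The vectorized soft-minimum scoring can be sketched in a few lines of NumPy. This is an illustrative stand-alone version, not the service's actual `_compute_multi_topic_similarities()` method:

```python
import numpy as np

def soft_minimum(similarities: np.ndarray, beta: float = 10.0) -> np.ndarray:
    """Smooth per-word minimum across topics.

    similarities: (n_words, n_topics) array of cosine similarities.
    Larger beta approximates a hard minimum more closely; this matches
    the -log(sum(exp(-beta * s))) / beta formula described below.
    """
    # Shift by the row-wise minimum before exponentiating for numerical stability.
    m = similarities.min(axis=1, keepdims=True)
    inner = np.exp(-beta * (similarities - m)).sum(axis=1)
    return m[:, 0] - np.log(inner) / beta
```

Because the whole `(n_words, n_topics)` matrix is scored in one shot, there is no per-word Python loop, which is where the reported 40x speedup comes from.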
**CrosswordGenerator** (`src/services/crossword_generator.py`)
- Main crossword generation algorithm using backtracking
- Integrates with ThematicWordService for AI word selection
- Sorts words by crossword suitability before grid placement
- Returns complete puzzle with grid, clues, and optional debug information
**WordNetClueGenerator** (`src/services/wordnet_clue_generator.py`)
- NLTK WordNet-based clue generation using semantic relationships
- Creates contextual crossword clues from word definitions
- Caches generated clues for performance optimization
- Handles multiple word senses and part-of-speech variations
**CrosswordGeneratorWrapper** (`src/services/crossword_generator_wrapper.py`)
- Wrapper service coordinating word generation and grid creation
- Manages integration between ThematicWordService and CrosswordGenerator
- Handles error recovery and fallback strategies
### Data Flow
1. **User Interaction** → React frontend (TopicSelector with topics/custom sentence/difficulty)
2. **API Request** → FastAPI backend (`src/routes/api.py`)
3. **Word Generation** → ThematicWordService (dynamic AI-powered word selection with multi-topic intersection)
4. **Clue Generation** → WordNetClueGenerator (semantic clue creation)
5. **Grid Generation** → CrosswordGenerator backtracking algorithm with word placement
6. **Response** → JSON with grid, clues, metadata, and optional debug data
7. **Frontend Rendering** → Interactive crossword grid with clues and debug visualization
### Critical Dependencies
**Frontend:**
- React 18, Vite (development/build)
- Node.js 18+ and npm 9+
**Python Backend (Primary):**
- FastAPI, uvicorn, pydantic (web framework)
- sentence-transformers, torch (AI word generation)
- wordfreq (vocabulary database)
- nltk (WordNet clue generation)
- scikit-learn (clustering and similarity)
- numpy (embeddings and mathematical operations)
- pytest, pytest-asyncio (testing)
**Node.js Backend (Legacy - Deprecated):**
- Express.js, cors, helmet
- JSON file-based word storage
The application requires AI dependencies for core functionality - no fallback to static word lists.
### API Endpoints
Python backend provides the following REST API:
- `GET /api/topics` - Returns 12 available topics (animals, geography, science, etc.)
- `POST /api/generate` - Generate crossword puzzle with topics/custom sentence/difficulty
- `POST /api/words` - Debug endpoint for testing word generation
- `GET /health` - Health check endpoint with service status
- `GET /api/topic/{topic}/words` - Generate words for specific topic (debug)
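A request to `POST /api/generate` might look like the following. The field names are inferred from the frontend options (topics, custom sentence, difficulty), so treat the exact schema as an assumption and confirm it against `src/routes/api.py`:

```json
{
  "topics": ["animals", "science"],
  "difficulty": "medium"
}
```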
### Testing Strategy
**Python Backend Tests:**
- `test-unit/test_crossword_generator.py` - Grid generation logic and backtracking
- `test-unit/test_crossword_generator_wrapper.py` - Service integration testing
- `test-unit/test_api_routes.py` - FastAPI endpoints and request validation
- `test-integration/test_local.py` - End-to-end integration testing
- `test_integration_minimal.py` - Quick functionality test without heavy ML dependencies
**Multi-Topic Testing & Development Scripts:**
- `hack/test_soft_minimum_quick.py` - Quick soft minimum method verification
- `hack/test_optimized_soft_minimum.py` - Performance testing (40x speedup validation)
- `hack/debug_adaptive_beta_bug.py` - Adaptive beta mechanism debugging
- `hack/test_adaptive_fix.py` - Full vocabulary testing with adaptive beta
- `hack/test_simpler_case.py` - Compatible topic testing (animals + nature)
- All `hack/` scripts use the shared `cache-dir/` for model-loading consistency
**Frontend Tests:**
- Component testing with React Testing Library (if configured)
- E2E testing with Playwright/Cypress (if configured)
### Key Architecture Features
**Dynamic Word Generation:**
- No static word files - all words generated dynamically from WordFreq database
- 100K+ vocabulary with crossword-suitable filtering (3-12 letters, alphabetic only)
- AI-powered semantic similarity using sentence-transformers embeddings
- 10-tier frequency classification for difficulty-aware word selection
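The crossword-suitability filter is straightforward to sketch. In the real service the candidate list comes from the WordFreq database (e.g. via `wordfreq.top_n_list`); here a small hardcoded sample stands in so the example is self-contained:

```python
import re

# Crossword-suitable: 3-12 letters, lowercase alphabetic only.
WORD_RE = re.compile(r"^[a-z]{3,12}$")

def filter_vocabulary(words):
    """Keep only entries that can be placed in a crossword grid."""
    return [w for w in words if WORD_RE.match(w)]

# Small stand-in for the WordFreq top-N list:
sample = ["cat", "don't", "a", "literature", "e-mail", "hippopotamus12"]
print(filter_vocabulary(sample))  # -> ['cat', 'literature']
```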
**Advanced Selection Logic:**
- Temperature-controlled softmax for balanced randomness
- 50% word overgeneration strategy to improve crossword grid fitting success
- Percentile-based difficulty mapping ensures consistent challenge levels
- Multi-theme vs single-theme processing modes for different puzzle styles
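Temperature-controlled softmax is a standard technique; a minimal sketch (illustrative, not the service's code):

```python
import numpy as np

def temperature_softmax(scores, temperature=1.0):
    """Convert similarity scores into sampling probabilities.

    temperature > 1 flattens the distribution (more exploration);
    temperature < 1 sharpens it toward the top-scoring words.
    """
    z = np.asarray(scores, dtype=float) / temperature
    z -= z.max()                      # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()
```

Words can then be drawn without replacement via `np.random.choice(len(scores), size=k, replace=False, p=probs)`.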
**Multi-Topic Intersection Methods:**
- **Soft Minimum (Default)**: Uses `-log(sum(exp(-beta * similarities))) / beta` formula to find words relevant to ALL topics
- **Adaptive Beta Mechanism**: Automatically adjusts the beta parameter (10.0 → 7.0 → 4.9...) to ensure a minimum word count (15+)
- **Alternative Methods**: geometric_mean, harmonic_mean, averaging for different intersection behaviors
- **Performance Optimized**: Vectorized implementation achieves 40x speedup over loop-based approach
- **Semantic Quality**: Filters out loosely related words (e.g. "ethology", "guns" for Art + Books) and promotes true intersections like "literature"
- See `docs/multi_vector_word_finding.md` for detailed experimental analysis and method comparison
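The adaptive-beta retry loop can be sketched as follows. Parameter names mirror the `SOFT_MIN_*` settings described under Environment Configuration; the exact relaxation schedule (in particular how the threshold tracks beta) is an assumption of this sketch:

```python
import numpy as np

def adaptive_soft_minimum(similarities, threshold=0.25, beta=10.0,
                          min_words=15, max_retries=5, beta_decay=0.7):
    """Retry soft-minimum scoring with a relaxed beta until at least
    `min_words` candidates survive (or retries run out).

    similarities: (n_words, n_topics) array; returns (kept_indices, final_beta).
    Decaying the threshold in step with beta is an assumption of this sketch.
    """
    kept = np.array([], dtype=int)
    for _ in range(max_retries + 1):
        scores = -np.log(np.exp(-beta * similarities).sum(axis=1)) / beta
        kept = np.nonzero(scores >= threshold)[0]
        if len(kept) >= min_words:
            break
        beta *= beta_decay        # 10.0 -> 7.0 -> 4.9 ... as described above
        threshold *= beta_decay   # relax the cutoff alongside beta (assumed)
    return kept, beta
```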
**Distribution Normalization:**
- **DISABLED BY DEFAULT** - analysis shows the non-normalized approach works better (see `docs/distribution_normalization_analysis.md`)
- Available normalization methods: similarity_range, composite_zscore, percentile_recentering
- Can be enabled with `ENABLE_DISTRIBUTION_NORMALIZATION=true` for experimentation
- When enabled, visible in debug tab with before/after comparison tooltips
- Non-normalized approach preserves natural semantic relationships and linguistic authenticity
**Comprehensive Caching:**
- Vocabulary, frequency, and embeddings cached for performance
- WordNet clue caching to avoid redundant semantic lookups
- Model cache shared across service instances
### Environment Configuration
**Python Backend (Production):**
```bash
NODE_ENV=production
PORT=7860
CACHE_DIR=/app/cache
THEMATIC_VOCAB_SIZE_LIMIT=100000
THEMATIC_MODEL_NAME=all-mpnet-base-v2
ENABLE_DEBUG_TAB=true
ENABLE_DISTRIBUTION_NORMALIZATION=false # Default: disabled for better semantic authenticity
PYTHONPATH=/app/crossword-app/backend-py
PYTHONUNBUFFERED=1
```
**Frontend Development:**
```bash
VITE_API_BASE_URL=http://localhost:7860 # Points to Python backend
```
**Key Configuration Options:**
- `CACHE_DIR`: Directory for model cache, embeddings, and vocabulary files
- `THEMATIC_VOCAB_SIZE_LIMIT`: Maximum vocabulary size (default 100K)
- `ENABLE_DEBUG_TAB`: Enable debug visualization in frontend
- `THEMATIC_MODEL_NAME`: Sentence transformer model (default all-mpnet-base-v2)
- `ENABLE_DISTRIBUTION_NORMALIZATION`: Enable distribution normalization (default false - see analysis doc)
- `NORMALIZATION_METHOD`: Normalization method - similarity_range, composite_zscore, percentile_recentering (default similarity_range)
**Multi-Topic Intersection Configuration:**
- `MULTI_TOPIC_METHOD`: Multi-topic intersection method - soft_minimum, geometric_mean, harmonic_mean, averaging (default: soft_minimum)
- `SOFT_MIN_BETA`: Initial beta parameter for soft minimum method (default: 10.0)
- `SOFT_MIN_ADAPTIVE`: Enable adaptive beta mechanism for automatic threshold adjustment (default: true)
- `SOFT_MIN_MIN_WORDS`: Minimum words required before relaxing beta parameter (default: 15)
- `SOFT_MIN_MAX_RETRIES`: Maximum adaptive beta retries before giving up (default: 5)
- `SOFT_MIN_BETA_DECAY`: Beta decay factor per retry attempt (default: 0.7)
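Reading these settings with their documented defaults might look like this (a sketch; the actual config loading lives in the service code):

```python
import os

def load_soft_min_config(env=None):
    """Collect MULTI_TOPIC_METHOD / SOFT_MIN_* settings with the
    defaults documented above. `env` defaults to os.environ."""
    env = os.environ if env is None else env
    return {
        "method": env.get("MULTI_TOPIC_METHOD", "soft_minimum"),
        "beta": float(env.get("SOFT_MIN_BETA", "10.0")),
        "adaptive": env.get("SOFT_MIN_ADAPTIVE", "true").lower() == "true",
        "min_words": int(env.get("SOFT_MIN_MIN_WORDS", "15")),
        "max_retries": int(env.get("SOFT_MIN_MAX_RETRIES", "5")),
        "beta_decay": float(env.get("SOFT_MIN_BETA_DECAY", "0.7")),
    }
```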
### Performance Notes
**Python Backend:**
- **Startup**: ~30-60 seconds (model download + cache creation)
- **Memory**: ~500MB-1GB (sentence-transformers + embeddings + vocabulary)
- **Response Time**: ~200-500ms (word generation + clue creation + grid fitting)
- **Cache Creation**: WordFreq vocabulary + embeddings generation is main startup bottleneck
- **Disk Usage**: ~500MB for full model cache (vocabulary, embeddings, models)
**Frontend:**
- **Development**: Hot reload with Vite (~200ms)
- **Build Time**: ~10-30 seconds for production build
- **Bundle Size**: Optimized with Vite tree-shaking
**Deployment:**
- Docker build time: ~5-10 minutes (includes frontend build + Python deps)
- Container size: ~1.5GB (includes ML models and dependencies)
- Hugging Face Spaces deployment: Automatic on git push
## Implementation Guidelines
### Development Priorities
- **No static word files** - All word/clue generation must be dynamic using AI services
- **No inference API solutions** - Use local model inference for better control and performance
- **Always run unit tests** after fixing bugs to ensure functionality
- **ThematicWordService is primary** - VectorSearchService is deprecated/unused
- **No fallback to static templates** - Application requires AI dependencies for core functionality
### Current Architecture Status
- ✅ **Fully AI-powered**: WordFreq + sentence-transformers + WordNet
- ✅ **Dynamic word generation**: 100K+ vocabulary with semantic filtering
- ✅ **Intelligent difficulty**: Percentile-based frequency classification
- ✅ **Multi-topic intersection**: Soft minimum method with adaptive beta for semantic quality
- ✅ **Performance optimized**: 40x speedup through vectorized operations
- ✅ **Debug visualization**: Optional debug tab for development/analysis
- ✅ **Comprehensive caching**: Models, embeddings, and vocabulary cached for performance
- ✅ **Modern stack**: FastAPI + React with Docker deployment ready
- The model cache lives in the repository root `cache-dir/` folder; every program in the `hack/` folder should use it as the cache dir when loading sentence-transformer models.