| # Contributing | |
| Thanks for wanting to contribute! This repository enforces strict CI and formatting policies to keep code consistent, with special emphasis on memory-efficient development for cloud deployment. | |
| ## 🧠 Memory-Constrained Development Guidelines | |
| This project is optimized for deployment on Render's free tier (512MB RAM limit). All contributions must consider memory usage as a primary constraint. | |
| ### Memory Development Principles | |
| 1. **Memory-First Design**: Consider memory impact of every code change | |
| 2. **Lazy Loading**: Initialize services only when needed | |
| 3. **Resource Cleanup**: Always clean up resources in finally blocks or context managers | |
| 4. **Memory Testing**: Test changes in memory-constrained environments | |
| 5. **Monitoring Integration**: Add memory tracking to new services | |
| ### Memory-Aware Code Guidelines | |
| **✅ DO - Memory-Efficient Patterns:** | |
| ```python | |
| from functools import lru_cache | |
| from src.utils.memory_utils import MemoryManager | |
| # Use context managers for resource cleanup | |
| with MemoryManager() as mem: | |
|     # Memory-intensive operations | |
|     embeddings = process_large_dataset(data) | |
| # Automatic cleanup on exit | |
| # Implement lazy loading for expensive services | |
| @lru_cache(maxsize=1) | |
| def get_expensive_service(): | |
|     return ExpensiveService()  # Created on first call, then cached | |
| # Use generators for large data processing | |
| def process_documents(documents): | |
|     for doc in documents: | |
|         yield process_single_document(doc)  # Memory-efficient iteration | |
| ``` | |
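The cleanup pattern above relies on the project's `MemoryManager`, whose internals are not shown in this guide. As a rough illustration of the idea only, the following standard-library sketch releases temporary objects in a `finally` block. `scoped_buffers` is a hypothetical name and is not part of this repository.

```python
import gc
from contextlib import contextmanager
from typing import Any, Dict, Iterator


@contextmanager
def scoped_buffers() -> Iterator[Dict[str, Any]]:
    """Yield a dict for temporary large objects and release them on exit.

    Hypothetical helper for illustration; not the project's MemoryManager.
    """
    buffers: Dict[str, Any] = {}
    try:
        yield buffers
    finally:
        # Runs even if the body raises: drop references, then collect cycles.
        buffers.clear()
        gc.collect()


# Usage: anything stored in `buffers` is released when the block exits.
with scoped_buffers() as buffers:
    buffers["rows"] = [str(i) for i in range(10_000)]
    total_chars = sum(len(row) for row in buffers["rows"])
```

Keeping large intermediates inside one container makes the cleanup explicit: once the block exits, no reference to them should remain unless the body copied one elsewhere.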
| **❌ DON'T - Memory-Wasteful Patterns:** | |
| ```python | |
| # Don't load all data into memory at once | |
| all_embeddings = [embed(doc) for doc in all_documents] # Memory spike | |
| # Don't create multiple instances of expensive services | |
| service1 = ExpensiveMLModel() | |
| service2 = ExpensiveMLModel() # Duplicates memory usage | |
| # Don't keep large objects in global scope | |
| GLOBAL_LARGE_DATA = load_entire_dataset() # Always consumes memory | |
| ``` | |
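One alternative to building the full list of embeddings at once is to work through the documents in fixed-size batches and hand each batch off (for example, to storage) before starting the next. The sketch below assumes a caller-supplied `embed_batch` callable; `batched` and `embed_in_batches` are illustrative names, not project APIs.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def embed_in_batches(
    documents: Iterable[T],
    embed_batch: Callable[[List[T]], List[R]],  # assumed: embeds one batch
    batch_size: int = 32,
) -> Iterator[List[R]]:
    """Yield embeddings one batch at a time so only one batch is in memory."""
    for batch in batched(documents, batch_size):
        yield embed_batch(batch)
```

Python 3.12 ships `itertools.batched`, but since CI enforces Python >=3.10, a small local helper like this avoids depending on the newer version.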
| ## 🛠️ Recommended Local Setup | |
| We recommend using `pyenv` + `venv` to create a reproducible development environment. A helper script `dev-setup.sh` is included to automate the steps: | |
| ```bash | |
| # Run the helper script (default Python version can be overridden) | |
| ./dev-setup.sh 3.11.4 | |
| source venv/bin/activate | |
| # Install development dependencies and pre-commit hooks | |
| pip install -r dev-requirements.txt | |
| pre-commit install | |
| ``` | |
| ### Memory-Constrained Testing Environment | |
| **Test your changes in a memory-limited environment:** | |
| ```bash | |
| # Cap the shell's virtual address space to approximate Render's 512MB RAM limit. | |
| # This works on Linux; macOS may reject or ignore `ulimit -v`. | |
| ulimit -v 524288  # 512MB expressed in KB (512 * 1024) | |
| # Run your development server | |
| flask run | |
| # Test memory usage | |
| curl http://localhost:5000/health | jq '.memory_usage_mb' | |
| ``` | |
| ## 🧪 Development Workflow | |
| ### Before Opening a PR | |
| **Required Checks:** | |
| 1. **Code Quality**: `make format` and `make ci-check` | |
| 2. **Test Suite**: `pytest` (all 138 tests must pass) | |
| 3. **Pre-commit**: `pre-commit run --all-files` | |
| 4. **Memory Testing**: Verify memory usage stays within limits | |
| **Memory-Specific Testing:** | |
| ```bash | |
| # Test memory usage during development | |
| python -c " | |
| from src.app_factory import create_app | |
| from src.utils.memory_utils import MemoryManager | |
| app = create_app() | |
| with app.app_context(): | |
|     mem = MemoryManager() | |
|     print(f'App startup memory: {mem.get_memory_usage():.1f}MB') | |
|     # Should be ~50MB or less | |
| " | |
| # Test memory usage after the first request has triggered lazy loading | |
| curl -X POST http://localhost:5000/chat -H "Content-Type: application/json" \ | |
| -d '{"message": "test"}' && \ | |
| curl http://localhost:5000/health | jq '.memory_usage_mb' | |
| # Should be ~200MB or less | |
| ``` | |
| ### Memory Optimization Development Process | |
| 1. **Profile Before Changes**: Measure baseline memory usage | |
| 2. **Implement Changes**: Follow memory-efficient patterns | |
| 3. **Profile After Changes**: Verify memory impact is acceptable | |
| 4. **Load Test**: Validate performance under memory constraints | |
| 5. **Document Changes**: Update memory-related documentation | |
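The "profile before" and "profile after" steps can be approximated with the standard library's `tracemalloc`. The helper below is a minimal sketch, and `measure_peak_mb` is a hypothetical name. Note that `tracemalloc` only reports allocations made through Python's allocators, so memory held by native extensions may not appear; the RSS figure from `/health` or `psutil` remains the reference.

```python
import tracemalloc
from typing import Any, Callable, Tuple


def measure_peak_mb(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run `func` and return (result, peak traced memory in MB)."""
    already_tracing = tracemalloc.is_tracing()
    if not already_tracing:
        tracemalloc.start()
    tracemalloc.reset_peak()  # Ignore any peak recorded before this call
    try:
        result = func(*args, **kwargs)
        _current, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        if not already_tracing:
            tracemalloc.stop()
    return result, peak_bytes / (1024 * 1024)


# Compare a baseline against a candidate implementation.
_, eager_mb = measure_peak_mb(lambda: sum([i * i for i in range(200_000)]))
_, lazy_mb = measure_peak_mb(lambda: sum(i * i for i in range(200_000)))
print(f"list: {eager_mb:.2f}MB, generator: {lazy_mb:.2f}MB")
```

Running the same measurement on the baseline and on the changed code gives a like-for-like comparison to record in the PR description.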
| ### New Feature Development Guidelines | |
| **When Adding New ML Services:** | |
| ```python | |
| # Example: Adding a new ML service with memory management | |
| import logging | |
| from src.utils.memory_utils import MemoryManager | |
| logger = logging.getLogger(__name__) | |
| class NewMLService: | |
|     def __init__(self): | |
|         self._model = None  # Loaded lazily on first access | |
|     @property | |
|     def model(self): | |
|         if self._model is None: | |
|             with MemoryManager() as mem: | |
|                 logger.info(f"Loading model, current memory: {mem.get_memory_usage():.1f}MB") | |
|                 self._model = load_expensive_model() | |
|                 logger.info(f"Model loaded, current memory: {mem.get_memory_usage():.1f}MB") | |
|         return self._model | |
|     def process(self, data): | |
|         # Use the lazily-loaded model | |
|         return self.model.predict(data) | |
| ``` | |
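If the server runs with multiple threads, two requests arriving at the same time could both find the model unloaded and load it twice, briefly doubling memory. Whether that applies depends on the Gunicorn worker settings, which are not shown here. A minimal sketch of a lock-guarded variant, with `LazyResource` as a hypothetical name:

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyResource(Generic[T]):
    """Load a value on first access, at most once, even across threads."""

    def __init__(self, loader: Callable[[], T]) -> None:
        self._loader = loader
        self._value: Optional[T] = None
        self._loaded = False
        self._lock = threading.Lock()

    def get(self) -> T:
        # Fast path: skip the lock once the value exists.
        if not self._loaded:
            with self._lock:
                # Re-check inside the lock; another thread may have loaded it.
                if not self._loaded:
                    self._value = self._loader()
                    self._loaded = True
        return self._value  # type: ignore[return-value]
```

A service could hold `LazyResource(load_expensive_model)` and call `.get()` inside `process`, keeping the lazy behavior of the example above.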
| **Memory Testing for New Features:** | |
| ```python | |
| # Add to your test file | |
| import os | |
| import psutil | |
| def test_new_feature_memory_usage(): | |
|     """Test that the new feature doesn't exceed memory limits.""" | |
|     # Measure before | |
|     process = psutil.Process(os.getpid()) | |
|     memory_before = process.memory_info().rss / 1024 / 1024  # MB | |
|     # Execute new feature | |
|     your_new_feature() | |
|     # Measure after | |
|     memory_after = process.memory_info().rss / 1024 / 1024  # MB | |
|     memory_increase = memory_after - memory_before | |
|     # Assert memory increase is reasonable | |
|     assert memory_increase < 50, f"Memory increase {memory_increase:.1f}MB exceeds 50MB limit" | |
|     assert memory_after < 300, f"Total memory {memory_after:.1f}MB exceeds 300MB limit" | |
| ``` | |
| ## 🔧 CI Expectations | |
| **Automated Checks:** | |
| - **Code Quality**: Pre-commit hooks (black, isort, flake8) | |
| - **Test Suite**: All 138 tests must pass | |
| - **Memory Validation**: Memory usage checks during CI | |
| - **Performance Regression**: Response time validation | |
| - **Python Version**: Enforces Python >=3.10 | |
| **Memory-Specific CI Checks:** | |
| ```bash | |
| # CI pipeline includes memory validation | |
| pytest tests/test_memory_constraints.py # Memory usage tests | |
| pytest tests/test_performance.py # Response time validation | |
| pytest tests/test_resource_cleanup.py # Resource leak detection | |
| ``` | |
| ## 🚀 Deployment Considerations | |
| ### Render Platform Constraints | |
| **Resource Limits:** | |
| - **RAM**: 512MB total (200MB steady state, 312MB headroom) | |
| - **CPU**: 0.1 vCPU (I/O bound workload) | |
| - **Storage**: 1GB (current usage ~100MB) | |
| - **Network**: Unmetered (external API calls) | |
| **Performance Requirements:** | |
| - **Startup Time**: <30 seconds (lazy loading) | |
| - **Response Time**: <3 seconds for chat requests | |
| - **Memory Stability**: No memory leaks over 24+ hours | |
| - **Concurrent Users**: Support 20-30 simultaneous requests | |
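To make these figures concrete, the headroom can be turned into a rough per-request memory budget. The sketch below uses only the numbers listed above and assumes, as a simplification, that the whole headroom is shared evenly among in-flight requests. The function names are illustrative.

```python
RAM_LIMIT_MB = 512     # Render free tier limit listed above
STEADY_STATE_MB = 200  # Steady-state usage listed above


def headroom_mb(limit_mb: int = RAM_LIMIT_MB, steady_state_mb: int = STEADY_STATE_MB) -> int:
    """Memory left above steady state before hitting the limit."""
    return limit_mb - steady_state_mb


def per_request_budget_mb(
    concurrent_requests: int,
    limit_mb: int = RAM_LIMIT_MB,
    steady_state_mb: int = STEADY_STATE_MB,
) -> float:
    """Even share of the headroom per in-flight request (simplifying assumption)."""
    if concurrent_requests < 1:
        raise ValueError("concurrent_requests must be at least 1")
    return headroom_mb(limit_mb, steady_state_mb) / concurrent_requests


for users in (20, 30):
    print(f"{users} requests: {per_request_budget_mb(users):.1f}MB each")
```

Under that assumption, the 312MB of headroom leaves about 15.6MB per request at 20 concurrent requests and about 10.4MB at 30, which is a useful upper bound when reviewing per-request allocations.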
| ### Production Testing | |
| **Before Production Deployment:** | |
| ```bash | |
| # Test with production configuration | |
| export FLASK_ENV=production | |
| gunicorn -c gunicorn.conf.py app:app & | |
| # Load test with memory monitoring | |
| artillery run load-test.yml # Simulate concurrent users | |
| curl http://localhost:5000/health | jq '.memory_usage_mb' | |
| # Memory leak detection (run for 1+ hours) | |
| while true; do | |
|     curl -s http://localhost:5000/health | jq '.memory_usage_mb' | |
|     sleep 300  # Check every 5 minutes | |
| done | |
| ``` | |
| ## 📚 Additional Resources | |
| ### Memory Optimization References | |
| - **[Memory Utils Documentation](./src/utils/memory_utils.py)**: Comprehensive memory management utilities | |
| - **[App Factory Pattern](./src/app_factory.py)**: Lazy loading implementation | |
| - **[Gunicorn Configuration](./gunicorn.conf.py)**: Production server optimization | |
| - **[Design Documentation](./design-and-evaluation.md)**: Memory architecture decisions | |
| ### Development Tools | |
| ```bash | |
| # Memory profiling during development | |
| pip install memory-profiler | |
| python -m memory_profiler your_script.py | |
| # Real-time memory monitoring | |
| pip install psutil | |
| python -c " | |
| import psutil | |
| process = psutil.Process() | |
| print(f'Memory: {process.memory_info().rss / 1024 / 1024:.1f}MB') | |
| " | |
| ``` | |
| ## 🎯 Code Review Guidelines | |
| ### Memory-Focused Code Review | |
| **Review Checklist:** | |
| - [ ] Does the code follow lazy loading patterns? | |
| - [ ] Are expensive resources properly cleaned up? | |
| - [ ] Is memory usage tested and validated? | |
| - [ ] Are there any potential memory leaks? | |
| - [ ] Does the change impact startup memory? | |
| - [ ] Is caching used appropriately? | |
| **Memory Review Questions:** | |
| 1. "What is the memory impact of this change?" | |
| 2. "Could this cause a memory leak in long-running processes?" | |
| 3. "Is this resource initialized only when needed?" | |
| 4. "Are all expensive objects properly cleaned up?" | |
| 5. "How does this scale with concurrent users?" | |
| Thank you for contributing to memory-efficient, production-ready RAG development! Please open issues or PRs against `main` and follow these memory-conscious development practices. | |