# 🔧 Troubleshooting Guide

## Common Issues & Solutions

### 1. Setup Issues

#### "ModuleNotFoundError: No module named 'streamlit'"

**Problem**: Dependencies not installed

**Solution**:
```bash
source .venv/bin/activate
pip install -r requirements.txt
```

#### "python3: command not found"

**Problem**: Python not installed or not on the PATH

**Solution**:
```bash
# Install Python 3.8+
# macOS: brew install python3
# Ubuntu/Debian: sudo apt install python3
# Windows: download from python.org

# Verify:
python3 --version
```

#### "virtualenv not found"

**Problem**: venv module missing

**Solution**:
```bash
# On macOS, venv ships with Python (brew install python3 includes it).
# On Ubuntu/Debian, install it separately:
sudo apt install python3-venv

# Then recreate the venv:
python3 -m venv .venv
```

---

### 2. Dataset Building Issues

#### "No article URLs found"

**Problem**: Website structure changed or connection failed

**Solution**:
```bash
# Check the internet connection
ping community.sap.com

# Try rebuilding with debug output
python tools/build_dataset.py

# Check that the data directory exists
ls -la data/
```

#### "Connection timeout"

**Problem**: Website taking too long to respond

**Solution**:
```bash
# Increase the timeout in tools/build_dataset.py:
# Change: timeout=10
# To:     timeout=30

# Or add a delay between requests (in the Python code):
#   import time
#   time.sleep(5)
```

#### "Permission denied" error

**Problem**: Can't write to the data directory

**Solution**:
```bash
# Fix permissions
mkdir -p data
chmod 755 data/

# Or run with sudo (not recommended)
sudo python tools/build_dataset.py
```

---
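The timeout advice above can be combined with retries: instead of failing on the first slow response, wait and try again with an increasing delay. A minimal sketch of that pattern — `fetch_with_retry` is a hypothetical helper, not a function in `tools/build_dataset.py`; the actual request call is injected so the idea stays library-agnostic:

```python
import time

def fetch_with_retry(fetch, url, retries=3, base_delay=5.0, sleep=time.sleep):
    """Call fetch(url); on error, wait and retry with exponential backoff.

    `fetch` is whatever callable performs the request (e.g. a wrapper
    around requests.get(url, timeout=30)). Injecting it keeps the
    retry logic testable without a network connection.
    """
    last_err = None
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception as err:  # in practice, catch the scraper's timeout error
            last_err = err
            if attempt < retries - 1:
                sleep(base_delay * (2 ** attempt))  # waits 5s, 10s, 20s, ...
    raise last_err
```

Exponential backoff also plays nicer with the rate limits mentioned later in this guide than a fixed `time.sleep(5)`.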
### 3. Embeddings/Index Issues

#### "ModuleNotFoundError: No module named 'faiss'"

**Problem**: FAISS not installed correctly

**Solution**:
```bash
pip uninstall faiss-cpu
pip install faiss-cpu --no-cache-dir

# Or use the GPU version if available:
# pip install faiss-gpu
```

#### "CUDA error" / "GPU not found"

**Problem**: GPU version installed but no GPU available

**Solution**:
```bash
# Use the CPU version instead
pip uninstall faiss-gpu
pip install faiss-cpu
```

#### "MemoryError during embeddings"

**Problem**: System ran out of memory

**Solution**:
```python
# In tools/embeddings.py, reduce the batch size:
# Change: batch_size=32
# To:     batch_size=8 (or 4)

# And keep a compact embeddings model:
# model_name="all-MiniLM-L6-v2" is already one of the smallest;
# avoid switching to larger models (e.g. all-mpnet-base-v2)
# on memory-constrained machines.
```

#### "Index not found" error

**Problem**: RAG index not built

**Solution**:
```bash
# Rebuild the index
python tools/embeddings.py

# Verify the files exist
ls -la data/rag_index.faiss
ls -la data/rag_metadata.pkl
```

---

### 4. LLM Provider Issues

#### Ollama

**"ConnectionRefusedError: [Errno 111] Connection refused"**
```bash
# The Ollama server is not running.
# Start it in a new terminal:
ollama serve

# Or background it with nohup:
nohup ollama serve &
```

**"Model not found"**
```bash
# Pull the model first:
ollama pull mistral

# Or:
ollama pull neural-chat
ollama pull dolphin-mixtral

# List available models:
ollama list
```

**"Out of memory"**
```bash
# Use a smaller model:
ollama pull neural-chat

# Or configure it in config.py:
# DEFAULT_MODEL = "neural-chat"
```

#### Replicate

**"REPLICATE_API_TOKEN not set"**
```bash
# Set the token in your terminal:
export REPLICATE_API_TOKEN="your_token_here"

# Or add it to .env:
# REPLICATE_API_TOKEN=your_token_here

# Verify:
echo $REPLICATE_API_TOKEN
```

**"401 Unauthorized"**
```bash
# The token is invalid or expired:
# 1. Get a new token from https://replicate.com/account
# 2. Update the environment variable
# 3. Try again
```
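Both token problems above ("not set" and "401 Unauthorized") are easier to diagnose if the app fails fast at startup with a clear message instead of surfacing a provider error mid-request. A small sketch of that check — `require_token` is a hypothetical helper, not part of this project's code:

```python
import os

def require_token(name: str) -> str:
    """Fail fast with a clear message when a provider token is missing."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set. Run `export {name}=...` or add it to .env."
        )
    return value
```

Note this only catches *missing* tokens; an *invalid* token still produces a 401 from the provider and needs to be regenerated as described above.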
**"Rate limit exceeded"**
```bash
# Wait a bit, then try again
# Or use Ollama/HuggingFace instead
```

#### HuggingFace

**"HF_API_TOKEN not set"**
```bash
# Set the token:
export HF_API_TOKEN="your_token_here"

# Or add it to .env:
# HF_API_TOKEN=your_token_here

# Verify:
echo $HF_API_TOKEN
```

**"Model not found" on HuggingFace**
```bash
# Verify the model ID exists:
# Go to https://huggingface.co/models
# and find a text-generation model,
# e.g. mistralai/Mistral-7B-Instruct-v0.1

# Update the config:
# LLM_MODEL="mistralai/Mistral-7B-Instruct-v0.1"
```

---

### 5. Streamlit Issues

#### "streamlit: command not found"

**Problem**: Streamlit not installed

**Solution**:
```bash
source .venv/bin/activate
pip install "streamlit>=1.28.0"
```

#### Port 8501 already in use

**Problem**: Another app is using port 8501

**Solution**:
```bash
# Use a different port:
streamlit run app.py --server.port 8502

# Or kill the process using 8501:
lsof -i :8501   # See what's using it
kill -9 <PID>   # Kill it
```

#### "Cache resource initialization failed"

**Problem**: Session state issue

**Solution**:
```bash
# Clear the Streamlit cache:
rm -rf ~/.streamlit/cache/

# Restart the app:
streamlit run app.py
```

#### App not responding / frozen

**Problem**: Long-running operation blocking the UI

**Solution**:
```bash
# Wait for the current operation to complete, or restart:
# 1. Press Ctrl+C
# 2. Run `streamlit run app.py` again
```

---

### 6. Runtime Issues

#### "Empty search results"

**Problem**: No relevant documents found

**Solution**:
```bash
# 1. Verify the dataset exists:
ls -la data/sap_dataset.json

# 2. Verify the index exists:
ls -la data/rag_index.faiss

# 3. Try a more specific query:
#    "SAP Basis administration" instead of "help"

# 4. Rebuild the dataset and index:
python tools/build_dataset.py
python tools/embeddings.py
```
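Steps 1 and 2 above — checking that the data artifacts exist — can be done from Python as well, which makes it easy to show a helpful message in the app instead of an empty result. A sketch under the assumption that these three files are what the app needs (`missing_artifacts` is a hypothetical helper):

```python
import os

# Assumed artifact list, matching the paths used elsewhere in this guide
REQUIRED_FILES = [
    "data/sap_dataset.json",
    "data/rag_index.faiss",
    "data/rag_metadata.pkl",
]

def missing_artifacts(paths=REQUIRED_FILES):
    """Return the required files that don't exist yet."""
    return [p for p in paths if not os.path.exists(p)]
```

If the returned list is non-empty, the fix is the rebuild commands shown above.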
#### "Very slow responses"

**Problem**: LLM taking too long

**Solution**:
```python
# Use a faster model in config.py:
DEFAULT_MODEL = "neural-chat"  # smaller models respond noticeably faster

# Or use a cloud provider (usually faster):
LLM_PROVIDER = "replicate"
```

#### "Inaccurate or irrelevant answers"

**Problem**: RAG not finding good sources, or LLM quality

**Solution**:
```python
# 1. Improve RAG: in config.py, retrieve more sources:
RAG_TOP_K = 10  # up from 5

# 2. Use better embeddings:
EMBEDDINGS_MODEL = "all-mpnet-base-v2"  # better quality, more memory

# 3. Use a better LLM:
DEFAULT_MODEL = "mistral"  # instead of neural-chat

# 4. Rebuild the index:
#    python tools/embeddings.py
```

#### "API rate limit exceeded"

**Problem**: Using a cloud provider too frequently

**Solution**:
```bash
# 1. Wait a bit
# 2. Use Ollama (no rate limits)
# 3. Or try a different cloud provider
```

---

### 7. Configuration Issues

#### "Settings not taking effect"

**Problem**: Configuration changes not applied

**Solution**:
```bash
# 1. Make sure you edited the right file:
cat .env

# 2. Restart the app (Ctrl+C, then run again)

# 3. Clear the cache:
rm -rf ~/.streamlit/cache/
streamlit run app.py
```

#### "Environment variables not loading"

**Problem**: .env file not being read

**Solution**:
```python
# Verify in app.py or config.py:
from dotenv import load_dotenv
load_dotenv()  # must be called before reading os.environ

# Or set the variable manually before launching:
#   export VAR_NAME="value"
#   streamlit run app.py
```

---
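To see why `load_dotenv()` sometimes appears to "do nothing", it helps to know its precedence rule: by default, variables already exported in the shell win over values in `.env`. A minimal sketch of that behavior (python-dotenv does this properly; this toy loader just illustrates the rule):

```python
import os

def load_env_file(path=".env"):
    """Toy .env loader illustrating python-dotenv's default precedence:
    already-exported environment variables are NOT overwritten."""
    if not os.path.exists(path):
        return
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blanks, comments, and malformed lines
            key, _, value = line.partition("=")
            # setdefault only sets the value if the key is absent
            os.environ.setdefault(key.strip(), value.strip().strip('"'))
```

So if a setting in `.env` seems ignored, check whether the same variable is already exported in the shell that launched Streamlit.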
### 8. Performance Issues

#### "High CPU usage"

**Problem**: Embeddings or search consuming CPU

**Solution**:
```python
# Embedding is batched in tools/embeddings.py (batch_size=32),
# which is already reasonably efficient.
# Reuse the pre-built index — don't rebuild it more often than needed.
```

#### "High memory usage"

**Problem**: Large dataset or model in memory

**Solution**:
```python
# Use a lighter embeddings model in config.py:
EMBEDDINGS_MODEL = "all-MiniLM-L6-v2"

# Reduce the chunk size:
RAG_CHUNK_SIZE = 256  # down from 512

# Use a smaller Ollama model:
#   ollama pull neural-chat
```

#### "Slow search"

**Problem**: FAISS search taking too long

**Solution**:
```python
# FAISS search should already be fast, but:
# 1. Reduce the number of results:
RAG_TOP_K = 3  # down from 5

# 2. Check whether the index is corrupted — rebuild it:
#    python tools/embeddings.py
```

---

### 9. Deployment Issues

#### Streamlit Cloud deployment fails

**Problem**: Missing secrets or dependencies

**Solution**:
```bash
# 1. Add secrets in Streamlit Cloud (Settings → Secrets):
#    LLM_PROVIDER=replicate
#    REPLICATE_API_TOKEN=xxx
# 2. Make sure requirements.txt is in the repo
# 3. Commit the data files, or download them on deploy
# 4. Check the build logs: Deploy → Manage app → Logs
```

#### Docker container issues

**Problem**: Can't build or run the Docker image

**Solution**:
```bash
# Create a Dockerfile if one doesn't exist, then:

# Build:
docker build -t sap-chatbot .

# Run:
docker run -p 8501:8501 sap-chatbot
```

---
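Several memory fixes above ("MemoryError during embeddings", "High memory usage") come down to the same idea: process the text chunks in smaller batches so fewer embeddings sit in memory at once. A sketch of the batching loop — the `model.encode` / `index.add` usage in the comment is an assumption about how `tools/embeddings.py` is structured, not its actual code:

```python
def batched(items, batch_size=32):
    """Yield fixed-size slices; a smaller batch_size lowers peak memory."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# In tools/embeddings.py the usage would look roughly like (assumption):
#   for batch in batched(chunks, batch_size=8):
#       vectors = model.encode(batch)
#       index.add(vectors)
```

Halving `batch_size` roughly halves the transient memory used per encode call, at the cost of more calls.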
### 10. Data Issues

#### "Dataset is outdated"

**Problem**: Knowledge base needs a refresh

**Solution**:
```bash
# Rebuild the dataset:
rm data/sap_dataset.json
python tools/build_dataset.py
python tools/embeddings.py

# Takes 10-15 minutes, but fetches the latest content
```

#### "Too much data (slow startup)"

**Problem**: Large dataset causing slow startup

**Solution**:
```python
# Limit the dataset in tools/build_dataset.py:
# Change: for repo in repos        (all repos)
# To:     for repo in repos[:10]   (first 10 only)

# Or reduce the number of sources scraped
```

#### "Data format error"

**Problem**: JSON file corrupted

**Solution**:
```bash
# Verify the JSON:
python -c "import json; json.load(open('data/sap_dataset.json'))"

# If that errors, rebuild:
rm data/sap_dataset.json
python tools/build_dataset.py
```

---

## Quick Diagnosis

### System Check Script

```bash
#!/bin/bash
echo "SAP Chatbot System Check"
echo "========================"

echo ""
echo "1. Python:"
python3 --version

echo ""
echo "2. Virtual Environment:"
if [ -d ".venv" ]; then echo "✅ Exists"; else echo "❌ Missing"; fi

echo ""
echo "3. Dependencies:"
pip list | grep -E "streamlit|transformers|faiss|ollama"

echo ""
echo "4. Dataset:"
ls -lh data/sap_dataset.json 2>/dev/null || echo "❌ Not found"

echo ""
echo "5. Index:"
ls -lh data/rag_index.faiss 2>/dev/null || echo "❌ Not found"

echo ""
echo "6. .env file:"
[ -f ".env" ] && echo "✅ Exists" || echo "❌ Missing"

echo ""
echo "7. Ollama:"
curl -s http://localhost:11434/ > /dev/null && echo "✅ Running" || echo "❌ Not running"

echo ""
echo "Check complete!"
```

Save as `check_system.sh` and run:
```bash
bash check_system.sh
```

---

## Getting Help

1. **Check this guide** - Most issues are documented here
2. **Read GETTING_STARTED.md** - Step-by-step setup
3. **Check README.md** - Architecture & concepts
4. **Check config.py** - All configuration options
5. **Look at the code** - Well-commented Python files
6. **Open a GitHub issue** - Report bugs with details

---

## Debug Mode

Enable debug logging:

```python
# In app.py or any module:
import logging

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)
logger.debug("Debug message here")
```

Then run:
```bash
streamlit run app.py --logger.level=debug
```

---

**Still stuck?** Check the GitHub issues or create a new one with:

- Python version
- OS (Windows/Mac/Linux)
- Error message (full traceback)
- Steps to reproduce
- What you've already tried

Good luck! 🚀