# RAG Application Project Plan
This plan outlines the steps to design, build, and deploy a Retrieval-Augmented Generation (RAG) application per the project requirements, with a focus on achieving a grade of 5. The approach prioritizes early deployment and continuous integration and follows Test-Driven Development (TDD) principles.
## 1. Foundational Setup
- [x] **Repository:** Create a new GitHub repository.
- [x] **Virtual Environment:** Set up a local Python virtual environment (`venv`).
- [x] **Initial Files:**
- Create `requirements.txt` with initial dependencies (`Flask`, `pytest`).
- Create a `.gitignore` file for Python.
- Create a `README.md` with initial setup instructions.
- Create placeholder files: `deployed.md` and `design-and-evaluation.md`.
- [x] **Testing Framework:** Establish a `tests/` directory and configure `pytest`.
## 2. "Hello World" Deployment
- [ ] **Minimal App:** Develop a minimal Flask application (`app.py`) with a `/health` endpoint that returns a JSON status object (a sketch follows this list).
- [ ] **Unit Test:** Write a test for the `/health` endpoint to ensure it returns a `200 OK` status and the correct JSON payload.
- [ ] **Local Validation:** Run the app and tests locally to confirm everything works.
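A minimal sketch of the app, assuming the file names above; the `{"status": "ok"}` payload shape is an illustrative choice:

```python
# app.py -- minimal Flask app with a health-check endpoint.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    # Return a JSON status object with a 200 OK.
    return jsonify({"status": "ok"}), 200

if __name__ == "__main__":
    app.run(debug=True)
```

And a matching unit test using Flask's built-in test client:

```python
# tests/test_health.py -- unit test for the /health endpoint.
from app import app

def test_health_returns_ok():
    client = app.test_client()
    response = client.get("/health")
    assert response.status_code == 200
    assert response.get_json() == {"status": "ok"}
```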
## 3. CI/CD and Initial Deployment
- [ ] **Render Setup:** Create a new Web Service on Render and link it to the GitHub repository.
- [ ] **Environment Configuration:** Configure necessary environment variables on Render (e.g., `PYTHON_VERSION`).
- [ ] **GitHub Actions:** Create a CI/CD workflow (`.github/workflows/main.yml`), sketched after this list, that:
- Triggers on push/PR to the `main` branch.
- Installs dependencies from `requirements.txt`.
- Runs the `pytest` test suite.
- On success, triggers a deployment to Render.
- [ ] **Deployment Validation:** Push a change and verify that the workflow runs successfully and the application is deployed.
- [ ] **Documentation:** Update `deployed.md` with the live URL of the deployed application.
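A minimal workflow sketch under two assumptions not fixed by this plan: the Python version, and that Render's deploy hook URL is stored as a repository secret (the secret name `RENDER_DEPLOY_HOOK_URL` is illustrative):

```yaml
# .github/workflows/main.yml -- minimal CI/CD sketch.
name: CI/CD

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install -r requirements.txt
      - run: pytest

  deploy:
    needs: test
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      # POSTing to a Render deploy hook triggers a new deployment.
      - run: curl -fsSL -X POST "${{ secrets.RENDER_DEPLOY_HOOK_URL }}"
```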
## 4. Data Ingestion and Processing
- [ ] **Corpus Assembly:** Collect or generate 5-20 policy documents (PDF, TXT, MD) and place them in a `corpus/` directory.
- [ ] **Parsing Logic:** Implement and test functions to parse different document formats.
- [ ] **Chunking Strategy:** Implement and test a document chunking strategy (e.g., recursive character splitting with overlap; a simple sketch follows this list).
- [ ] **Reproducibility:** Set fixed seeds for any processes involving randomness (e.g., chunking, sampling) to ensure deterministic outcomes.
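A sketch of a fixed-size sliding-window chunker with overlap, a simpler stand-in for full recursive character splitting; the size defaults are illustrative, and the function is deterministic, so this step needs no seed:

```python
# chunking.py -- fixed-size chunking with overlap (illustrative defaults).
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```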
## 5. Embedding and Vector Storage
- [ ] **Vector DB Setup:** Integrate a vector database (e.g., ChromaDB) into the project.
- [ ] **Embedding Model:** Select and integrate a free embedding model (e.g., from HuggingFace).
- [ ] **Ingestion Pipeline:** Create a script (`ingest.py`), sketched after this list, that:
- Loads documents from the corpus.
- Chunks the documents.
- Embeds the chunks.
- Stores the embeddings in the vector database.
- [ ] **Testing:** Write tests to verify each step of the ingestion pipeline.
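A sketch of `ingest.py` using ChromaDB and a HuggingFace sentence-transformers model; the model name, collection name, paths, and the restriction to `.txt` files are illustrative choices, and `chunk_text` is the helper from the chunking sketch above:

```python
# ingest.py -- load, chunk, embed, and store the corpus (illustrative names).
from pathlib import Path

import chromadb
from sentence_transformers import SentenceTransformer

from chunking import chunk_text  # helper from the chunking sketch

def ingest(corpus_dir: str = "corpus", db_path: str = "chroma_db") -> None:
    model = SentenceTransformer("all-MiniLM-L6-v2")
    client = chromadb.PersistentClient(path=db_path)
    collection = client.get_or_create_collection(name="policies")
    for doc_path in sorted(Path(corpus_dir).glob("*.txt")):
        text = doc_path.read_text(encoding="utf-8")
        chunks = chunk_text(text)
        # Stable, deterministic IDs keep re-ingestion reproducible.
        ids = [f"{doc_path.stem}-{i}" for i in range(len(chunks))]
        embeddings = model.encode(chunks).tolist()
        collection.add(
            ids=ids,
            documents=chunks,
            embeddings=embeddings,
            metadatas=[{"source": doc_path.name}] * len(chunks),
        )

if __name__ == "__main__":
    ingest()
```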
## 6. RAG Core Implementation
- [ ] **Retrieval Logic:** Implement a function to retrieve the top-k relevant document chunks from the vector store based on a user query (see the sketch after this list).
- [ ] **Prompt Engineering:** Design a prompt template that injects the retrieved context into the query for the LLM.
- [ ] **LLM Integration:** Connect to a free-tier LLM (e.g., via OpenRouter or Groq) to generate answers.
- [ ] **Guardrails:** Implement and test guardrails:
- Refuse to answer questions outside the corpus.
- Limit the length of the generated output.
- Ensure all answers cite the source document IDs/titles.
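A sketch of retrieval and prompt assembly against the collection from the ingestion sketch; the prompt wording (which also encodes the refusal and citation guardrails) and the `top_k` default are illustrative:

```python
# rag.py -- top-k retrieval and context-injecting prompt (illustrative names).
import chromadb
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path="chroma_db")
collection = client.get_or_create_collection(name="policies")

PROMPT_TEMPLATE = """Answer the question using ONLY the context below.
If the answer is not in the context, say you cannot answer.
Cite the source document of every claim in [brackets].

Context:
{context}

Question: {question}
Answer:"""

def retrieve(question: str, top_k: int = 4) -> list[dict]:
    # Embed the query with the same model used at ingestion time.
    embedding = model.encode([question]).tolist()
    results = collection.query(query_embeddings=embedding, n_results=top_k)
    return [
        {"text": doc, "source": meta["source"]}
        for doc, meta in zip(results["documents"][0], results["metadatas"][0])
    ]

def build_prompt(question: str) -> str:
    chunks = retrieve(question)
    context = "\n\n".join(f"[{c['source']}] {c['text']}" for c in chunks)
    return PROMPT_TEMPLATE.format(context=context, question=question)
```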
## 7. Web Application Completion
- [ ] **Chat Interface:** Implement a simple web chat interface for the `/` endpoint.
- [ ] **API Endpoint:** Create the `/chat` API endpoint that receives user questions (POST) and returns model-generated answers with citations and snippets (a sketch follows this list).
- [ ] **UI/UX:** Ensure the web interface is clean, user-friendly, and handles loading/error states gracefully.
- [ ] **Testing:** Write end-to-end tests for the chat functionality.
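A sketch of the `/chat` endpoint; `generate_answer` is a hypothetical helper wrapping the retrieval and LLM calls from the RAG core step, and the request/response field names are illustrative:

```python
# app.py -- /chat endpoint sketch; generate_answer is a hypothetical helper
# that wraps the retrieval and LLM calls from the RAG core step.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.post("/chat")
def chat():
    payload = request.get_json(silent=True) or {}
    question = payload.get("question", "").strip()
    if not question:
        return jsonify({"error": "question is required"}), 400
    answer, citations = generate_answer(question)  # hypothetical RAG helper
    return jsonify({"answer": answer, "citations": citations})
```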
## 8. Evaluation
- [ ] **Evaluation Set:** Create an evaluation set of 15-30 questions and corresponding "gold" answers covering various policy topics.
- [ ] **Metric Implementation:** Develop scripts (see the latency sketch after this list) to calculate:
- **Answer Quality:** Groundedness and Citation Accuracy.
- **System Metrics:** Latency (p50/p95).
- [ ] **Execution:** Run the evaluation and record the results.
- [ ] **Documentation:** Summarize the evaluation results in `design-and-evaluation.md`.
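A sketch of the latency metric; `ask` is a caller-supplied function that sends one question through the deployed `/chat` endpoint and returns when the answer arrives:

```python
# evaluate.py -- p50/p95 latency over the evaluation set.
import statistics
import time
from typing import Callable

def measure_latency(ask: Callable[[str], str],
                    questions: list[str]) -> dict[str, float]:
    """Time each question end to end; report p50/p95 latency in seconds."""
    latencies = []
    for question in questions:
        start = time.perf_counter()
        ask(question)
        latencies.append(time.perf_counter() - start)
    # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
    return {
        "p50": statistics.median(latencies),
        "p95": statistics.quantiles(latencies, n=100)[94],
    }
```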
## 9. Final Documentation and Submission
- [ ] **Design Document:** Complete `design-and-evaluation.md`, justifying all major design choices (embedding model, chunking strategy, vector store, LLM, etc.).
- [ ] **README:** Finalize the `README.md` with comprehensive setup, run, and testing instructions.
- [ ] **Demonstration Video:** Record a 5-10 minute screen-share video demonstrating the deployed application, walking through the code architecture, explaining the evaluation results, and showing a successful CI/CD run.
- [ ] **Submission:** Share the GitHub repository with the grader and submit the repository and video links.