---
license: apache-2.0
datasets:
- tatsu-lab/alpaca
language:
- en
base_model:
- distilbert/distilgpt2
pipeline_tag: text-generation
library_name: transformers
tags:
- chatbot
- instruction-tuning
- distilgpt2
- alpaca
- transformers
- fine-tuned
- offline
- flask
---
# 🧠 My Fine-Tuned Local Chatbot

A locally hosted AI chatbot powered by a fine-tuned **DistilGPT2** model using Hugging Face Transformers. The chatbot is trained on the **Stanford Alpaca** instruction dataset, enabling it to follow instructions and provide helpful responses, all without relying on internet access.

---

## 🚀 Features

- 🔒 **Fully local** – no internet required after setup  
- 🧠 **Fine-tuned on Stanford Alpaca-style instructions**  
- ⚡ **Fast inference** with CUDA or CPU fallback (see the loading sketch below)  
- 🌐 **Flask API** with a simple HTML/CSS/JavaScript frontend  
- 🎨 **Customizable prompts** and response formatting  
- 🧾 **Chat history** saved using SQLite  
- 📚 **Dataset**: [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca)
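
The CUDA/CPU fallback comes down to a device check at load time. Here is a minimal sketch using the standard `transformers` API; the timestamped model directory name is illustrative, not the repo's actual path:

```
# Load the fine-tuned model with CUDA if available, else fall back to CPU.
# MODEL_DIR is illustrative; point it at your actual trained_model_* directory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "chatbot_model/trained_model_20250101_120000"  # hypothetical timestamp

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR).to(device)
model.eval()  # inference mode
```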

---

## 📂 Project Structure

```
├── app.py                  # Flask API backend
├── train.py                # Script for fine-tuning the model
├── chatbot_model/
│   └── trained_model_*     # Your fine-tuned model directory
├── static/
│   ├── styles.css          # Frontend styles
│   └── script.js           # Frontend logic
├── templates/
│   └── index.html          # Web UI
├── requirements.txt        # Python dependencies
├── README.md               # You are here!
├── download.py             # Downloads the Alpaca dataset
├── preprocess.py           # Prepares processed_dataset.csv
├── int_db.py               # Initializes the SQLite database
├── chat_history.db         # Saved chat history
└── processed_dataset.csv   # Preprocessed training data
```
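
`int_db.py` sets up `chat_history.db`. A plausible minimal version, assuming a single messages table; the actual schema in the repo may differ:

```
# Hypothetical sketch of int_db.py: creates chat_history.db with a simple
# messages table. The real schema may differ.
import sqlite3

conn = sqlite3.connect("chat_history.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS messages (
           id        INTEGER PRIMARY KEY AUTOINCREMENT,
           role      TEXT NOT NULL,              -- 'human' or 'assistant'
           content   TEXT NOT NULL,
           timestamp TEXT DEFAULT CURRENT_TIMESTAMP
       )"""
)
conn.commit()
conn.close()
```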

---

## Demo 

![image/png](https://cdn-uploads.huggingface.co/production/uploads/686acf52adf0f81a9ece24c7/vqXOBxgQKlMwSMEiatZlK.png)

---

## 💡 Sample Prompt

> **Human**: What is the capital of France?  
> **Assistant**: The capital of France is Paris.
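
Replies like the one above come from wrapping the user's message in a Human/Assistant template before generation. A sketch, reusing the `tokenizer`, `model`, and `device` from the loading example above; the exact template and sampling settings in `app.py` may differ:

```
# Format the prompt and generate a reply; settings here are illustrative.
user_message = "What is the capital of France?"
prompt = f"Human: {user_message}\nAssistant:"

inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.7,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 defines no pad token
)
# Decode only the newly generated tokens, not the echoed prompt.
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
).strip()
print(reply)
```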

---

## πŸ‹οΈβ€β™€οΈ Training
```
python train.py
```

This will:

- Download the Stanford Alpaca dataset
- Fine-tune `distilgpt2`
- Save the result to `chatbot_model/trained_model_YYYYMMDD_HHMMSS/`
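
For reference, a condensed sketch of what a script like this does, using the `datasets` and `transformers` Trainer APIs; the real `train.py` hyperparameters and prompt template may differ:

```
# Condensed, hypothetical sketch of train.py; the repo's actual
# hyperparameters and prompt template may differ.
from datetime import datetime

from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

# Flatten each Alpaca record into a single Human/Assistant training string.
dataset = load_dataset("tatsu-lab/alpaca", split="train")
dataset = dataset.map(
    lambda ex: {"text": f"Human: {ex['instruction']}\nAssistant: {ex['output']}"}
)
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=256),
    batched=True,
    remove_columns=dataset.column_names,
)

# Timestamped output directory, matching the trained_model_* pattern above.
out_dir = f"chatbot_model/trained_model_{datetime.now():%Y%m%d_%H%M%S}"
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir=out_dir, num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
trainer.save_model(out_dir)
```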

---

## 🖥️ Run the App
```
python app.py
```
Then visit: http://localhost:5005
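
The backend is a small Flask app. A hypothetical minimal shape, reusing the loaded `tokenizer`, `model`, and `device` from the sketches above; the actual routes and payloads in `app.py` may differ:

```
# Hypothetical minimal shape of app.py; real route names and payloads
# in the repo may differ. Assumes tokenizer/model/device are loaded as above.
from flask import Flask, jsonify, render_template, request

app = Flask(__name__)

@app.route("/")
def index():
    return render_template("index.html")  # templates/index.html

@app.route("/chat", methods=["POST"])
def chat():
    user_message = request.json["message"]
    prompt = f"Human: {user_message}\nAssistant:"
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    outputs = model.generate(**inputs, max_new_tokens=100,
                             pad_token_id=tokenizer.eos_token_id)
    reply = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                             skip_special_tokens=True).strip()
    return jsonify({"response": reply})

if __name__ == "__main__":
    app.run(port=5005)  # matches the URL above
```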

---

## ❓ FAQ

**Q: Does this work offline?**  
✅ Yes! Once the model is fine-tuned, no internet is needed.

**Q: Can I run it on CPU?**  
✅ Yes, but it will be slower. A CUDA GPU is recommended for faster responses.

**Q: Can I replace the model?**  
✅ Yes! You can fine-tune any Hugging Face-compatible model by modifying `train.py`.
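
Swapping the base checkpoint is typically a one-line change in the training script; `gpt2-medium` below is just an illustration:

```
# In train.py, point both the tokenizer and model at another checkpoint.
# "gpt2-medium" is illustrative; any causal LM on the Hub should work.
tokenizer = AutoTokenizer.from_pretrained("gpt2-medium")
model = AutoModelForCausalLM.from_pretrained("gpt2-medium")
```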

---

## 🛠️ Tech Stack
- Python – Core language
- Flask – Web server backend
- Transformers – Hugging Face inference
- PyTorch – Deep learning engine
- HTML/CSS/JavaScript – Frontend
- Stanford Alpaca Dataset – Training data
- SQLite – Chat history storage

---

## 📜 License
Apache 2.0 – Free to use, modify, and share.

---

## 👩‍💻 Author
Jasleen Kaur Matharoo  
📧 [email protected]  
🌐 GitHub @Jasleen-05