Jasleen05 committed on
Commit 6114bd2
·
verified ·
1 Parent(s): 1be89f7

Create README.md

---
license: apache-2.0
datasets:
- tatsu-lab/alpaca
language:
- en
metrics:
- perplexity
base_model:
- distilbert/distilgpt2
pipeline_tag: text-generation
library_name: transformers
tags:
- chatbot
- instruction-tuning
- distilgpt2
- alpaca
- transformers
- fine-tuned
- offline
- flask
---
# 🧠 My Fine-Tuned Local Chatbot

A locally hosted AI chatbot powered by a fine-tuned **DistilGPT2** model using Hugging Face Transformers. This chatbot is trained on the **Stanford Alpaca instruction dataset**, enabling it to follow instructions and provide helpful responses, all without relying on internet access.

---

## 🚀 Features

- 🔒 **Fully local** – no internet required after setup
- 🧠 **Fine-tuned on Stanford Alpaca-style instructions**
- ⚡ **Fast inference** with CUDA or CPU fallback
- 🌐 **Flask API** with a simple HTML/CSS/JavaScript frontend
- 🎨 **Customizable prompts** and response formatting
- 🧾 **Chat history** saved using SQLite

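The SQLite-backed chat history mentioned above can be sketched roughly as follows. The table name and columns here are illustrative assumptions; the real schema is created by `int_db.py`:

```python
import sqlite3


def init_db(path="chat_history.db"):
    # Create the history table if it does not exist (schema assumed for illustration)
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS history ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, role TEXT, message TEXT)"
    )
    conn.commit()
    return conn


def save_message(conn, role, message):
    # Append one turn of the conversation
    conn.execute("INSERT INTO history (role, message) VALUES (?, ?)", (role, message))
    conn.commit()


def load_history(conn):
    # Return all turns in chronological order
    return conn.execute("SELECT role, message FROM history ORDER BY id").fetchall()
```

Because SQLite ships with Python's standard library, this persistence layer needs no extra dependencies.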
---

## 📂 Project Structure

```
├── app.py                 # Flask API backend
├── train.py               # Script for fine-tuning the model
├── chatbot_model/
│   └── trained_model_*    # Your fine-tuned model directory
├── static/
│   ├── styles.css         # Frontend styles
│   └── script.js
├── templates/
│   └── index.html         # Web UI
├── requirements.txt
├── README.md              # You are here!
├── download.py
├── preprocess.py
├── int_db.py
├── chat_history.db        # Saves history of chats
└── processed_dataset.csv
```

---

## 💡 Sample Prompt

> **Human**: What is the capital of France?
> **Assistant**: The capital of France is Paris.

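Prompts in this Human/Assistant style are typically built from an Alpaca-format template before being passed to the model. The exact template used by `train.py` and `app.py` is not shown here, so the function below is an illustrative assumption based on the standard Alpaca format:

```python
def build_prompt(instruction, input_text=""):
    # Standard Alpaca-style template (the project's actual template may differ)
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


print(build_prompt("What is the capital of France?"))
```

The model then generates text after the `### Response:` marker, which the backend can trim out of the returned reply.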
---

## 🏋️‍♀️ Training
```
python train.py
```

This will:

- Download the Stanford Alpaca dataset
- Fine-tune distilgpt2
- Save the fine-tuned model inside `chatbot_model/trained_model_YYYYMMDD_HHMMSS/`

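The model card lists perplexity as its evaluation metric. Perplexity is simply the exponential of the mean token-level cross-entropy loss, so it can be computed directly from the eval loss reported during training:

```python
import math


def perplexity(mean_cross_entropy_loss: float) -> float:
    # Perplexity = exp(mean cross-entropy loss over tokens)
    return math.exp(mean_cross_entropy_loss)


# e.g. an eval loss of 3.0 corresponds to a perplexity of about 20.1
print(round(perplexity(3.0), 1))  # → 20.1
```

Lower eval loss therefore means lower perplexity, i.e. the model is less "surprised" by held-out text.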
---

## 🖥️ Run the App
```
python app.py
```
Then visit: http://localhost:5005

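A minimal sketch of the Flask backend might look like this. The route name, port, and JSON shape are assumptions for illustration; `app.py` holds the real implementation, which runs the fine-tuned model instead of echoing the input:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/chat", methods=["POST"])
def chat():
    # Placeholder: a real handler would run the fine-tuned model here
    user_message = request.get_json().get("message", "")
    return jsonify({"reply": f"Echo: {user_message}"})


if __name__ == "__main__":
    app.run(port=5005)
```

The frontend's `script.js` can then POST a JSON body like `{"message": "..."}` to this endpoint and render the `reply` field.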
---

## ❓ FAQ

**Q: Does this work offline?**
✅ Yes! Once the model is fine-tuned, no internet is needed.

**Q: Can I run it on CPU?**
✅ Yes, but it will be slower. A CUDA GPU is recommended for faster responses.

**Q: Can I replace the model?**
✅ Yes! You can fine-tune any Hugging Face-compatible model by modifying train.py.

---

## 🛠️ Tech Stack
- Flask – Web server backend
- Transformers – Hugging Face inference
- PyTorch – Deep learning engine
- HTML/CSS/JavaScript – Frontend
- Stanford Alpaca Dataset
- SQLite – For saving chat history
- Python

---

## 📜 License
Apache 2.0 License – Free to use, modify, and share.

---

## 👩‍💻 Author
Jasleen Kaur Matharoo
📧 [email protected]
🌐 GitHub: @Jasleen-05