DeryFerd committed on
Commit 3565d57 · verified · 1 Parent(s): 25e9c60

Update README.md

Files changed (1)
  1. README.md +144 -45
README.md CHANGED
@@ -1,142 +1,241 @@
  ---
  library_name: transformers
  tags:
  - phi-2
  - code-generation
- - math
- - reasoning
- - gsm8k
  - mbpp
  - finetuned
  datasets:
  - google-research-datasets/mbpp
- - gsm8k
- - meta-math/MATH
  language:
  - en
  base_model:
  - microsoft/phi-2
  pipeline_tag: text-generation
  ---

- # Model Card for DeryFerd/Qwen-Math-Code-Distill-Phi-2

  ## Model Details

  ### Model Description

- **UPDATE:** This model is a fine-tuned, versatile version of **`microsoft/phi-2`**, adapted for both **Python code generation** and **step-by-step mathematical reasoning**. The goal of this project was to distill the capabilities of larger "teacher" models (`Qwen2.5-Coder-7B-Instruct` for coding and `Qwen2.5-Math-7B-Instruct` for math) into the compact and efficient Phi-2 architecture.

- The model was trained on a combined dataset of Python programming problems (from MBPP) and grade-school math word problems (from GSM8K and MATH). It is designed to generate not just answers, but also the thought process behind them, mimicking the style of its teachers.

  - **Developed by:** DeryFerd
  - **Model type:** Causal Language Model
  - **Language(s) (NLP):** English
  - **License:** MIT
  - **Finetuned from model:** `microsoft/phi-2`

  ### Model Sources

- - **Repository:** [https://huggingface.co/DeryFerd/Qwen-Math-Code-Distill-Phi-2](https://huggingface.co/DeryFerd/Qwen-Math-Code-Distill-Phi-2)

  ## Uses

  ### Direct Use

- This model is intended for direct use in generating Python functions from natural language and solving math word problems with step-by-step explanations. It can be used as a coding/math assistant, for educational purposes, or for rapid prototyping.

- **Intended Use:**
- * Generating Python functions from docstrings or natural language instructions.
- * Solving math problems while showing the reasoning process.

  ### Out-of-Scope Use

- This is a specialized model. It will not perform well on tasks outside of basic Python code and grade-school level math, such as general conversation, translation, or creative writing. It has not been trained or evaluated for safety and may produce incorrect or insecure code, as well as flawed mathematical reasoning.

  ## Bias, Risks, and Limitations

- This model was trained on the MBPP, GSM8K, and MATH datasets. Its capabilities are limited to these domains. The model may generate code that is syntactically correct but logically flawed, or math solutions that seem logical but contain calculation errors. **Always review and test the generated output before use in production environments.**

  A notable limitation discovered during development is a potential **low-level GPU memory conflict**. When this model is loaded into the same runtime as a significantly larger and architecturally different model (like Qwen 7B), its fine-tuned capabilities can be silently overridden, causing it to revert to the base model's behavior. It is recommended to run this model in an isolated process.

  ## How to Get Started with the Model

  Use the code below to get started with the model using the `transformers` library.

  ```python
  from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer

- model_id = "DeryFerd/Qwen-Math-Code-Distill-Phi-2"

  # Load the tokenizer and model
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
-     model_id,
-     torch_dtype="auto",
-     device_map="auto",
-     trust_remote_code=True
  )

  # Create a text-generation pipeline
  pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

- # --- Example 1: Coding ---
- code_instruction = "Write a Python function that takes a list of strings and returns a new list with all strings converted to uppercase."
- prompt = f"Instruct: {code_instruction.strip()}\nOutput:"

- outputs = pipe(
-     prompt,
-     max_new_tokens=256,
-     do_sample=False,
-     pad_token_id=tokenizer.eos_token_id
- )
- response = outputs[0]['generated_text'].split("Output:")[1].strip()
- print("--- Coding Example ---")
- print(response)

- # --- Example 2: Math ---
- math_instruction = "A bakery has 150 cookies. They sell 60 in the morning and 35 in the afternoon. How many cookies are left at the end of the day?"
- prompt = f"Instruct: {math_instruction.strip()}\nOutput:"

  outputs = pipe(
-     prompt,
-     max_new_tokens=512,
-     do_sample=False,
-     pad_token_id=tokenizer.eos_token_id
  )
  response = outputs[0]['generated_text'].split("Output:")[1].strip()
- print("\n--- Math Example ---")
  print(response)

  ## Training Details

  ### Training Data

- The model was fine-tuned on a combined dataset of **3,474 instruction-response pairs**:
- - **2,500 math problems:** A mix of 2,000 samples from the GSM8K dataset and 500 samples from the MATH dataset. Generated using `Qwen2.5-Math-7B-Instruct`.
- - **974 coding problems:** A curated subset of the MBPP dataset. Generated using `Qwen2.5-Coder-7B-Instruct`.

  ### Training Procedure

  The model was fine-tuned using the LoRA (Low-Rank Adaptation) method for parameter-efficient fine-tuning (PEFT).

  #### Training Hyperparameters

  - **Framework:** `trl.SFTTrainer`
  - **LoRA `r`:** 16
  - **LoRA `alpha`:** 32
  - **Target Modules:** `q_proj`, `k_proj`, `v_proj`, `dense`
  - **Learning Rate:** 2e-4
- - **LR Scheduler:** Constant
  - **Epochs:** 3
  - **Batch Size:** 1 (with gradient accumulation of 8)
  - **Optimizer:** Paged AdamW 8-bit

  ### Compute Infrastructure

  - **Hardware Type:** Single NVIDIA T4 GPU
  - **Cloud Provider:** Kaggle Notebooks

  ## Citation

- If you use this model, please consider citing the original Phi-2, MBPP, GSM8K, and MATH papers.

  ---
  library_name: transformers
  tags:
  - phi-2
  - code-generation
  - mbpp
  - finetuned
  datasets:
  - google-research-datasets/mbpp
  language:
  - en
  base_model:
  - microsoft/phi-2
  pipeline_tag: text-generation
  ---

+ # Model Card for DeryFerd/phi2-finetuned-mbpp-clean

  ## Model Details

  ### Model Description

+ This model is a fine-tuned version of **`microsoft/phi-2`**, specifically adapted for Python code generation tasks. It was trained on a high-quality, curated subset of the **MBPP (Mostly Basic Python Programming)** dataset.

+ The primary goal of this project was to distill the coding style and capabilities of a larger "teacher" model (`Qwen/Qwen2.5-Coder-7B-Instruct`) into the much more compact and efficient Phi-2 architecture. The model is designed to generate Python functions based on natural language instructions, often including explanations and test cases in its output.

  - **Developed by:** DeryFerd
  - **Model type:** Causal Language Model
  - **Language(s) (NLP):** English
  - **License:** MIT
  - **Finetuned from model:** `microsoft/phi-2`

  ### Model Sources

+ - **Repository:** [https://huggingface.co/DeryFerd/phi2-finetuned-mbpp-clean](https://huggingface.co/DeryFerd/phi2-finetuned-mbpp-clean)

  ## Uses

  ### Direct Use

+ This model is intended for direct use in generating Python code snippets, particularly for creating standalone functions based on a descriptive prompt. It can be used for educational purposes, as a coding assistant, or for rapid prototyping.

+ **Intended Use:** Generating Python functions from docstrings or natural language instructions.
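
Prompts use the same plain `Instruct:` / `Output:` template as the base Phi-2 model. As a quick sketch (the instruction string here is only an illustration; the full runnable example is in "How to Get Started with the Model" below):

```python
# Hypothetical instruction; any natural-language task description works.
instruction = "Write a Python function that reverses a string."

# The prompt template used by every example in this card.
prompt = f"Instruct: {instruction.strip()}\nOutput:"
print(prompt)
```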

  ### Out-of-Scope Use

+ This is a specialized model. It will not perform well on tasks outside of Python code generation, such as general conversation, translation, or creative writing. It has not been trained or evaluated for safety and may produce incorrect or insecure code.

  ## Bias, Risks, and Limitations

+ This model was trained on the MBPP dataset, which consists of basic programming problems. Its capabilities are limited to this domain. The model may generate code that is syntactically correct but logically flawed. **Always review and test the generated code before use in production environments.**
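
One lightweight way to follow that advice is to run a generated function against a few hand-written checks before accepting it. This is an illustrative pattern only: the generated source, function name, and checks below are hypothetical stand-ins, and `exec` provides no sandboxing, so always read the code first.

```python
# Stand-in for model output; in practice, use `response` from the usage
# example below. The function name and checks are hypothetical.
generated_src = '''
def to_uppercase(strings):
    return [s.upper() for s in strings]
'''

# Execute the generated code in a scratch namespace.
# NOTE: this runs untrusted code with no sandboxing -- inspect it first.
namespace = {}
exec(generated_src, namespace)

# A couple of quick sanity checks before trusting the function.
assert namespace["to_uppercase"](["a", "b"]) == ["A", "B"]
assert namespace["to_uppercase"]([]) == []
print("checks passed")
```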

  A notable limitation discovered during development is a potential **low-level GPU memory conflict**. When this model is loaded into the same runtime as a significantly larger and architecturally different model (like Qwen 7B), its fine-tuned capabilities can be silently overridden, causing it to revert to the base model's behavior. It is recommended to run this model in an isolated process.
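
A minimal sketch of that isolation advice, assuming only the standard-library `multiprocessing` module and the `transformers` pipeline shown in the next section (the prompt string is illustrative):

```python
import multiprocessing as mp

def generate(prompt, queue):
    # Import inside the child so the GPU is only touched in this process.
    from transformers import pipeline
    pipe = pipeline(
        "text-generation",
        model="DeryFerd/phi2-finetuned-mbpp-clean",
        device_map="auto",
        trust_remote_code=True,
    )
    out = pipe(prompt, max_new_tokens=256, do_sample=False)
    queue.put(out[0]["generated_text"])

if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # "spawn" avoids inheriting CUDA state
    queue = ctx.Queue()
    proc = ctx.Process(
        target=generate,
        args=("Instruct: Write a Python function that adds two numbers.\nOutput:", queue),
    )
    proc.start()
    print(queue.get())
    proc.join()
```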

  ## How to Get Started with the Model

  Use the code below to get started with the model using the `transformers` library.

  ```python
  from transformers import pipeline, AutoModelForCausalLM, AutoTokenizer

+ model_id = "DeryFerd/phi2-finetuned-mbpp-clean"

  # Load the tokenizer and model
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype="auto",
+     device_map="auto",
+     trust_remote_code=True
  )

  # Create a text-generation pipeline
  pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

+ # Define your instruction
+ instruction = "Write a Python function that takes a list of strings and returns a new list with all strings converted to uppercase."
+ prompt = f"Instruct: {instruction.strip()}\nOutput:"

+ # Generate the response
  outputs = pipe(
+     prompt,
+     max_new_tokens=256,
+     do_sample=False,
+     pad_token_id=tokenizer.eos_token_id
  )

  response = outputs[0]['generated_text'].split("Output:")[1].strip()
  print(response)
+ ```
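
The example decodes greedily (`do_sample=False`), so the output is deterministic for a given prompt; if longer functions come back truncated, raise `max_new_tokens`. The `split("Output:")` step simply strips the echoed prompt from the generated text.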

  ## Training Details

  ### Training Data

+ The model was fine-tuned on `mbpp_974_final.jsonl`, a curated dataset containing 974 high-quality instruction-response pairs for Python programming problems, derived from the MBPP dataset. The responses were generated using `Qwen/Qwen2.5-Coder-7B-Instruct`.
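
Each line of that file is a single JSON object. The exact schema is not documented here, so the field names below are an assumption for illustration only:

```python
import json

# Hypothetical record shape for mbpp_974_final.jsonl; the card only states
# that the file holds instruction-response pairs.
record = {
    "instruction": "Write a function to find the shared elements from the given two lists.",
    "response": "def similar_elements(a, b):\n    return list(set(a) & set(b))",
}
print(json.dumps(record))
```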

  ### Training Procedure

  The model was fine-tuned using the LoRA (Low-Rank Adaptation) method for parameter-efficient fine-tuning (PEFT); a sketch of how the settings below map onto `peft` and `trl` follows the list.

  #### Training Hyperparameters

  - **Framework:** `trl.SFTTrainer`
  - **LoRA `r`:** 16
  - **LoRA `alpha`:** 32
  - **Target Modules:** `q_proj`, `k_proj`, `v_proj`, `dense`
  - **Learning Rate:** 2e-4
+ - **LR Scheduler:** Cosine
  - **Epochs:** 3
  - **Batch Size:** 1 (with gradient accumulation of 8)
  - **Optimizer:** Paged AdamW 8-bit
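
A minimal sketch of how these settings could be expressed with `peft` and `trl`. This is not the exact training script: the output directory is arbitrary, the dataset formatting step is assumed, and depending on your `trl` version you may need `TrainingArguments` instead of `SFTConfig`.

```python
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

base_id = "microsoft/phi-2"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, trust_remote_code=True)

# LoRA settings from the hyperparameter list above.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)

# Optimisation settings from the list above.
args = SFTConfig(
    output_dir="phi2-mbpp-lora",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    optim="paged_adamw_8bit",
)

# Assumes each record is mapped to a single "text" field beforehand
# (e.g. "Instruct: ...\nOutput: ...").
train_dataset = load_dataset("json", data_files="mbpp_974_final.jsonl")["train"]

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    peft_config=peft_config,
)
trainer.train()
```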

  ### Compute Infrastructure

  - **Hardware Type:** Single NVIDIA T4 GPU
  - **Cloud Provider:** Kaggle Notebooks

  ## Citation

+ If you use this model, please consider citing the original Phi-2 and MBPP papers.