Files changed (1)
  1. README.md +72 -60
README.md CHANGED
@@ -1,61 +1,73 @@
- ---
- library_name: peft
- license: apache-2.0
- base_model: Qwen/Qwen2.5-1.5B-Instruct
- tags:
- - llama-factory
- - generated_from_trainer
- model-index:
- - name: Qwen2.5-1.5B-Instruct-drill
-   results: []
- datasets:
- - agentlans/drill
- language:
- - en
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # Qwen2.5-1.5B-Instruct-drill
-
- This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) on the [agentlans/drill](https://huggingface.co/datasets/agentlans/drill) dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 5e-05
- - train_batch_size: 2
- - eval_batch_size: 8
- - seed: 42
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 16
- - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - num_epochs: 1.0
-
- ### Training results
-
-
-
- ### Framework versions
-
- - PEFT 0.15.0
- - Transformers 4.49.0
- - Pytorch 2.6.0+cu124
- - Datasets 3.4.1
+ ---
+ library_name: peft
+ license: apache-2.0
+ base_model: Qwen/Qwen2.5-1.5B-Instruct
+ tags:
+ - llama-factory
+ - generated_from_trainer
+ datasets:
+ - agentlans/drill
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ model-index:
+ - name: Qwen2.5-1.5B-Instruct-drill
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # Qwen2.5-1.5B-Instruct-drill
+
+ This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) on the [agentlans/drill](https://huggingface.co/datasets/agentlans/drill) dataset.
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 2
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 16
+ - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+ - lr_scheduler_type: cosine
+ - num_epochs: 1.0
+
+ ### Training results
+
+
+
+ ### Framework versions
+
+ - PEFT 0.15.0
+ - Transformers 4.49.0
+ - Pytorch 2.6.0+cu124
+ - Datasets 3.4.1
  - Tokenizers 0.21.0
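
For trying the updated card's model locally, here is a minimal `peft` + `transformers` inference sketch. The base model id comes from the card; the adapter repo id `agentlans/Qwen2.5-1.5B-Instruct-drill` is an assumption inferred from the model name, so substitute the actual adapter path or hub id:

```python
# Minimal inference sketch for the LoRA adapter described in the card above.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "Qwen/Qwen2.5-1.5B-Instruct"
ADAPTER = "agentlans/Qwen2.5-1.5B-Instruct-drill"  # hypothetical id; replace with the real adapter repo or local path

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto")
model = PeftModel.from_pretrained(base_model, ADAPTER)  # attach the LoRA weights to the base model

# Qwen2.5-Instruct is a chat model, so format the prompt with the chat template.
messages = [{"role": "user", "content": "Introduce yourself in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

If the adapter should ship without a `peft` dependency at inference time, `model.merge_and_unload()` folds the LoRA deltas into the base weights and returns a plain `transformers` model.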