Commit 7913c8c (verified) by pankajmathur
1 Parent(s): 2e0af51

Upload reports.md

Files changed (1)
  1. report/reports.md +151 -0
report/reports.md ADDED
@@ -0,0 +1,151 @@
+ # nanochat training report
+
+ Generated: 2025-12-08 13:02:11
+
+ ## Environment
+
+ ### Git Information
+ - Branch: pankaj_dev
+ - Commit: 3289b19 (dirty)
+ - Message: Adjust device_batch_size in run_d34_finetune.sh from 4 to 6 for mid-training and
+
+ ### Hardware
+ - Platform: Linux
+ - CPUs: 240 cores (240 logical)
+ - Memory: 1771.7 GB
+ - GPUs: 8x NVIDIA A100-SXM4-80GB
+ - GPU Memory: 634.0 GB total
+ - CUDA Version: 12.8
+ - Hourly Rate: $14.32/hour
+
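Reviewer note: the hourly rate above can be turned into a rough cost figure for any span of wall-clock time. The sketch below is illustrative only. It assumes the per-section `timestamp` values in this report mark when each stage finished, which the report does not state, and `cost_between` is a hypothetical helper, not part of nanochat.

```python
from datetime import datetime

# Hourly rate copied from the Hardware section of this report.
HOURLY_RATE_USD = 14.32
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def cost_between(start: str, end: str, rate: float = HOURLY_RATE_USD) -> float:
    """Return the node cost in USD for the wall-clock span from start to end."""
    elapsed = datetime.strptime(end, TIMESTAMP_FORMAT) - datetime.strptime(
        start, TIMESTAMP_FORMAT
    )
    return elapsed.total_seconds() / 3600 * rate


# Span from the Midtraining timestamp to the "Chat evaluation sft" timestamp.
# Assumption: both timestamps are stage-completion times.
cost = cost_between("2025-12-08 04:41:26", "2025-12-08 05:42:31")
print(f"${cost:.2f}")
```

Under that assumption the span covers the mid evaluation, SFT, and the SFT evaluation, but not midtraining itself, whose start time is not recorded here.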
+ ### Software
+ - Python: 3.10.12
+ - PyTorch: 2.8.0+cu128
+
+
+ ### Bloat
+ - Characters: 446,068
+ - Lines: 10,895
+ - Files: 53
+ - Tokens (approx): 111,517
+ - Dependencies (uv.lock lines): 2,218
+
+ Run started: 2025-12-08 13:02:14
+
+ ---
+
+ ## Midtraining
+ timestamp: 2025-12-08 04:41:26
+
+ - run: d34_finetune
+ - device_type:
+ - dtype: bfloat16
+ - num_iterations: -1
+ - max_seq_len: 2048
+ - device_batch_size: 4
+ - unembedding_lr: 0.0040
+ - embedding_lr: 0.2000
+ - matrix_lr: 0.0200
+ - init_lr_frac: 1.0000
+ - weight_decay: 0.0000
+ - eval_every: 150
+ - eval_tokens: 10,485,760
+
+ - total_batch_size: 524,288
+ - dry_run: 0
+ - Number of iterations: 810
+ - DDP world size: 8
+ - Minimum validation bpb: 0.3282
+
+
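Reviewer note: the batch settings above imply a gradient accumulation factor and a total token count for the stage. The sketch below works through that arithmetic. It assumes `total_batch_size` is counted in tokens per optimizer step and that each micro-step processes `device_batch_size * max_seq_len` tokens per GPU; `grad_accum_steps` is a hypothetical helper written for this note, not a nanochat function.

```python
# Values copied from the Midtraining section of this report.
total_batch_size = 524_288  # assumed to be tokens per optimizer step
device_batch_size = 4       # sequences per GPU per micro-step
max_seq_len = 2048          # tokens per sequence
ddp_world_size = 8          # number of GPUs
num_iterations = 810        # reported "Number of iterations"


def grad_accum_steps(total: int, per_device: int, seq_len: int, world: int) -> int:
    """Micro-steps needed per optimizer step to reach the target batch size."""
    tokens_per_micro_step = per_device * seq_len * world
    if total % tokens_per_micro_step != 0:
        raise ValueError("total batch size must be a multiple of one micro-step")
    return total // tokens_per_micro_step


accum = grad_accum_steps(
    total_batch_size, device_batch_size, max_seq_len, ddp_world_size
)
total_tokens = num_iterations * total_batch_size

print(f"gradient accumulation steps: {accum}")
print(f"tokens seen during midtraining: {total_tokens:,}")
```

With these inputs one micro-step covers 65,536 tokens, so eight micro-steps make up each optimizer step, and 810 iterations correspond to roughly 425M tokens.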
+ ## Chat evaluation mid
+ timestamp: 2025-12-08 05:06:16
+
+ - source: mid
+ - task_name: None
+ - dtype: bfloat16
+ - temperature: 0.0000
+ - max_new_tokens: 512
+ - num_samples: 1
+ - top_k: 50
+ - batch_size: 8
+ - model_tag: None
+ - step: None
+ - max_problems: None
+ - device_type:
+ - ARC-Easy: 0.6961
+ - ARC-Challenge: 0.5367
+ - MMLU: 0.4229
+ - GSM8K: 0.1137
+ - HumanEval: 0.1098
+ - SpellingBee: 0.9961
+ - ChatCORE metric: 0.4045
+
+
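Reviewer note: the report does not say how the ChatCORE metric is derived from the six task accuracies. One reading that reproduces the reported 0.4045 is a mean of baseline-centered accuracies, sketched below. The baselines (0.25 for the multiple-choice tasks, 0.0 for the open-ended ones) and the `chatcore` helper are assumptions made for this note, not values taken from the report.

```python
# Accuracies copied from the "Chat evaluation mid" section of this report.
MID_ACCURACY = {
    "ARC-Easy": 0.6961,
    "ARC-Challenge": 0.5367,
    "MMLU": 0.4229,
    "GSM8K": 0.1137,
    "HumanEval": 0.1098,
    "SpellingBee": 0.9961,
}

# Assumed random-guess baselines: 4-way multiple choice scores 0.25 by chance,
# open-ended generation tasks score 0.0.
BASELINE = {
    "ARC-Easy": 0.25,
    "ARC-Challenge": 0.25,
    "MMLU": 0.25,
    "GSM8K": 0.0,
    "HumanEval": 0.0,
    "SpellingBee": 0.0,
}


def chatcore(accuracy: dict, baseline: dict) -> float:
    """Mean of accuracies rescaled so chance maps to 0 and perfect maps to 1."""
    centered = [
        (accuracy[task] - baseline[task]) / (1.0 - baseline[task])
        for task in accuracy
    ]
    return sum(centered) / len(centered)


print(f"ChatCORE (recomputed): {chatcore(MID_ACCURACY, BASELINE):.4f}")
```

Centering keeps the multiple-choice tasks from inflating the average simply because guessing already earns 25%.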
+ ## Chat SFT
+ timestamp: 2025-12-08 05:18:08
+
+ - run: d34_finetune
+ - source: mid
+ - device_type:
+ - dtype: bfloat16
+ - device_batch_size: 4
+ - num_epochs: 1
+ - num_iterations: -1
+ - target_examples_per_step: 32
+ - unembedding_lr: 0.0040
+ - embedding_lr: 0.2000
+ - matrix_lr: 0.0200
+ - weight_decay: 0.0000
+ - init_lr_frac: 0.0200
+ - eval_every: 100
+ - eval_steps: 100
+ - eval_metrics_every: 200
+ - eval_metrics_max_problems: 1024
+ - Training rows: 22,439
+ - Number of iterations: 701
+ - Training loss: 0.4230
+ - Validation loss: 0.8044
+
+
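Reviewer note: the reported 701 iterations line up with one epoch over 22,439 rows at 32 examples per step. The sketch below checks that arithmetic. It assumes the SFT run used the same 8 GPUs reported for midtraining (this section does not list a world size) and that a trailing partial batch is dropped; both are assumptions, not facts stated in the report.

```python
# Values copied from the Chat SFT section of this report.
training_rows = 22_439
target_examples_per_step = 32
device_batch_size = 4
num_epochs = 1

# Assumption: same 8-GPU DDP setup as the Midtraining section.
world_size = 8

# Examples processed by all GPUs in a single forward/backward pass.
examples_per_micro_step = device_batch_size * world_size

# Micro-steps accumulated before each optimizer step.
sft_accum = target_examples_per_step // examples_per_micro_step

# Floor division: a final partial batch is assumed to be dropped.
sft_iterations = num_epochs * (training_rows // target_examples_per_step)

print(f"gradient accumulation steps: {sft_accum}")
print(f"iterations for {num_epochs} epoch(s): {sft_iterations}")
```

Under these assumptions no gradient accumulation is needed, since 4 examples on each of 8 GPUs already reaches the 32-example target.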
+ ## Chat evaluation sft
+ timestamp: 2025-12-08 05:42:31
+
+ - source: sft
+ - task_name: None
+ - dtype: bfloat16
+ - temperature: 0.0000
+ - max_new_tokens: 512
+ - num_samples: 1
+ - top_k: 50
+ - batch_size: 8
+ - model_tag: None
+ - step: None
+ - max_problems: None
+ - device_type:
+ - ARC-Easy: 0.7210
+ - ARC-Challenge: 0.5418
+ - MMLU: 0.4304
+ - GSM8K: 0.1327
+ - HumanEval: 0.1037
+ - SpellingBee: 1.0000
+ - ChatCORE metric: 0.4157
+
+
+ ## Summary
+
+
+ - Characters: 440,256
+ - Lines: 10,727
+ - Files: 52
+ - Tokens (approx): 110,064
+ - Dependencies (uv.lock lines): 2,218
+
+ | Metric | BASE | MID | SFT | RL |
+ |-----------------|----------|----------|----------|----------|
+ | ARC-Challenge | - | 0.5367 | 0.5418 | - |
+ | ARC-Easy | - | 0.6961 | 0.7210 | - |
+ | GSM8K | - | 0.1137 | 0.1327 | - |
+ | HumanEval | - | 0.1098 | 0.1037 | - |
+ | MMLU | - | 0.4229 | 0.4304 | - |
+ | ChatCORE | - | 0.4045 | 0.4157 | - |
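
Reviewer note: the MID and SFT columns are easier to compare as per-metric differences. The sketch below computes SFT minus MID for each row of the table above; `RESULTS` and `sft_minus_mid` are names introduced for this note only.

```python
# (MID, SFT) pairs copied from the summary table in this report.
RESULTS = {
    "ARC-Challenge": (0.5367, 0.5418),
    "ARC-Easy": (0.6961, 0.7210),
    "GSM8K": (0.1137, 0.1327),
    "HumanEval": (0.1098, 0.1037),
    "MMLU": (0.4229, 0.4304),
    "ChatCORE": (0.4045, 0.4157),
}


def sft_minus_mid(results: dict) -> dict:
    """Return the SFT - MID difference per metric, rounded to 4 decimals."""
    return {metric: round(sft - mid, 4) for metric, (mid, sft) in results.items()}


deltas = sft_minus_mid(RESULTS)
for metric, delta in deltas.items():
    print(f"{metric:<15} {delta:+.4f}")

# Metrics where SFT scored below the midtraining checkpoint.
regressions = [metric for metric, delta in deltas.items() if delta < 0]
print("regressed after SFT:", regressions)
```

On these numbers HumanEval is the only metric that drops after SFT; every other row improves.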