Update README.md
README.md
CHANGED
@@ -24,7 +24,7 @@ Finetune on [huseinzol05/malaysian-dialect-qa](https://huggingface.co/datasets/h
 ## How we train
 
 1. GRPO full parameters.
-5. WanDB at https://wandb.ai/huseinzol05/fpf-Malaysian-Qwen2.5-7B-Reasoning-SFT-GRPO
+5. WanDB at https://wandb.ai/huseinzol05/fpf-Malaysian-Qwen2.5-7B-Reasoning-SFT-GRPO-v2
 
 Source code at https://github.com/mesolitica/malaya/blob/master/session/qwen2.5/7b-grpo-fsdp.sh
 