Upload 4 files

Files changed (3)

README.md +22 -23
media/english_evaluation.png +2 -2
media/french_evaluation.png +2 -2

README.md CHANGED

@@ -7,7 +7,7 @@ language:
 - fr
 - en
 base_model:
-- Qwen/Qwen3-0.6B
 pipeline_tag: text-generation
 ---
@@ -15,15 +15,15 @@ pipeline_tag: text-generation
 ---
-# Luth-0.6B-Instruct
-**Luth-0.6B-Instruct** is a French fine-tuned version of [Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B), trained on the [Luth-SFT](https://huggingface.co/datasets/kurakurai/luth-sft) dataset. The model has drastically improved its French capabilities in instruction following, math, and general knowledge. Additionally, its English capabilities have remained stable and have even increased in some areas.
 Our evaluation, training, and data scripts are available on [GitHub](https://github.com/kurakurai/Luth), along with the [blog post](https://huggingface.co/blog/MaxLSB/luth) we wrote.
 ## Model Details
-Luth was trained using full fine-tuning on the Luth-SFT dataset with [Axolotl](https://github.com/axolotl-ai-cloud/axolotl). The resulting model was then merged with the base Qwen3-0.6B model. This process successfully retained the model's English capabilities while improving its performance on nearly all selected benchmarks in both French and English.
 ## Benchmark Results
@@ -41,35 +41,34 @@ We used LightEval for evaluation, with custom tasks for the French benchmarks. T
 ### French Benchmark Scores
-| Benchmark         | Qwen3-0.6B       | Qwen2.5-0.5B-Instruct | Luth-0.6B-Instruct |
-|-------------------|------------------|-----------------------|-----------------|
-| ifeval-fr         | 44.45            | 22.18                 | <u>48.24</u>    |
-| gpqa-diamond-fr   | 28.93            | 23.86                 | <u>33.50</u>    |
-| mmlu-fr           | 27.16            | 35.04                 | <u>40.23</u>    |
-| math-500-fr       | 29.20            | 10.00                 | <u>43.00</u>    |
-| arc-chall-fr      | 31.31            | 28.23                 | <u>33.88</u>    |
-| hellaswag-fr      | 25.11            | <u>51.45</u>          | 45.70           |
 ### English Benchmark Scores
-| Benchmark         | Qwen3-0.6B       | Qwen2.5-0.5B-Instruct | Luth-0.6B-Instruct   |
-|-------------------|------------------|-----------------------|-----------------|
-| ifeval-en         | <u>57.86</u>     | 29.21                 | 53.97           |
-| gpqa-diamond-en   | <u>29.80</u>     | 26.77                 | 28.28           |
-| mmlu-en           | 36.85            | 43.80                 | <u>48.10</u>    |
-| math-500-en       | 45.00            | 31.80                 | <u>47.80</u>    |
-| arc-chall-en      | 33.62            | 32.17                 | <u>35.92</u>    |
-| hellaswag-en      | 42.91            | <u>49.56</u>          | 46.96           |
 ## Citation
 ```bibtex
 @misc{luth2025kurakurai,
-  title   = {Luth-0.6B-Instruct},
   author  = {Kurakura AI Team},
   year    = {2025},
   howpublished = {\url{https://huggingface.co/kurakurai/Luth-0.6B}},
-  note    = {Qwen3-0.6B fine-tuned on French datasets}
 }
 ```

 - fr
 - en
 base_model:
+- Qwen/Qwen3-1.7B
 pipeline_tag: text-generation
 ---
 ---
+# Luth-1.7B-Instruct
+**Luth-1.7B-Instruct** is a French fine-tuned version of [Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B), trained on the [Luth-SFT](https://huggingface.co/datasets/kurakurai/luth-sft) dataset. The model has drastically improved its French capabilities in instruction following, math, and general knowledge. Additionally, its English capabilities have remained stable and have even increased in some areas.
 Our evaluation, training, and data scripts are available on [GitHub](https://github.com/kurakurai/Luth), along with the [blog post](https://huggingface.co/blog/MaxLSB/luth) we wrote.
 ## Model Details
+Luth was trained using full fine-tuning on the Luth-SFT dataset with [Axolotl](https://github.com/axolotl-ai-cloud/axolotl). The resulting model was then merged with the base Qwen3-1.7B model. This process successfully retained the model's English capabilities while improving its performance on most selected benchmarks in both French and English.
 ## Benchmark Results
 ### French Benchmark Scores
+| Benchmark         | Qwen3-1.7B       | SmolLM2-1.7B-Instruct | Qwen2.5-1.5B-Instruct | Luth-1.7B-Instruct   |
+|-------------------|------------------|-----------------------|-----------------------|----------------------|
+| ifeval-fr         | 54.53            | 31.24                 | 32.90                 | <u>57.67</u>         |
+| gpqa-diamond-fr   | 26.90            | 21.83                 | 28.93                 | <u>38.58</u>         |
+| mmlu-fr           | 28.46            | 33.73                 | 46.25                 | <u>49.66</u>         |
+| math-500-fr       | 60.80            | 11.20                 | 32.20                 | <u>64.00</u>         |
+| arc-chall-fr      | 33.28            | 28.57                 | 32.68                 | <u>35.16</u>         |
+| hellaswag-fr      | 24.86            | <u>49.58</u>          | 34.34                 | 31.93                |
 ### English Benchmark Scores
+| Benchmark         | Qwen3-1.7B       | SmolLM2-1.7B-Instruct | Qwen2.5-1.5B-Instruct | Luth-1.7B-Instruct   |
+|-------------------|------------------|-----------------------|-----------------------|----------------------|
+| ifeval-en         | <u>68.39</u>     | 48.24                 | 39.93                 | 65.80                |
+| gpqa-diamond-en   | <u>31.82</u>     | 24.75                 | 30.30                 | <u>31.82</u>         |
+| mmlu-en           | 52.74            | 50.27                 | 59.81                 | <u>60.19</u>         |
+| math-500-en       | 69.20            | 22.40                 | 56.00                 | <u>70.00</u>         |
+| arc-chall-en      | 36.09            | <u>42.32</u>          | 41.04                 | 42.24                |
+| hellaswag-en      | 46.96            | <u>66.94</u>          | 64.48                 | 58.55                |
 ## Citation
 ```bibtex
 @misc{luth2025kurakurai,
+  title   = {Luth-1.7B-Instruct},
   author  = {Kurakura AI Team},
   year    = {2025},
   howpublished = {\url{https://huggingface.co/kurakurai/Luth-0.6B}},
+  note    = {Qwen3-1.7B fine-tuned on French datasets}
 }
 ```
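
The Model Details text says the fine-tuned model "was then merged with the base" model, but it does not state the merge method or ratio. As a minimal sketch only, assuming a plain linear interpolation of weights (the function `linear_merge`, the `alpha` parameter, and the toy weight dictionaries are hypothetical and not taken from the Luth scripts), the idea could look like this:

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def linear_merge(base: Weights, tuned: Weights, alpha: float = 0.5) -> Weights:
    """Interpolate two weight sets: (1 - alpha) * base + alpha * tuned.

    ASSUMPTION: linear interpolation is used here purely for illustration;
    the README does not say which merge method or ratio was applied.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if base.keys() != tuned.keys():
        raise ValueError("both models must have the same parameter names")

    merged: Weights = {}
    for name, base_values in base.items():
        tuned_values = tuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise blend of the base and fine-tuned values.
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


if __name__ == "__main__":
    base = {"layer.weight": [0.0, 2.0]}
    tuned = {"layer.weight": [2.0, 4.0]}
    print(linear_merge(base, tuned, alpha=0.5))
```

With `alpha=0.0` the result equals the base weights and with `alpha=1.0` it equals the fine-tuned weights; values in between trade off the two, which is one way a merge can keep base-model behavior while retaining fine-tuned gains.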

media/english_evaluation.png CHANGED

Git LFS Details

SHA256: 6fb245ae4cc5fc77e61726aca46f5e807a1e48324f08f5edcb10543a51c794dc
Pointer size: 131 Bytes
Size of remote file: 240 kB

Git LFS Details

SHA256: e1e3e819a2e7100058967dc4836702f61796616035de676e67a501ce6c6cd1a2
Pointer size: 131 Bytes
Size of remote file: 238 kB

media/french_evaluation.png CHANGED

Git LFS Details

SHA256: b270faf3bd7cb2567f05fc08f30886be7cedafb9db6ad4852e32efe21dc4c341
Pointer size: 131 Bytes
Size of remote file: 234 kB

Git LFS Details

SHA256: 8c1e4a7c40575bc9600c3a9b2735b2fdc417f5721c8767ccad3f94ee3e1dc4bf
Pointer size: 131 Bytes
Size of remote file: 238 kB