eaddario committed on
Commit 9619e54 · verified · 1 Parent(s): e168f80

Generate Perplexity, KLD, ARC, HellaSwag, MMLU, Truthful QA and WinoGrande scores
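For reference, scores like these are produced with llama.cpp's llama-perplexity tool: the "Final result" / "Random chance" lines come from its multiple-choice mode (used for ARC, MMLU and TruthfulQA), the "750 …% […%, …%]" lines from its HellaSwag mode, the Winogrande lines from its WinoGrande mode, and the .ppx files from its perplexity/KL-divergence comparison against a base model. A rough sketch of the kind of invocations involved follows; the 750-task counts match the logs, but the dataset file names and exact flag spellings below are assumptions from memory of the tool, not taken from this commit:

llama-perplexity -m Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf --multiple-choice --multiple-choice-tasks 750 -f arc-challenge-validation.bin   # ARC / MMLU / TruthfulQA (dataset file is a placeholder)
llama-perplexity -m Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf --hellaswag --hellaswag-tasks 750 -f hellaswag_val_full.txt                     # HellaSwag
llama-perplexity -m Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf --winogrande --winogrande-tasks 750 -f winogrande-debiased-eval.csv             # WinoGrande
llama-perplexity -m Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf -f eval-text.txt --kl-divergence-base base-logits-F16.kld                       # save base (F16) logits once
llama-perplexity -m Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf --kl-divergence-base base-logits-F16.kld --kl-divergence                      # perplexity + KLD of a quant vs the F16 base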

This view is limited to 50 files because it contains too many changes. See raw diff.
Files changed (50)
  1. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.arc +21 -0
  2. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.hsw +20 -0
  3. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.mmlu +21 -0
  4. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.tqa +21 -0
  5. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.wng +19 -0
  6. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.arc +21 -0
  7. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.hsw +20 -0
  8. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.mmlu +21 -0
  9. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.ppx +37 -0
  10. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.tqa +21 -0
  11. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.wng +19 -0
  12. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.arc +21 -0
  13. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.hsw +20 -0
  14. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.mmlu +21 -0
  15. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.ppx +37 -0
  16. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.tqa +21 -0
  17. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.wng +19 -0
  18. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.arc +21 -0
  19. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.hsw +20 -0
  20. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.mmlu +21 -0
  21. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.ppx +37 -0
  22. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.tqa +21 -0
  23. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.wng +19 -0
  24. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.arc +21 -0
  25. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.hsw +20 -0
  26. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.mmlu +21 -0
  27. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.ppx +37 -0
  28. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.tqa +21 -0
  29. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.wng +19 -0
  30. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.arc +21 -0
  31. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.hsw +20 -0
  32. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.mmlu +21 -0
  33. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.ppx +37 -0
  34. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.tqa +21 -0
  35. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.wng +19 -0
  36. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.arc +21 -0
  37. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.hsw +20 -0
  38. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.mmlu +21 -0
  39. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.ppx +37 -0
  40. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.tqa +21 -0
  41. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.wng +19 -0
  42. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.arc +21 -0
  43. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.hsw +20 -0
  44. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.mmlu +21 -0
  45. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.ppx +37 -0
  46. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.tqa +21 -0
  47. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.wng +19 -0
  48. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.arc +21 -0
  49. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.hsw +20 -0
  50. scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.mmlu +21 -0
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.arc ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+ Final result: 61.7333 +/- 1.7759
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 40261.05 ms
+ llama_perf_context_print: prompt eval time = 149462.93 ms / 36666 tokens ( 4.08 ms per token, 245.32 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 150294.55 ms / 36667 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.hsw ADDED
@@ -0,0 +1,20 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+ 750 79.73333333% [76.7082%, 82.4554%]
+
+
+ llama_perf_context_print: load time = 2970.89 ms
+ llama_perf_context_print: prompt eval time = 528861.23 ms / 129319 tokens ( 4.09 ms per token, 244.52 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 532325.38 ms / 129320 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.mmlu ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+ Final result: 40.8000 +/- 1.7958
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 2768.82 ms
+ llama_perf_context_print: prompt eval time = 276273.33 ms / 68956 tokens ( 4.01 ms per token, 249.59 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 277471.13 ms / 68957 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.tqa ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+ Final result: 37.4667 +/- 1.7686
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 2730.94 ms
+ llama_perf_context_print: prompt eval time = 213745.70 ms / 51053 tokens ( 4.19 ms per token, 238.85 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 215254.19 ms / 51054 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.wng ADDED
@@ -0,0 +1,19 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 72.4000 +/- 1.6334
+
+ llama_perf_context_print: load time = 2841.75 ms
+ llama_perf_context_print: prompt eval time = 91265.69 ms / 22541 tokens ( 4.05 ms per token, 246.98 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 91746.92 ms / 22542 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.arc ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+ Final result: 65.3333 +/- 1.7389
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 4590.43 ms
+ llama_perf_context_print: prompt eval time = 152258.69 ms / 36666 tokens ( 4.15 ms per token, 240.81 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 153118.42 ms / 36667 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.hsw ADDED
@@ -0,0 +1,20 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+ 750 80.00000000% [76.9876%, 82.7066%]
+
+
+ llama_perf_context_print: load time = 769.34 ms
+ llama_perf_context_print: prompt eval time = 538472.19 ms / 129319 tokens ( 4.16 ms per token, 240.16 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 542167.13 ms / 129320 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.mmlu ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+ Final result: 39.6000 +/- 1.7870
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 806.33 ms
+ llama_perf_context_print: prompt eval time = 283405.69 ms / 68956 tokens ( 4.11 ms per token, 243.31 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 284720.03 ms / 68957 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 25.645736 ± 0.214748
+ Mean PPL(base) : 20.457257 ± 0.161275
+ Cor(ln(PPL(Q)), ln(PPL(base))): 95.06%
+ Mean ln(PPL(Q)/PPL(base)) : 0.226040 ± 0.002601
+ Mean PPL(Q)/PPL(base) : 1.253625 ± 0.003260
+ Mean PPL(Q)-PPL(base) : 5.188479 ± 0.079257
+
+ ====== KL divergence statistics ======
+ Mean KLD: 0.286497 ± 0.002130
+ Maximum KLD: 27.942949
+ 99.9% KLD: 11.911683
+ 99.0% KLD: 3.204565
+ 99.0% KLD: 3.204565
+ Median KLD: 0.139660
+ 10.0% KLD: 0.006744
+ 5.0% KLD: 0.001423
+ 1.0% KLD: 0.000043
+ Minimum KLD: -0.000894
+
+ ====== Token probability statistics ======
+ Mean Δp: -0.836 ± 0.032 %
+ Maximum Δp: 99.942%
+ 99.9% Δp: 76.283%
+ 99.0% Δp: 35.932%
+ 95.0% Δp: 16.691%
+ 90.0% Δp: 9.040%
+ 75.0% Δp: 1.129%
+ Median Δp: -0.019%
+ 25.0% Δp: -2.256%
+ 10.0% Δp: -11.765%
+ 5.0% Δp: -20.807%
+ 1.0% Δp: -43.493%
+ 0.1% Δp: -79.148%
+ Minimum Δp: -99.887%
+ RMS Δp : 12.354 ± 0.058 %
+ Same top p: 77.725 ± 0.107 %
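As a reading aid for the .ppx statistics above (general definitions, assumed from llama.cpp's usual reporting rather than stated in the logs): PPL is the exponentiated mean negative log-likelihood over the evaluation tokens, KLD is the per-token KL divergence from the base distribution (here presumably the pruned F16 model whose logits were saved first) to the quantized model's distribution, and Δp is the change in the probability assigned to the actual next token. A sketch in formula form:

\mathrm{PPL} = \exp\Big(-\tfrac{1}{N}\sum_{i=1}^{N}\ln p(x_i \mid x_{<i})\Big), \qquad \mathrm{KLD}_i = \sum_{v} p_{\mathrm{base}}(v \mid x_{<i}) \, \ln\frac{p_{\mathrm{base}}(v \mid x_{<i})}{p_{Q}(v \mid x_{<i})}, \qquad \Delta p_i = p_{Q}(x_i \mid x_{<i}) - p_{\mathrm{base}}(x_i \mid x_{<i})

The KLD and Δp percentiles summarise these per-token values over the whole evaluation text, and "Same top p" is presumably the share of positions where the quantized and base models agree on the most likely next token.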
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.tqa ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+ Final result: 35.0667 +/- 1.7436
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 759.94 ms
+ llama_perf_context_print: prompt eval time = 216548.55 ms / 51053 tokens ( 4.24 ms per token, 235.76 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 218211.64 ms / 51054 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.wng ADDED
@@ -0,0 +1,19 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 70.9333 +/- 1.6591
+
+ llama_perf_context_print: load time = 753.88 ms
+ llama_perf_context_print: prompt eval time = 94337.41 ms / 22541 tokens ( 4.19 ms per token, 238.94 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 94843.38 ms / 22542 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.arc ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
+
+ Final result: 62.2667 +/- 1.7711
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 4272.29 ms
+ llama_perf_context_print: prompt eval time = 152740.59 ms / 36666 tokens ( 4.17 ms per token, 240.05 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 153595.41 ms / 36667 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.hsw ADDED
@@ -0,0 +1,20 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
+
+ 750 78.00000000% [74.8968%, 80.8179%]
+
+
+ llama_perf_context_print: load time = 728.94 ms
+ llama_perf_context_print: prompt eval time = 537974.27 ms / 129319 tokens ( 4.16 ms per token, 240.38 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 541689.42 ms / 129320 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.mmlu ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
+
+ Final result: 39.4667 +/- 1.7860
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 757.43 ms
+ llama_perf_context_print: prompt eval time = 283333.69 ms / 68956 tokens ( 4.11 ms per token, 243.37 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 284662.93 ms / 68957 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 32.512225 ± 0.267969
+ Mean PPL(base) : 20.457257 ± 0.161275
+ Cor(ln(PPL(Q)), ln(PPL(base))): 91.81%
+ Mean ln(PPL(Q)/PPL(base)) : 0.463278 ± 0.003282
+ Mean PPL(Q)/PPL(base) : 1.589276 ± 0.005215
+ Mean PPL(Q)-PPL(base) : 12.054968 ± 0.135871
+
+ ====== KL divergence statistics ======
+ Mean KLD: 0.484943 ± 0.002308
+ Maximum KLD: 23.137739
+ 99.9% KLD: 11.445929
+ 99.0% KLD: 3.996119
+ 99.0% KLD: 3.996119
+ Median KLD: 0.279736
+ 10.0% KLD: 0.019254
+ 5.0% KLD: 0.004941
+ 1.0% KLD: 0.000332
+ Minimum KLD: -0.000675
+
+ ====== Token probability statistics ======
+ Mean Δp: -4.988 ± 0.042 %
+ Maximum Δp: 99.950%
+ 99.9% Δp: 75.785%
+ 99.0% Δp: 36.459%
+ 95.0% Δp: 13.799%
+ 90.0% Δp: 5.484%
+ 75.0% Δp: 0.086%
+ Median Δp: -0.427%
+ 25.0% Δp: -7.430%
+ 10.0% Δp: -24.702%
+ 5.0% Δp: -37.934%
+ 1.0% Δp: -63.532%
+ 0.1% Δp: -87.413%
+ Minimum Δp: -99.733%
+ RMS Δp : 17.076 ± 0.062 %
+ Same top p: 70.771 ± 0.117 %
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.tqa ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
+
+ Final result: 36.1333 +/- 1.7553
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 755.36 ms
+ llama_perf_context_print: prompt eval time = 216405.85 ms / 51053 tokens ( 4.24 ms per token, 235.91 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 217987.98 ms / 51054 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.wng ADDED
@@ -0,0 +1,19 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 72.9333 +/- 1.6235
+
+ llama_perf_context_print: load time = 752.81 ms
+ llama_perf_context_print: prompt eval time = 94369.26 ms / 22541 tokens ( 4.19 ms per token, 238.86 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 94874.43 ms / 22542 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.arc ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ Final result: 62.4000 +/- 1.7699
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 5546.40 ms
+ llama_perf_context_print: prompt eval time = 157820.59 ms / 36666 tokens ( 4.30 ms per token, 232.33 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 158658.47 ms / 36667 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.hsw ADDED
@@ -0,0 +1,20 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ 750 79.46666667% [76.4290%, 82.2040%]
+
+
+ llama_perf_context_print: load time = 856.94 ms
+ llama_perf_context_print: prompt eval time = 558886.79 ms / 129319 tokens ( 4.32 ms per token, 231.39 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 562503.08 ms / 129320 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.mmlu ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ Final result: 41.4667 +/- 1.8002
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 914.88 ms
+ llama_perf_context_print: prompt eval time = 294610.76 ms / 68956 tokens ( 4.27 ms per token, 234.06 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 295846.99 ms / 68957 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 25.450410 ± 0.199570
+ Mean PPL(base) : 20.457257 ± 0.161275
+ Cor(ln(PPL(Q)), ln(PPL(base))): 96.78%
+ Mean ln(PPL(Q)/PPL(base)) : 0.218394 ± 0.001994
+ Mean PPL(Q)/PPL(base) : 1.244077 ± 0.002481
+ Mean PPL(Q)-PPL(base) : 4.993153 ± 0.059469
+
+ ====== KL divergence statistics ======
+ Mean KLD: 0.173444 ± 0.001426
+ Maximum KLD: 22.223246
+ 99.9% KLD: 9.213535
+ 99.0% KLD: 1.556305
+ 99.0% KLD: 1.556305
+ Median KLD: 0.081647
+ 10.0% KLD: 0.006194
+ 5.0% KLD: 0.001743
+ 1.0% KLD: 0.000131
+ Minimum KLD: -0.000788
+
+ ====== Token probability statistics ======
+ Mean Δp: -3.757 ± 0.027 %
+ Maximum Δp: 99.782%
+ 99.9% Δp: 44.427%
+ 99.0% Δp: 18.257%
+ 95.0% Δp: 5.991%
+ 90.0% Δp: 2.080%
+ 75.0% Δp: 0.027%
+ Median Δp: -0.328%
+ 25.0% Δp: -5.093%
+ 10.0% Δp: -15.863%
+ 5.0% Δp: -24.394%
+ 1.0% Δp: -43.233%
+ 0.1% Δp: -70.697%
+ Minimum Δp: -99.732%
+ RMS Δp : 11.165 ± 0.051 %
+ Same top p: 82.789 ± 0.097 %
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.tqa ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ Final result: 37.4667 +/- 1.7686
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 912.73 ms
+ llama_perf_context_print: prompt eval time = 225621.49 ms / 51053 tokens ( 4.42 ms per token, 226.28 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 227248.28 ms / 51054 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.wng ADDED
@@ -0,0 +1,19 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 71.3333 +/- 1.6523
+
+ llama_perf_context_print: load time = 937.74 ms
+ llama_perf_context_print: prompt eval time = 98116.26 ms / 22541 tokens ( 4.35 ms per token, 229.74 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 98670.87 ms / 22542 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.arc ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ Final result: 64.0000 +/- 1.7539
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 5136.60 ms
+ llama_perf_context_print: prompt eval time = 164152.27 ms / 36666 tokens ( 4.48 ms per token, 223.37 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 164972.95 ms / 36667 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.hsw ADDED
@@ -0,0 +1,20 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ 750 77.20000000% [74.0633%, 80.0595%]
+
+
+ llama_perf_context_print: load time = 833.00 ms
+ llama_perf_context_print: prompt eval time = 580747.74 ms / 129319 tokens ( 4.49 ms per token, 222.68 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 584554.96 ms / 129320 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.mmlu ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ Final result: 41.4667 +/- 1.8002
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 788.56 ms
+ llama_perf_context_print: prompt eval time = 305965.89 ms / 68956 tokens ( 4.44 ms per token, 225.37 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 307240.49 ms / 68957 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 24.405174 ± 0.195423
+ Mean PPL(base) : 20.457257 ± 0.161275
+ Cor(ln(PPL(Q)), ln(PPL(base))): 95.33%
+ Mean ln(PPL(Q)/PPL(base)) : 0.176457 ± 0.002431
+ Mean PPL(Q)/PPL(base) : 1.192984 ± 0.002901
+ Mean PPL(Q)-PPL(base) : 3.947917 ± 0.064108
+
+ ====== KL divergence statistics ======
+ Mean KLD: 0.248285 ± 0.001640
+ Maximum KLD: 22.227032
+ 99.9% KLD: 9.707264
+ 99.0% KLD: 2.304967
+ 99.0% KLD: 2.304967
+ Median KLD: 0.123584
+ 10.0% KLD: 0.007649
+ 5.0% KLD: 0.001796
+ 1.0% KLD: 0.000084
+ Minimum KLD: -0.000635
+
+ ====== Token probability statistics ======
+ Mean Δp: -2.090 ± 0.032 %
+ Maximum Δp: 99.189%
+ 99.9% Δp: 64.212%
+ 99.0% Δp: 31.519%
+ 95.0% Δp: 13.797%
+ 90.0% Δp: 6.831%
+ 75.0% Δp: 0.560%
+ Median Δp: -0.057%
+ 25.0% Δp: -3.331%
+ 10.0% Δp: -14.319%
+ 5.0% Δp: -24.226%
+ 1.0% Δp: -48.172%
+ 0.1% Δp: -79.381%
+ Minimum Δp: -99.228%
+ RMS Δp : 12.606 ± 0.056 %
+ Same top p: 78.472 ± 0.106 %
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.tqa ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ Final result: 36.8000 +/- 1.7621
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 841.25 ms
+ llama_perf_context_print: prompt eval time = 233752.65 ms / 51053 tokens ( 4.58 ms per token, 218.41 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 235346.12 ms / 51054 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.wng ADDED
@@ -0,0 +1,19 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 72.8000 +/- 1.6260
+
+ llama_perf_context_print: load time = 832.18 ms
+ llama_perf_context_print: prompt eval time = 101421.35 ms / 22541 tokens ( 4.50 ms per token, 222.25 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 101943.45 ms / 22542 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.arc ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 64.4000 +/- 1.7496
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 4355.26 ms
+ llama_perf_context_print: prompt eval time = 158252.70 ms / 36666 tokens ( 4.32 ms per token, 231.69 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 159087.45 ms / 36667 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.hsw ADDED
@@ -0,0 +1,20 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ 750 77.60000000% [74.4798%, 80.4389%]
+
+
+ llama_perf_context_print: load time = 744.01 ms
+ llama_perf_context_print: prompt eval time = 561047.56 ms / 129319 tokens ( 4.34 ms per token, 230.50 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 564754.58 ms / 129320 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.mmlu ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 40.8000 +/- 1.7958
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 759.93 ms
+ llama_perf_context_print: prompt eval time = 295554.48 ms / 68956 tokens ( 4.29 ms per token, 233.31 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 296870.54 ms / 68957 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 24.750046 ± 0.198446
+ Mean PPL(base) : 20.457257 ± 0.161275
+ Cor(ln(PPL(Q)), ln(PPL(base))): 94.92%
+ Mean ln(PPL(Q)/PPL(base)) : 0.190490 ± 0.002538
+ Mean PPL(Q)/PPL(base) : 1.209842 ± 0.003071
+ Mean PPL(Q)-PPL(base) : 4.292789 ± 0.068077
+
+ ====== KL divergence statistics ======
+ Mean KLD: 0.273478 ± 0.001716
+ Maximum KLD: 23.756779
+ 99.9% KLD: 9.973213
+ 99.0% KLD: 2.569659
+ 99.0% KLD: 2.569659
+ Median KLD: 0.139089
+ 10.0% KLD: 0.008610
+ 5.0% KLD: 0.002078
+ 1.0% KLD: 0.000103
+ Minimum KLD: -0.000674
+
+ ====== Token probability statistics ======
+ Mean Δp: -2.189 ± 0.033 %
+ Maximum Δp: 99.919%
+ 99.9% Δp: 63.247%
+ 99.0% Δp: 33.168%
+ 95.0% Δp: 14.581%
+ 90.0% Δp: 7.284%
+ 75.0% Δp: 0.603%
+ Median Δp: -0.062%
+ 25.0% Δp: -3.479%
+ 10.0% Δp: -15.050%
+ 5.0% Δp: -25.583%
+ 1.0% Δp: -50.625%
+ 0.1% Δp: -81.793%
+ Minimum Δp: -99.620%
+ RMS Δp : 13.182 ± 0.057 %
+ Same top p: 77.344 ± 0.107 %
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.tqa ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 36.9333 +/- 1.7635
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 738.68 ms
+ llama_perf_context_print: prompt eval time = 225672.28 ms / 51053 tokens ( 4.42 ms per token, 226.23 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 227200.49 ms / 51054 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.wng ADDED
@@ -0,0 +1,19 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 72.9333 +/- 1.6235
+
+ llama_perf_context_print: load time = 750.12 ms
+ llama_perf_context_print: prompt eval time = 97927.96 ms / 22541 tokens ( 4.34 ms per token, 230.18 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 98426.78 ms / 22542 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.arc ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 61.0667 +/- 1.7816
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 4042.36 ms
+ llama_perf_context_print: prompt eval time = 158459.88 ms / 36666 tokens ( 4.32 ms per token, 231.39 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 159304.81 ms / 36667 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.hsw ADDED
@@ -0,0 +1,20 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ 750 77.86666667% [74.7577%, 80.6916%]
+
+
+ llama_perf_context_print: load time = 704.36 ms
+ llama_perf_context_print: prompt eval time = 558138.31 ms / 129319 tokens ( 4.32 ms per token, 231.70 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 561898.16 ms / 129320 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.mmlu ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 40.9333 +/- 1.7967
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 707.14 ms
+ llama_perf_context_print: prompt eval time = 294106.51 ms / 68956 tokens ( 4.27 ms per token, 234.46 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 295434.68 ms / 68957 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 25.982356 ± 0.208609
+ Mean PPL(base) : 20.457257 ± 0.161275
+ Cor(ln(PPL(Q)), ln(PPL(base))): 92.33%
+ Mean ln(PPL(Q)/PPL(base)) : 0.239080 ± 0.003120
+ Mean PPL(Q)/PPL(base) : 1.270080 ± 0.003963
+ Mean PPL(Q)-PPL(base) : 5.525099 ± 0.086044
+
+ ====== KL divergence statistics ======
+ Mean KLD: 0.410016 ± 0.002211
+ Maximum KLD: 26.456850
+ 99.9% KLD: 11.191044
+ 99.0% KLD: 3.826428
+ 99.0% KLD: 3.826428
+ Median KLD: 0.221591
+ 10.0% KLD: 0.014583
+ 5.0% KLD: 0.003518
+ 1.0% KLD: 0.000202
+ Minimum KLD: -0.000602
+
+ ====== Token probability statistics ======
+ Mean Δp: -2.759 ± 0.040 %
+ Maximum Δp: 99.843%
+ 99.9% Δp: 77.836%
+ 99.0% Δp: 42.001%
+ 95.0% Δp: 18.458%
+ 90.0% Δp: 8.931%
+ 75.0% Δp: 0.665%
+ Median Δp: -0.102%
+ 25.0% Δp: -4.673%
+ 10.0% Δp: -19.427%
+ 5.0% Δp: -31.918%
+ 1.0% Δp: -59.329%
+ 0.1% Δp: -87.266%
+ Minimum Δp: -99.664%
+ RMS Δp : 15.977 ± 0.062 %
+ Same top p: 72.822 ± 0.114 %
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.tqa ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 38.2667 +/- 1.7759
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 689.04 ms
+ llama_perf_context_print: prompt eval time = 225077.84 ms / 51053 tokens ( 4.41 ms per token, 226.82 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 226715.57 ms / 51054 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.wng ADDED
@@ -0,0 +1,19 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 72.0000 +/- 1.6406
+
+ llama_perf_context_print: load time = 737.04 ms
+ llama_perf_context_print: prompt eval time = 98097.25 ms / 22541 tokens ( 4.35 ms per token, 229.78 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 98614.54 ms / 22542 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.arc ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 62.0000 +/- 1.7736
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 5492.30 ms
+ llama_perf_context_print: prompt eval time = 164320.56 ms / 36666 tokens ( 4.48 ms per token, 223.14 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 165122.13 ms / 36667 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.hsw ADDED
@@ -0,0 +1,20 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ 750 79.33333333% [76.2895%, 82.0782%]
+
+
+ llama_perf_context_print: load time = 898.60 ms
+ llama_perf_context_print: prompt eval time = 582056.25 ms / 129319 tokens ( 4.50 ms per token, 222.18 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 585789.58 ms / 129320 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.mmlu ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 41.7333 +/- 1.8018
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 923.69 ms
+ llama_perf_context_print: prompt eval time = 306593.84 ms / 68956 tokens ( 4.45 ms per token, 224.91 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 307849.04 ms / 68957 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 21.411111 ± 0.170166
+ Mean PPL(base) : 20.457257 ± 0.161275
+ Cor(ln(PPL(Q)), ln(PPL(base))): 98.21%
+ Mean ln(PPL(Q)/PPL(base)) : 0.045572 ± 0.001499
+ Mean PPL(Q)/PPL(base) : 1.046627 ± 0.001569
+ Mean PPL(Q)-PPL(base) : 0.953854 ± 0.032590
+
+ ====== KL divergence statistics ======
+ Mean KLD: 0.092796 ± 0.001164
+ Maximum KLD: 25.177353
+ 99.9% KLD: 8.122986
+ 99.0% KLD: 0.871791
+ 99.0% KLD: 0.871791
+ Median KLD: 0.036392
+ 10.0% KLD: 0.001979
+ 5.0% KLD: 0.000420
+ 1.0% KLD: 0.000007
+ Minimum KLD: -0.000817
+
+ ====== Token probability statistics ======
+ Mean Δp: -0.463 ± 0.019 %
+ Maximum Δp: 99.023%
+ 99.9% Δp: 44.662%
+ 99.0% Δp: 19.835%
+ 95.0% Δp: 9.041%
+ 90.0% Δp: 4.857%
+ 75.0% Δp: 0.673%
+ Median Δp: -0.004%
+ 25.0% Δp: -1.191%
+ 10.0% Δp: -6.370%
+ 5.0% Δp: -11.362%
+ 1.0% Δp: -24.949%
+ 0.1% Δp: -50.867%
+ Minimum Δp: -99.932%
+ RMS Δp : 7.244 ± 0.048 %
+ Same top p: 87.132 ± 0.086 %
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.tqa ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 39.2000 +/- 1.7838
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 924.45 ms
+ llama_perf_context_print: prompt eval time = 234356.12 ms / 51053 tokens ( 4.59 ms per token, 217.84 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 235973.86 ms / 51054 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.wng ADDED
@@ -0,0 +1,19 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 71.4667 +/- 1.6500
+
+ llama_perf_context_print: load time = 897.50 ms
+ llama_perf_context_print: prompt eval time = 101897.65 ms / 22541 tokens ( 4.52 ms per token, 221.21 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 102441.25 ms / 22542 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.arc ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 62.9333 +/- 1.7648
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 5073.64 ms
+ llama_perf_context_print: prompt eval time = 163620.09 ms / 36666 tokens ( 4.46 ms per token, 224.09 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 164444.66 ms / 36667 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.hsw ADDED
@@ -0,0 +1,20 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_S.gguf (version GGUF V3 (latest))
+
+ 750 78.40000000% [75.3141%, 81.1964%]
+
+
+ llama_perf_context_print: load time = 844.26 ms
+ llama_perf_context_print: prompt eval time = 578922.97 ms / 129319 tokens ( 4.48 ms per token, 223.38 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 582667.59 ms / 129320 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.mmlu ADDED
@@ -0,0 +1,21 @@
+ build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 40.4000 +/- 1.7930
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 898.56 ms
+ llama_perf_context_print: prompt eval time = 305282.54 ms / 68956 tokens ( 4.43 ms per token, 225.88 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 306510.96 ms / 68957 tokens
+ ggml_metal_free: deallocating
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)