Generate Perplexity, KLD, ARC, HellaSwag, MMLU, Truthful QA and WinoGrande scores
This view is limited to 50 files because it contains too many changes.
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.arc +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.hsw +20 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.mmlu +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.tqa +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.wng +19 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.arc +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.hsw +20 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.mmlu +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.ppx +37 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.tqa +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.wng +19 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.arc +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.hsw +20 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.mmlu +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.ppx +37 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.tqa +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.wng +19 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.arc +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.hsw +20 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.mmlu +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.ppx +37 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.tqa +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.wng +19 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.arc +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.hsw +20 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.mmlu +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.ppx +37 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.tqa +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.wng +19 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.arc +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.hsw +20 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.mmlu +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.ppx +37 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.tqa +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.wng +19 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.arc +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.hsw +20 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.mmlu +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.ppx +37 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.tqa +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.wng +19 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.arc +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.hsw +20 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.mmlu +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.ppx +37 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.tqa +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.wng +19 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.arc +21 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.hsw +20 -0
- scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.mmlu +21 -0
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.arc
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+Final result: 61.7333 +/- 1.7759
+Random chance: 25.0083 +/- 1.5824
+
+
+llama_perf_context_print: load time = 40261.05 ms
+llama_perf_context_print: prompt eval time = 149462.93 ms / 36666 tokens ( 4.08 ms per token, 245.32 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 150294.55 ms / 36667 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.hsw
ADDED
@@ -0,0 +1,20 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+750 79.73333333% [76.7082%, 82.4554%]
+
+
+llama_perf_context_print: load time = 2970.89 ms
+llama_perf_context_print: prompt eval time = 528861.23 ms / 129319 tokens ( 4.09 ms per token, 244.52 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 532325.38 ms / 129320 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.mmlu
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+Final result: 40.8000 +/- 1.7958
+Random chance: 25.0000 +/- 1.5822
+
+
+llama_perf_context_print: load time = 2768.82 ms
+llama_perf_context_print: prompt eval time = 276273.33 ms / 68956 tokens ( 4.01 ms per token, 249.59 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 277471.13 ms / 68957 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.tqa
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+Final result: 37.4667 +/- 1.7686
+Random chance: 19.8992 +/- 1.4588
+
+
+llama_perf_context_print: load time = 2730.94 ms
+llama_perf_context_print: prompt eval time = 213745.70 ms / 51053 tokens ( 4.19 ms per token, 238.85 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 215254.19 ms / 51054 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-F16.wng
ADDED
@@ -0,0 +1,19 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 42 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-F16.gguf (version GGUF V3 (latest))
+
+Final Winogrande score(750 tasks): 72.4000 +/- 1.6334
+
+llama_perf_context_print: load time = 2841.75 ms
+llama_perf_context_print: prompt eval time = 91265.69 ms / 22541 tokens ( 4.05 ms per token, 246.98 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 91746.92 ms / 22542 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
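Reviewer note: the `+/-` figures in the F16 ARC, MMLU, TruthfulQA and WinoGrande logs above are numerically consistent with a binomial standard error over 750 tasks, and the bracketed HellaSwag range is consistent with a 95% Wilson score interval. The sketch below reproduces them under those assumptions. The task count of 750 for the logs that do not print it, the `n - 1` denominator, `z = 1.96`, the correct-answer counts, and both function names are inferred from the printed numbers for illustration; none of them is taken from the tool's source.

```python
import math


def std_error_pct(correct: int, n: int) -> float:
    """Binomial standard error of an accuracy, in percentage points.

    Assumption: an (n - 1) denominator, which matches the logged values.
    """
    p = correct / n
    return 100.0 * math.sqrt(p * (1.0 - p) / (n - 1))


def wilson_interval_pct(correct: int, n: int, z: float = 1.96):
    """Wilson score interval for an accuracy, as (low %, high %).

    Assumption: z = 1.96 (a 95% interval), inferred from the logged range.
    """
    p = correct / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4 * n * n)) / denom
    return 100.0 * (center - half), 100.0 * (center + half)


if __name__ == "__main__":
    # 463/750 = 61.7333% (ARC, F16); 543/750 = 72.4000% (WinoGrande, F16)
    print(f"ARC        +/- {std_error_pct(463, 750):.4f}")
    print(f"WinoGrande +/- {std_error_pct(543, 750):.4f}")
    # 598/750 = 79.7333% (HellaSwag, F16)
    low, high = wilson_interval_pct(598, 750)
    print(f"HellaSwag  [{low:.4f}%, {high:.4f}%]")
```

Because each score carries roughly 1.6 to 1.8 points of standard error, differences of a point or two between quantisation levels in these files fall within sampling noise.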
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.arc
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+Final result: 65.3333 +/- 1.7389
+Random chance: 25.0083 +/- 1.5824
+
+
+llama_perf_context_print: load time = 4590.43 ms
+llama_perf_context_print: prompt eval time = 152258.69 ms / 36666 tokens ( 4.15 ms per token, 240.81 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 153118.42 ms / 36667 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.hsw
ADDED
@@ -0,0 +1,20 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+750 80.00000000% [76.9876%, 82.7066%]
+
+
+llama_perf_context_print: load time = 769.34 ms
+llama_perf_context_print: prompt eval time = 538472.19 ms / 129319 tokens ( 4.16 ms per token, 240.16 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 542167.13 ms / 129320 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.mmlu
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+Final result: 39.6000 +/- 1.7870
+Random chance: 25.0000 +/- 1.5822
+
+
+llama_perf_context_print: load time = 806.33 ms
+llama_perf_context_print: prompt eval time = 283405.69 ms / 68956 tokens ( 4.11 ms per token, 243.31 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 284720.03 ms / 68957 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.ppx
ADDED
@@ -0,0 +1,37 @@
+====== Perplexity statistics ======
+Mean PPL(Q) : 25.645736 ± 0.214748
+Mean PPL(base) : 20.457257 ± 0.161275
+Cor(ln(PPL(Q)), ln(PPL(base))): 95.06%
+Mean ln(PPL(Q)/PPL(base)) : 0.226040 ± 0.002601
+Mean PPL(Q)/PPL(base) : 1.253625 ± 0.003260
+Mean PPL(Q)-PPL(base) : 5.188479 ± 0.079257
+
+====== KL divergence statistics ======
+Mean KLD: 0.286497 ± 0.002130
+Maximum KLD: 27.942949
+99.9% KLD: 11.911683
+99.0% KLD: 3.204565
+99.0% KLD: 3.204565
+Median KLD: 0.139660
+10.0% KLD: 0.006744
+5.0% KLD: 0.001423
+1.0% KLD: 0.000043
+Minimum KLD: -0.000894
+
+====== Token probability statistics ======
+Mean Δp: -0.836 ± 0.032 %
+Maximum Δp: 99.942%
+99.9% Δp: 76.283%
+99.0% Δp: 35.932%
+95.0% Δp: 16.691%
+90.0% Δp: 9.040%
+75.0% Δp: 1.129%
+Median Δp: -0.019%
+25.0% Δp: -2.256%
+10.0% Δp: -11.765%
+5.0% Δp: -20.807%
+1.0% Δp: -43.493%
+0.1% Δp: -79.148%
+Minimum Δp: -99.887%
+RMS Δp : 12.354 ± 0.058 %
+Same top p: 77.725 ± 0.107 %
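Reviewer note: the three comparison rows in the perplexity statistics above appear to be derived from the two mean perplexities, with the ratio row matching the exponential of the mean log ratio. The sketch below checks that reading against the IQ3_M numbers. The function name and the returned keys are hypothetical, and the relationships are inferred from the logged values, not from the tool's source.

```python
import math


def ppl_comparison(ppl_q: float, ppl_base: float, mean_ln_ratio: float) -> dict:
    """Recompute the comparison rows from the two mean perplexities.

    Assumption: the ratio row equals exp(mean ln ratio) and the difference
    row equals PPL(Q) - PPL(base); both are inferred from the log values.
    """
    return {
        "ratio_from_means": ppl_q / ppl_base,
        "ratio_from_log": math.exp(mean_ln_ratio),
        "difference": ppl_q - ppl_base,
    }


if __name__ == "__main__":
    # Values copied from the IQ3_M .ppx file above.
    stats = ppl_comparison(25.645736, 20.457257, 0.226040)
    for name, value in stats.items():
        print(f"{name}: {value:.6f}")
```

Read this way, the IQ3_M quantisation raises perplexity by roughly 25% over the base model on this test set.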
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.tqa
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+Final result: 35.0667 +/- 1.7436
+Random chance: 19.8992 +/- 1.4588
+
+
+llama_perf_context_print: load time = 759.94 ms
+llama_perf_context_print: prompt eval time = 216548.55 ms / 51053 tokens ( 4.24 ms per token, 235.76 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 218211.64 ms / 51054 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_m.wng
ADDED
@@ -0,0 +1,19 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_M.gguf (version GGUF V3 (latest))
+
+Final Winogrande score(750 tasks): 70.9333 +/- 1.6591
+
+llama_perf_context_print: load time = 753.88 ms
+llama_perf_context_print: prompt eval time = 94337.41 ms / 22541 tokens ( 4.19 ms per token, 238.94 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 94843.38 ms / 22542 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.arc
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
+
+Final result: 62.2667 +/- 1.7711
+Random chance: 25.0083 +/- 1.5824
+
+
+llama_perf_context_print: load time = 4272.29 ms
+llama_perf_context_print: prompt eval time = 152740.59 ms / 36666 tokens ( 4.17 ms per token, 240.05 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 153595.41 ms / 36667 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.hsw
ADDED
@@ -0,0 +1,20 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
+
+750 78.00000000% [74.8968%, 80.8179%]
+
+
+llama_perf_context_print: load time = 728.94 ms
+llama_perf_context_print: prompt eval time = 537974.27 ms / 129319 tokens ( 4.16 ms per token, 240.38 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 541689.42 ms / 129320 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.mmlu
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
+
+Final result: 39.4667 +/- 1.7860
+Random chance: 25.0000 +/- 1.5822
+
+
+llama_perf_context_print: load time = 757.43 ms
+llama_perf_context_print: prompt eval time = 283333.69 ms / 68956 tokens ( 4.11 ms per token, 243.37 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 284662.93 ms / 68957 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.ppx
ADDED
@@ -0,0 +1,37 @@
+====== Perplexity statistics ======
+Mean PPL(Q) : 32.512225 ± 0.267969
+Mean PPL(base) : 20.457257 ± 0.161275
+Cor(ln(PPL(Q)), ln(PPL(base))): 91.81%
+Mean ln(PPL(Q)/PPL(base)) : 0.463278 ± 0.003282
+Mean PPL(Q)/PPL(base) : 1.589276 ± 0.005215
+Mean PPL(Q)-PPL(base) : 12.054968 ± 0.135871
+
+====== KL divergence statistics ======
+Mean KLD: 0.484943 ± 0.002308
+Maximum KLD: 23.137739
+99.9% KLD: 11.445929
+99.0% KLD: 3.996119
+99.0% KLD: 3.996119
+Median KLD: 0.279736
+10.0% KLD: 0.019254
+5.0% KLD: 0.004941
+1.0% KLD: 0.000332
+Minimum KLD: -0.000675
+
+====== Token probability statistics ======
+Mean Δp: -4.988 ± 0.042 %
+Maximum Δp: 99.950%
+99.9% Δp: 75.785%
+99.0% Δp: 36.459%
+95.0% Δp: 13.799%
+90.0% Δp: 5.484%
+75.0% Δp: 0.086%
+Median Δp: -0.427%
+25.0% Δp: -7.430%
+10.0% Δp: -24.702%
+5.0% Δp: -37.934%
+1.0% Δp: -63.532%
+0.1% Δp: -87.413%
+Minimum Δp: -99.733%
+RMS Δp : 17.076 ± 0.062 %
+Same top p: 70.771 ± 0.117 %
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.tqa
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
+
+Final result: 36.1333 +/- 1.7553
+Random chance: 19.8992 +/- 1.4588
+
+
+llama_perf_context_print: load time = 755.36 ms
+llama_perf_context_print: prompt eval time = 216405.85 ms / 51053 tokens ( 4.24 ms per token, 235.91 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 217987.98 ms / 51054 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq3_s.wng
ADDED
|
@@ -0,0 +1,19 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ3_S.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final Winogrande score(750 tasks): 72.9333 +/- 1.6235
|
| 6 |
+
|
| 7 |
+
llama_perf_context_print: load time = 752.81 ms
|
| 8 |
+
llama_perf_context_print: prompt eval time = 94369.26 ms / 22541 tokens ( 4.19 ms per token, 238.86 tokens per second)
|
| 9 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 10 |
+
llama_perf_context_print: total time = 94874.43 ms / 22542 tokens
|
| 11 |
+
ggml_metal_free: deallocating
|
| 12 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 13 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
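The ± value next to the Winogrande score above appears consistent with a binomial standard error computed with an n - 1 denominator. The sketch below is inferred from the logged numbers only, not taken from the tool that produced them; `binomial_stderr_pct` is a hypothetical helper name.

```python
import math

def binomial_stderr_pct(score_pct: float, n_tasks: int) -> float:
    """Standard error (in percent) of an accuracy measured over n_tasks.

    Assumption: an n - 1 denominator, which is what the logged values
    appear to match; this is inferred from the numbers, not from the tool.
    """
    p = score_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / (n_tasks - 1))

# Values copied from the log above: 72.9333 +/- 1.6235 over 750 tasks.
stderr = binomial_stderr_pct(72.9333, 750)
print(f"{stderr:.4f}")
```

With 750 tasks the uncertainty is about 1.6 percentage points, so differences of a point or less between quantizations on this benchmark are within noise.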
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.arc
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final result: 62.4000 +/- 1.7699
|
| 6 |
+
Random chance: 25.0083 +/- 1.5824
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
llama_perf_context_print: load time = 5546.40 ms
|
| 10 |
+
llama_perf_context_print: prompt eval time = 157820.59 ms / 36666 tokens ( 4.30 ms per token, 232.33 tokens per second)
|
| 11 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 12 |
+
llama_perf_context_print: total time = 158658.47 ms / 36667 tokens
|
| 13 |
+
ggml_metal_free: deallocating
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 21 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.hsw
ADDED
|
@@ -0,0 +1,20 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
750 79.46666667% [76.4290%, 82.2040%]
|
| 6 |
+
|
| 7 |
+
|
| 8 |
+
llama_perf_context_print: load time = 856.94 ms
|
| 9 |
+
llama_perf_context_print: prompt eval time = 558886.79 ms / 129319 tokens ( 4.32 ms per token, 231.39 tokens per second)
|
| 10 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 11 |
+
llama_perf_context_print: total time = 562503.08 ms / 129320 tokens
|
| 12 |
+
ggml_metal_free: deallocating
|
| 13 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
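The bracketed range on the HellaSwag line above appears to match a Wilson score interval at roughly 95% confidence (z = 1.96). This is an inference from the logged numbers, not a statement about how the tool computes it; `wilson_interval_pct` is a hypothetical helper name.

```python
import math

def wilson_interval_pct(score_pct: float, n_tasks: int, z: float = 1.96):
    """Wilson score interval (lower, upper) in percent for a binomial accuracy.

    Assumption: z = 1.96 (about 95% confidence), chosen because it
    reproduces the logged bounds.
    """
    p = score_pct / 100.0
    z2_over_n = z * z / n_tasks
    denom = 1.0 + z2_over_n
    center = (p + z2_over_n / 2.0) / denom
    half = z * math.sqrt(p * (1.0 - p) / n_tasks
                         + z * z / (4.0 * n_tasks ** 2)) / denom
    return 100.0 * (center - half), 100.0 * (center + half)

# Values copied from the log above: 750 tasks, 79.46666667% accuracy.
lower, upper = wilson_interval_pct(79.46666667, 750)
print(f"[{lower:.4f}%, {upper:.4f}%]")
```

Unlike a plain ± standard error, the Wilson interval is not symmetric around the score, which is why the logged bounds sit slightly lower than 79.47 ± 2.9.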
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.mmlu
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final result: 41.4667 +/- 1.8002
|
| 6 |
+
Random chance: 25.0000 +/- 1.5822
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
llama_perf_context_print: load time = 914.88 ms
|
| 10 |
+
llama_perf_context_print: prompt eval time = 294610.76 ms / 68956 tokens ( 4.27 ms per token, 234.06 tokens per second)
|
| 11 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 12 |
+
llama_perf_context_print: total time = 295846.99 ms / 68957 tokens
|
| 13 |
+
ggml_metal_free: deallocating
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 21 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.ppx
ADDED
|
@@ -0,0 +1,37 @@
| 1 |
+
====== Perplexity statistics ======
|
| 2 |
+
Mean PPL(Q) : 25.450410 ± 0.199570
|
| 3 |
+
Mean PPL(base) : 20.457257 ± 0.161275
|
| 4 |
+
Cor(ln(PPL(Q)), ln(PPL(base))): 96.78%
|
| 5 |
+
Mean ln(PPL(Q)/PPL(base)) : 0.218394 ± 0.001994
|
| 6 |
+
Mean PPL(Q)/PPL(base) : 1.244077 ± 0.002481
|
| 7 |
+
Mean PPL(Q)-PPL(base) : 4.993153 ± 0.059469
|
| 8 |
+
|
| 9 |
+
====== KL divergence statistics ======
|
| 10 |
+
Mean KLD: 0.173444 ± 0.001426
|
| 11 |
+
Maximum KLD: 22.223246
|
| 12 |
+
99.9% KLD: 9.213535
|
| 13 |
+
99.0% KLD: 1.556305
|
| 14 |
+
99.0% KLD: 1.556305
|
| 15 |
+
Median KLD: 0.081647
|
| 16 |
+
10.0% KLD: 0.006194
|
| 17 |
+
5.0% KLD: 0.001743
|
| 18 |
+
1.0% KLD: 0.000131
|
| 19 |
+
Minimum KLD: -0.000788
|
| 20 |
+
|
| 21 |
+
====== Token probability statistics ======
|
| 22 |
+
Mean Δp: -3.757 ± 0.027 %
|
| 23 |
+
Maximum Δp: 99.782%
|
| 24 |
+
99.9% Δp: 44.427%
|
| 25 |
+
99.0% Δp: 18.257%
|
| 26 |
+
95.0% Δp: 5.991%
|
| 27 |
+
90.0% Δp: 2.080%
|
| 28 |
+
75.0% Δp: 0.027%
|
| 29 |
+
Median Δp: -0.328%
|
| 30 |
+
25.0% Δp: -5.093%
|
| 31 |
+
10.0% Δp: -15.863%
|
| 32 |
+
5.0% Δp: -24.394%
|
| 33 |
+
1.0% Δp: -43.233%
|
| 34 |
+
0.1% Δp: -70.697%
|
| 35 |
+
Minimum Δp: -99.732%
|
| 36 |
+
RMS Δp : 11.165 ± 0.051 %
|
| 37 |
+
Same top p: 82.789 ± 0.097 %
|
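The derived rows in the perplexity summary above can be cross-checked with a little arithmetic: the ratio and difference follow from the two mean PPL values, and exponentiating the mean log-ratio gives the same ratio. A minimal sketch using numbers copied from this log; the function and variable names are illustrative and not part of any tool.

```python
import math

# Values copied from the perplexity statistics above (IQ4_NL).
ppl_q = 25.450410         # Mean PPL(Q)
ppl_base = 20.457257      # Mean PPL(base)
mean_ln_ratio = 0.218394  # Mean ln(PPL(Q)/PPL(base))

def ppl_summary(ppl_q: float, ppl_base: float, mean_ln_ratio: float) -> dict:
    """Recompute the derived rows from the three logged inputs."""
    return {
        "ratio": ppl_q / ppl_base,
        "ratio_from_log": math.exp(mean_ln_ratio),
        "difference": ppl_q - ppl_base,
    }

summary = ppl_summary(ppl_q, ppl_base, mean_ln_ratio)
for name, value in summary.items():
    print(f"{name}: {value:.6f}")
```

Both routes to the ratio agree with the logged 1.244077 to within rounding, meaning PPL(Q) is roughly 24% higher than PPL(base) for this file.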
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.tqa
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final result: 37.4667 +/- 1.7686
|
| 6 |
+
Random chance: 19.8992 +/- 1.4588
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
llama_perf_context_print: load time = 912.73 ms
|
| 10 |
+
llama_perf_context_print: prompt eval time = 225621.49 ms / 51053 tokens ( 4.42 ms per token, 226.28 tokens per second)
|
| 11 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 12 |
+
llama_perf_context_print: total time = 227248.28 ms / 51054 tokens
|
| 13 |
+
ggml_metal_free: deallocating
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 21 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-iq4_nl.wng
ADDED
|
@@ -0,0 +1,19 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-IQ4_NL.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final Winogrande score(750 tasks): 71.3333 +/- 1.6523
|
| 6 |
+
|
| 7 |
+
llama_perf_context_print: load time = 937.74 ms
|
| 8 |
+
llama_perf_context_print: prompt eval time = 98116.26 ms / 22541 tokens ( 4.35 ms per token, 229.74 tokens per second)
|
| 9 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 10 |
+
llama_perf_context_print: total time = 98670.87 ms / 22542 tokens
|
| 11 |
+
ggml_metal_free: deallocating
|
| 12 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 13 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.arc
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final result: 64.0000 +/- 1.7539
|
| 6 |
+
Random chance: 25.0083 +/- 1.5824
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
llama_perf_context_print: load time = 5136.60 ms
|
| 10 |
+
llama_perf_context_print: prompt eval time = 164152.27 ms / 36666 tokens ( 4.48 ms per token, 223.37 tokens per second)
|
| 11 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 12 |
+
llama_perf_context_print: total time = 164972.95 ms / 36667 tokens
|
| 13 |
+
ggml_metal_free: deallocating
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 21 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.hsw
ADDED
|
@@ -0,0 +1,20 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
750 77.20000000% [74.0633%, 80.0595%]
|
| 6 |
+
|
| 7 |
+
|
| 8 |
+
llama_perf_context_print: load time = 833.00 ms
|
| 9 |
+
llama_perf_context_print: prompt eval time = 580747.74 ms / 129319 tokens ( 4.49 ms per token, 222.68 tokens per second)
|
| 10 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 11 |
+
llama_perf_context_print: total time = 584554.96 ms / 129320 tokens
|
| 12 |
+
ggml_metal_free: deallocating
|
| 13 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.mmlu
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final result: 41.4667 +/- 1.8002
|
| 6 |
+
Random chance: 25.0000 +/- 1.5822
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
llama_perf_context_print: load time = 788.56 ms
|
| 10 |
+
llama_perf_context_print: prompt eval time = 305965.89 ms / 68956 tokens ( 4.44 ms per token, 225.37 tokens per second)
|
| 11 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 12 |
+
llama_perf_context_print: total time = 307240.49 ms / 68957 tokens
|
| 13 |
+
ggml_metal_free: deallocating
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 21 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.ppx
ADDED
|
@@ -0,0 +1,37 @@
| 1 |
+
====== Perplexity statistics ======
|
| 2 |
+
Mean PPL(Q) : 24.405174 ± 0.195423
|
| 3 |
+
Mean PPL(base) : 20.457257 ± 0.161275
|
| 4 |
+
Cor(ln(PPL(Q)), ln(PPL(base))): 95.33%
|
| 5 |
+
Mean ln(PPL(Q)/PPL(base)) : 0.176457 ± 0.002431
|
| 6 |
+
Mean PPL(Q)/PPL(base) : 1.192984 ± 0.002901
|
| 7 |
+
Mean PPL(Q)-PPL(base) : 3.947917 ± 0.064108
|
| 8 |
+
|
| 9 |
+
====== KL divergence statistics ======
|
| 10 |
+
Mean KLD: 0.248285 ± 0.001640
|
| 11 |
+
Maximum KLD: 22.227032
|
| 12 |
+
99.9% KLD: 9.707264
|
| 13 |
+
99.0% KLD: 2.304967
|
| 14 |
+
99.0% KLD: 2.304967
|
| 15 |
+
Median KLD: 0.123584
|
| 16 |
+
10.0% KLD: 0.007649
|
| 17 |
+
5.0% KLD: 0.001796
|
| 18 |
+
1.0% KLD: 0.000084
|
| 19 |
+
Minimum KLD: -0.000635
|
| 20 |
+
|
| 21 |
+
====== Token probability statistics ======
|
| 22 |
+
Mean Δp: -2.090 ± 0.032 %
|
| 23 |
+
Maximum Δp: 99.189%
|
| 24 |
+
99.9% Δp: 64.212%
|
| 25 |
+
99.0% Δp: 31.519%
|
| 26 |
+
95.0% Δp: 13.797%
|
| 27 |
+
90.0% Δp: 6.831%
|
| 28 |
+
75.0% Δp: 0.560%
|
| 29 |
+
Median Δp: -0.057%
|
| 30 |
+
25.0% Δp: -3.331%
|
| 31 |
+
10.0% Δp: -14.319%
|
| 32 |
+
5.0% Δp: -24.226%
|
| 33 |
+
1.0% Δp: -48.172%
|
| 34 |
+
0.1% Δp: -79.381%
|
| 35 |
+
Minimum Δp: -99.228%
|
| 36 |
+
RMS Δp : 12.606 ± 0.056 %
|
| 37 |
+
Same top p: 78.472 ± 0.106 %
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.tqa
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final result: 36.8000 +/- 1.7621
|
| 6 |
+
Random chance: 19.8992 +/- 1.4588
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
llama_perf_context_print: load time = 841.25 ms
|
| 10 |
+
llama_perf_context_print: prompt eval time = 233752.65 ms / 51053 tokens ( 4.58 ms per token, 218.41 tokens per second)
|
| 11 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 12 |
+
llama_perf_context_print: total time = 235346.12 ms / 51054 tokens
|
| 13 |
+
ggml_metal_free: deallocating
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 21 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_l.wng
ADDED
|
@@ -0,0 +1,19 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_L.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final Winogrande score(750 tasks): 72.8000 +/- 1.6260
|
| 6 |
+
|
| 7 |
+
llama_perf_context_print: load time = 832.18 ms
|
| 8 |
+
llama_perf_context_print: prompt eval time = 101421.35 ms / 22541 tokens ( 4.50 ms per token, 222.25 tokens per second)
|
| 9 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 10 |
+
llama_perf_context_print: total time = 101943.45 ms / 22542 tokens
|
| 11 |
+
ggml_metal_free: deallocating
|
| 12 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 13 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.arc
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final result: 64.4000 +/- 1.7496
|
| 6 |
+
Random chance: 25.0083 +/- 1.5824
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
llama_perf_context_print: load time = 4355.26 ms
|
| 10 |
+
llama_perf_context_print: prompt eval time = 158252.70 ms / 36666 tokens ( 4.32 ms per token, 231.69 tokens per second)
|
| 11 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 12 |
+
llama_perf_context_print: total time = 159087.45 ms / 36667 tokens
|
| 13 |
+
ggml_metal_free: deallocating
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 21 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.hsw
ADDED
|
@@ -0,0 +1,20 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
750 77.60000000% [74.4798%, 80.4389%]
|
| 6 |
+
|
| 7 |
+
|
| 8 |
+
llama_perf_context_print: load time = 744.01 ms
|
| 9 |
+
llama_perf_context_print: prompt eval time = 561047.56 ms / 129319 tokens ( 4.34 ms per token, 230.50 tokens per second)
|
| 10 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 11 |
+
llama_perf_context_print: total time = 564754.58 ms / 129320 tokens
|
| 12 |
+
ggml_metal_free: deallocating
|
| 13 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.mmlu
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final result: 40.8000 +/- 1.7958
|
| 6 |
+
Random chance: 25.0000 +/- 1.5822
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
llama_perf_context_print: load time = 759.93 ms
|
| 10 |
+
llama_perf_context_print: prompt eval time = 295554.48 ms / 68956 tokens ( 4.29 ms per token, 233.31 tokens per second)
|
| 11 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 12 |
+
llama_perf_context_print: total time = 296870.54 ms / 68957 tokens
|
| 13 |
+
ggml_metal_free: deallocating
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 21 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.ppx
ADDED
|
@@ -0,0 +1,37 @@
| 1 |
+
====== Perplexity statistics ======
|
| 2 |
+
Mean PPL(Q) : 24.750046 ± 0.198446
|
| 3 |
+
Mean PPL(base) : 20.457257 ± 0.161275
|
| 4 |
+
Cor(ln(PPL(Q)), ln(PPL(base))): 94.92%
|
| 5 |
+
Mean ln(PPL(Q)/PPL(base)) : 0.190490 ± 0.002538
|
| 6 |
+
Mean PPL(Q)/PPL(base) : 1.209842 ± 0.003071
|
| 7 |
+
Mean PPL(Q)-PPL(base) : 4.292789 ± 0.068077
|
| 8 |
+
|
| 9 |
+
====== KL divergence statistics ======
|
| 10 |
+
Mean KLD: 0.273478 ± 0.001716
|
| 11 |
+
Maximum KLD: 23.756779
|
| 12 |
+
99.9% KLD: 9.973213
|
| 13 |
+
99.0% KLD: 2.569659
|
| 14 |
+
99.0% KLD: 2.569659
|
| 15 |
+
Median KLD: 0.139089
|
| 16 |
+
10.0% KLD: 0.008610
|
| 17 |
+
5.0% KLD: 0.002078
|
| 18 |
+
1.0% KLD: 0.000103
|
| 19 |
+
Minimum KLD: -0.000674
|
| 20 |
+
|
| 21 |
+
====== Token probability statistics ======
|
| 22 |
+
Mean Δp: -2.189 ± 0.033 %
|
| 23 |
+
Maximum Δp: 99.919%
|
| 24 |
+
99.9% Δp: 63.247%
|
| 25 |
+
99.0% Δp: 33.168%
|
| 26 |
+
95.0% Δp: 14.581%
|
| 27 |
+
90.0% Δp: 7.284%
|
| 28 |
+
75.0% Δp: 0.603%
|
| 29 |
+
Median Δp: -0.062%
|
| 30 |
+
25.0% Δp: -3.479%
|
| 31 |
+
10.0% Δp: -15.050%
|
| 32 |
+
5.0% Δp: -25.583%
|
| 33 |
+
1.0% Δp: -50.625%
|
| 34 |
+
0.1% Δp: -81.793%
|
| 35 |
+
Minimum Δp: -99.620%
|
| 36 |
+
RMS Δp : 13.182 ± 0.057 %
|
| 37 |
+
Same top p: 77.344 ± 0.107 %
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.tqa
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final result: 36.9333 +/- 1.7635
|
| 6 |
+
Random chance: 19.8992 +/- 1.4588
|
| 7 |
+
|
| 8 |
+
|
| 9 |
+
llama_perf_context_print: load time = 738.68 ms
|
| 10 |
+
llama_perf_context_print: prompt eval time = 225672.28 ms / 51053 tokens ( 4.42 ms per token, 226.23 tokens per second)
|
| 11 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 12 |
+
llama_perf_context_print: total time = 227200.49 ms / 51054 tokens
|
| 13 |
+
ggml_metal_free: deallocating
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 20 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 21 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_m.wng
ADDED
|
@@ -0,0 +1,19 @@
| 1 |
+
build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
|
| 2 |
+
llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
|
| 3 |
+
llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_M.gguf (version GGUF V3 (latest))
|
| 4 |
+
|
| 5 |
+
Final Winogrande score(750 tasks): 72.9333 +/- 1.6235
|
| 6 |
+
|
| 7 |
+
llama_perf_context_print: load time = 750.12 ms
|
| 8 |
+
llama_perf_context_print: prompt eval time = 97927.96 ms / 22541 tokens ( 4.34 ms per token, 230.18 tokens per second)
|
| 9 |
+
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
|
| 10 |
+
llama_perf_context_print: total time = 98426.78 ms / 22542 tokens
|
| 11 |
+
ggml_metal_free: deallocating
|
| 12 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 13 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 14 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 15 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 16 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 17 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 18 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
| 19 |
+
ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
|
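Note on the `+/-` value: the figure logged next to the Winogrande score (72.9333 +/- 1.6235 over 750 tasks) is consistent with a binomial standard error that uses an n - 1 denominator. The Python sketch below rests on that assumption, which is inferred from the logged numbers rather than taken from the evaluation tool's source; `binomial_stderr_pct` is a hypothetical helper name.

```python
import math


def binomial_stderr_pct(score_pct: float, n_tasks: int) -> float:
    """Standard error, in percentage points, of an accuracy over n_tasks trials.

    Assumption: the n - 1 denominator is chosen because it matches the
    logged figures; it is not confirmed against the tool's implementation.
    """
    p = score_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / (n_tasks - 1))


# Winogrande, Q3_K_M: compare the result with the logged +/- 1.6235
stderr = binomial_stderr_pct(72.9333, 750)
print(f"{stderr:.4f}")
```

The same helper can be pointed at the `Final result` lines of the ARC, MMLU and TruthfulQA logs, which also report 750-task scores with a `+/-` value.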
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.arc
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+Final result: 61.0667 +/- 1.7816
+Random chance: 25.0083 +/- 1.5824
+
+
+llama_perf_context_print: load time = 4042.36 ms
+llama_perf_context_print: prompt eval time = 158459.88 ms / 36666 tokens ( 4.32 ms per token, 231.39 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 159304.81 ms / 36667 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.hsw
ADDED
@@ -0,0 +1,20 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+750 77.86666667% [74.7577%, 80.6916%]
+
+
+llama_perf_context_print: load time = 704.36 ms
+llama_perf_context_print: prompt eval time = 558138.31 ms / 129319 tokens ( 4.32 ms per token, 231.70 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 561898.16 ms / 129320 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
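Note on the bracketed range: the HellaSwag line above (750 tasks, 77.86666667%, [74.7577%, 80.6916%]) is consistent with a 95% Wilson score interval. The Python sketch below assumes that interval with z = 1.96; both the choice of interval and the value of z are inferred from the logged numbers, and `wilson_interval_pct` is a hypothetical helper name.

```python
import math


def wilson_interval_pct(score_pct: float, n_tasks: int, z: float = 1.96):
    """Wilson score interval for an accuracy, returned in percent.

    Assumption: z = 1.96 (about 95% coverage), chosen because it
    matches the bracketed range in the log.
    """
    p = score_pct / 100.0
    z2 = z * z
    denom = 1.0 + z2 / n_tasks
    center = (p + z2 / (2.0 * n_tasks)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n_tasks + z2 / (4.0 * n_tasks ** 2)) / denom
    return 100.0 * (center - half), 100.0 * (center + half)


# HellaSwag, Q3_K_S: compare with the logged [74.7577%, 80.6916%]
low, high = wilson_interval_pct(77.86666667, 750)
print(f"[{low:.4f}%, {high:.4f}%]")
```

Unlike the symmetric `+/-` values in the other logs, this interval is not centered on the score, which is why the log prints both bounds.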
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.mmlu
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+Final result: 40.9333 +/- 1.7967
+Random chance: 25.0000 +/- 1.5822
+
+
+llama_perf_context_print: load time = 707.14 ms
+llama_perf_context_print: prompt eval time = 294106.51 ms / 68956 tokens ( 4.27 ms per token, 234.46 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 295434.68 ms / 68957 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.ppx
ADDED
@@ -0,0 +1,37 @@
+====== Perplexity statistics ======
+Mean PPL(Q) : 25.982356 ± 0.208609
+Mean PPL(base) : 20.457257 ± 0.161275
+Cor(ln(PPL(Q)), ln(PPL(base))): 92.33%
+Mean ln(PPL(Q)/PPL(base)) : 0.239080 ± 0.003120
+Mean PPL(Q)/PPL(base) : 1.270080 ± 0.003963
+Mean PPL(Q)-PPL(base) : 5.525099 ± 0.086044
+
+====== KL divergence statistics ======
+Mean KLD: 0.410016 ± 0.002211
+Maximum KLD: 26.456850
+99.9% KLD: 11.191044
+99.0% KLD: 3.826428
+99.0% KLD: 3.826428
+Median KLD: 0.221591
+10.0% KLD: 0.014583
+5.0% KLD: 0.003518
+1.0% KLD: 0.000202
+Minimum KLD: -0.000602
+
+====== Token probability statistics ======
+Mean Δp: -2.759 ± 0.040 %
+Maximum Δp: 99.843%
+99.9% Δp: 77.836%
+99.0% Δp: 42.001%
+95.0% Δp: 18.458%
+90.0% Δp: 8.931%
+75.0% Δp: 0.665%
+Median Δp: -0.102%
+25.0% Δp: -4.673%
+10.0% Δp: -19.427%
+5.0% Δp: -31.918%
+1.0% Δp: -59.329%
+0.1% Δp: -87.266%
+Minimum Δp: -99.664%
+RMS Δp : 15.977 ± 0.062 %
+Same top p: 72.822 ± 0.114 %
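Note on the perplexity block: the three comparison rows above are mutually consistent. The ratio row equals the exponential of the mean log ratio, and both the ratio and the difference follow from the two mean PPL values. The Python sketch below only cross-checks the logged Q3_K_S figures against each other; the variable names are illustrative and nothing here is taken from the tool's source.

```python
import math

# Values copied from the Q3_K_S perplexity statistics above
ppl_q = 25.982356           # Mean PPL(Q)
ppl_base = 20.457257        # Mean PPL(base)
mean_log_ratio = 0.239080   # Mean ln(PPL(Q)/PPL(base))

# Ratio recovered two ways; both should agree with the logged 1.270080
ratio_from_log = math.exp(mean_log_ratio)
ratio_from_means = ppl_q / ppl_base

# Difference of the means; should agree with the logged 5.525099
ppl_diff = ppl_q - ppl_base

print(f"exp(mean log ratio) = {ratio_from_log:.6f}")
print(f"PPL(Q) / PPL(base)  = {ratio_from_means:.6f}")
print(f"PPL(Q) - PPL(base)  = {ppl_diff:.6f}")
```

Read this way, the Q3_K_S quantization raises perplexity by roughly 27% relative to the base model on this test set.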
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.tqa
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+Final result: 38.2667 +/- 1.7759
+Random chance: 19.8992 +/- 1.4588
+
+
+llama_perf_context_print: load time = 689.04 ms
+llama_perf_context_print: prompt eval time = 225077.84 ms / 51053 tokens ( 4.41 ms per token, 226.82 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 226715.57 ms / 51054 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q3_k_s.wng
ADDED
@@ -0,0 +1,19 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q3_K_S.gguf (version GGUF V3 (latest))
+
+Final Winogrande score(750 tasks): 72.0000 +/- 1.6406
+
+llama_perf_context_print: load time = 737.04 ms
+llama_perf_context_print: prompt eval time = 98097.25 ms / 22541 tokens ( 4.35 ms per token, 229.78 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 98614.54 ms / 22542 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.arc
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+Final result: 62.0000 +/- 1.7736
+Random chance: 25.0083 +/- 1.5824
+
+
+llama_perf_context_print: load time = 5492.30 ms
+llama_perf_context_print: prompt eval time = 164320.56 ms / 36666 tokens ( 4.48 ms per token, 223.14 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 165122.13 ms / 36667 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.hsw
ADDED
@@ -0,0 +1,20 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+750 79.33333333% [76.2895%, 82.0782%]
+
+
+llama_perf_context_print: load time = 898.60 ms
+llama_perf_context_print: prompt eval time = 582056.25 ms / 129319 tokens ( 4.50 ms per token, 222.18 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 585789.58 ms / 129320 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.mmlu
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+Final result: 41.7333 +/- 1.8018
+Random chance: 25.0000 +/- 1.5822
+
+
+llama_perf_context_print: load time = 923.69 ms
+llama_perf_context_print: prompt eval time = 306593.84 ms / 68956 tokens ( 4.45 ms per token, 224.91 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 307849.04 ms / 68957 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.ppx
ADDED
@@ -0,0 +1,37 @@
+====== Perplexity statistics ======
+Mean PPL(Q) : 21.411111 ± 0.170166
+Mean PPL(base) : 20.457257 ± 0.161275
+Cor(ln(PPL(Q)), ln(PPL(base))): 98.21%
+Mean ln(PPL(Q)/PPL(base)) : 0.045572 ± 0.001499
+Mean PPL(Q)/PPL(base) : 1.046627 ± 0.001569
+Mean PPL(Q)-PPL(base) : 0.953854 ± 0.032590
+
+====== KL divergence statistics ======
+Mean KLD: 0.092796 ± 0.001164
+Maximum KLD: 25.177353
+99.9% KLD: 8.122986
+99.0% KLD: 0.871791
+99.0% KLD: 0.871791
+Median KLD: 0.036392
+10.0% KLD: 0.001979
+5.0% KLD: 0.000420
+1.0% KLD: 0.000007
+Minimum KLD: -0.000817
+
+====== Token probability statistics ======
+Mean Δp: -0.463 ± 0.019 %
+Maximum Δp: 99.023%
+99.9% Δp: 44.662%
+99.0% Δp: 19.835%
+95.0% Δp: 9.041%
+90.0% Δp: 4.857%
+75.0% Δp: 0.673%
+Median Δp: -0.004%
+25.0% Δp: -1.191%
+10.0% Δp: -6.370%
+5.0% Δp: -11.362%
+1.0% Δp: -24.949%
+0.1% Δp: -50.867%
+Minimum Δp: -99.932%
+RMS Δp : 7.244 ± 0.048 %
+Same top p: 87.132 ± 0.086 %
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.tqa
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+Final result: 39.2000 +/- 1.7838
+Random chance: 19.8992 +/- 1.4588
+
+
+llama_perf_context_print: load time = 924.45 ms
+llama_perf_context_print: prompt eval time = 234356.12 ms / 51053 tokens ( 4.59 ms per token, 217.84 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 235973.86 ms / 51054 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_m.wng
ADDED
@@ -0,0 +1,19 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_M.gguf (version GGUF V3 (latest))
+
+Final Winogrande score(750 tasks): 71.4667 +/- 1.6500
+
+llama_perf_context_print: load time = 897.50 ms
+llama_perf_context_print: prompt eval time = 101897.65 ms / 22541 tokens ( 4.52 ms per token, 221.21 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 102441.25 ms / 22542 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.arc
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_S.gguf (version GGUF V3 (latest))
+
+Final result: 62.9333 +/- 1.7648
+Random chance: 25.0083 +/- 1.5824
+
+
+llama_perf_context_print: load time = 5073.64 ms
+llama_perf_context_print: prompt eval time = 163620.09 ms / 36666 tokens ( 4.46 ms per token, 224.09 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 164444.66 ms / 36667 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.hsw
ADDED
@@ -0,0 +1,20 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_S.gguf (version GGUF V3 (latest))
+
+750 78.40000000% [75.3141%, 81.1964%]
+
+
+llama_perf_context_print: load time = 844.26 ms
+llama_perf_context_print: prompt eval time = 578922.97 ms / 129319 tokens ( 4.48 ms per token, 223.38 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 582667.59 ms / 129320 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
scores/Mistral-Small-3.2-24B-Instruct-2506-pruned-q4_k_s.mmlu
ADDED
@@ -0,0 +1,21 @@
+build: 5890 (982e3472) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+llama_model_loader: loaded meta data with 46 key-value pairs and 345 tensors from ./Mistral-Small-3.2-24B-Instruct-pruned-Q4_K_S.gguf (version GGUF V3 (latest))
+
+Final result: 40.4000 +/- 1.7930
+Random chance: 25.0000 +/- 1.5822
+
+
+llama_perf_context_print: load time = 898.56 ms
+llama_perf_context_print: prompt eval time = 305282.54 ms / 68956 tokens ( 4.43 ms per token, 225.88 tokens per second)
+llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+llama_perf_context_print: total time = 306510.96 ms / 68957 tokens
+ggml_metal_free: deallocating
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)
+ggml_metal_mem_pool_free: freeing memory pool, num heaps = 0 (total = 0)