EXL3 quants of ERNIE-4.5-300B-A47B-PT
2.00 bits per weight
2.10 bits per weight (optimized)
2.25 bits per weight (optimized)
2.50 bits per weight (optimized)
3.00 bits per weight
3.25 bits per weight (optimized)
4.00 bits per weight
| Quant | Weights/VRAM³ | Perplexity | KL-div | MMLU |
|---|---|---|---|---|
| 2.00 bpw | 70.2 GB | 7.4131 | 0.5283 | |
| 2.10 bpw | 73.4 GB | 6.7507 | 0.2202 | 83.40% ±1.13%¹ |
| 2.25 bpw | 78.6 GB | 6.5576 | 0.2074 | 83.70% ±1.13%¹ |
| 2.50 bpw | 87.8 GB | 6.3504 | 0.1899 | 83.96% |
| 3.00 bpw | 104.9 GB | 5.8913 | 0.1547 | 84.61% |
| 3.25 bpw | 113.3 GB | 5.8941 | 0.0806 | 86.80% ±1.03%¹ |
| 4.00 bpw | 139.5 GB | 5.8132 | 0.0717 | 86.50% ±1.04%¹ |
| 2.50 bpw⁴ CCQ | 87.4 GB | | | 82.58%² |
| 4.30 bpw⁴ CCQ | 147.3 GB | | | 86.16%² |
| 8.13 bpw⁴ CCQ | 279.3 GB | | | 86.50%² |
| Original | 597.1 GB | 5.4131 | | 86.50%² |
¹ 1000 random samples, 95% CI
² From CCQ paper
³ Size of .safetensors files excluding embedding layer
⁴ Average from CCQ layer mix
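
As a rough aid for reading the table, the sketch below picks the largest EXL3 quant whose weights fit within a given amount of VRAM. The sizes are copied from the Weights/VRAM column above. The function name `largest_fitting_quant` and the `headroom_gb` reserve (a placeholder for the embedding layer, KV cache and activations, which the table does not cover) are assumptions made for illustration, not measured values.

```python
# Weight sizes in GB, copied from the table above (embedding layer excluded).
QUANT_SIZES_GB = {
    "2.00 bpw": 70.2,
    "2.10 bpw": 73.4,
    "2.25 bpw": 78.6,
    "2.50 bpw": 87.8,
    "3.00 bpw": 104.9,
    "3.25 bpw": 113.3,
    "4.00 bpw": 139.5,
}


def largest_fitting_quant(total_vram_gb, headroom_gb=8.0):
    """Return the largest quant whose weights fit, or None if none do.

    headroom_gb is a hypothetical reserve for everything that is not
    weights (cache, activations, embeddings); tune it for your setup.
    """
    budget = total_vram_gb - headroom_gb
    fitting = [(size, name) for name, size in QUANT_SIZES_GB.items() if size <= budget]
    # max() compares by size first, so this yields the biggest quant that fits.
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for vram in (72, 96, 144):
        print(f"{vram} GB total VRAM -> {largest_fitting_quant(vram)}")
```

Actual memory use depends on context length and cache settings, so treat the result as a starting point and leave more headroom for long contexts.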
Base model
baidu/ERNIE-4.5-300B-A47B-PT