Quantization made by Richard Erkhov.
Leia-Swallow-7b - GGUF
- Model creator: https://huggingface.co/leia-llm/
- Original model: https://huggingface.co/leia-llm/Leia-Swallow-7b/
 
| Name | Quant method | Size | 
|---|---|---|
| Leia-Swallow-7b.Q2_K.gguf | Q2_K | 2.41GB | 
| Leia-Swallow-7b.IQ3_XS.gguf | IQ3_XS | 2.66GB | 
| Leia-Swallow-7b.IQ3_S.gguf | IQ3_S | 2.8GB | 
| Leia-Swallow-7b.Q3_K_S.gguf | Q3_K_S | 2.8GB | 
| Leia-Swallow-7b.IQ3_M.gguf | IQ3_M | 2.95GB | 
| Leia-Swallow-7b.Q3_K.gguf | Q3_K | 3.13GB | 
| Leia-Swallow-7b.Q3_K_M.gguf | Q3_K_M | 3.13GB | 
| Leia-Swallow-7b.Q3_K_L.gguf | Q3_K_L | 3.4GB | 
| Leia-Swallow-7b.IQ4_XS.gguf | IQ4_XS | 3.45GB | 
| Leia-Swallow-7b.Q4_0.gguf | Q4_0 | 3.62GB | 
| Leia-Swallow-7b.IQ4_NL.gguf | IQ4_NL | 3.64GB | 
| Leia-Swallow-7b.Q4_K_S.gguf | Q4_K_S | 3.65GB | 
| Leia-Swallow-7b.Q4_K.gguf | Q4_K | 3.86GB | 
| Leia-Swallow-7b.Q4_K_M.gguf | Q4_K_M | 3.86GB | 
| Leia-Swallow-7b.Q4_1.gguf | Q4_1 | 4.01GB | 
| Leia-Swallow-7b.Q5_0.gguf | Q5_0 | 4.4GB | 
| Leia-Swallow-7b.Q5_K_S.gguf | Q5_K_S | 4.4GB | 
| Leia-Swallow-7b.Q5_K.gguf | Q5_K | 4.52GB | 
| Leia-Swallow-7b.Q5_K_M.gguf | Q5_K_M | 4.52GB | 
| Leia-Swallow-7b.Q5_1.gguf | Q5_1 | 4.78GB | 
| Leia-Swallow-7b.Q6_K.gguf | Q6_K | 5.22GB | 
| Leia-Swallow-7b.Q8_0.gguf | Q8_0 | 6.76GB | 
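
As a rough aid for choosing a file from the table above, the sketch below picks the largest quantization whose file size fits a given budget. The sizes and the file name pattern are copied from the table; the helper names `largest_quant_within` and `gguf_filename` are hypothetical and not part of any library. The budget is compared against file size only, so it is an assumption that the rest of your memory needs (context, KV cache, runtime overhead) are accounted for separately.

```python
# File sizes in GB, copied from the table above (insertion order = table order).
QUANT_SIZES_GB = {
    "Q2_K": 2.41, "IQ3_XS": 2.66, "IQ3_S": 2.8, "Q3_K_S": 2.8,
    "IQ3_M": 2.95, "Q3_K": 3.13, "Q3_K_M": 3.13, "Q3_K_L": 3.4,
    "IQ4_XS": 3.45, "Q4_0": 3.62, "IQ4_NL": 3.64, "Q4_K_S": 3.65,
    "Q4_K": 3.86, "Q4_K_M": 3.86, "Q4_1": 4.01, "Q5_0": 4.4,
    "Q5_K_S": 4.4, "Q5_K": 4.52, "Q5_K_M": 4.52, "Q5_1": 4.78,
    "Q6_K": 5.22, "Q8_0": 6.76,
}


def largest_quant_within(budget_gb, sizes=QUANT_SIZES_GB):
    """Return the quant method with the largest file that fits, or None.

    Hypothetical helper: it compares file sizes only and ignores
    runtime overhead. Ties resolve to the entry listed first.
    """
    fitting = {name: size for name, size in sizes.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def gguf_filename(quant):
    """Build the file name using the pattern shown in the table."""
    return f"Leia-Swallow-7b.{quant}.gguf"


if __name__ == "__main__":
    choice = largest_quant_within(3.0)
    if choice is not None:
        print(gguf_filename(choice))
```

Several rows share the same size (for example Q4_K and Q4_K_M), so when the budget lands on such a pair the helper simply returns whichever appears first in the table.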
Original model description:
- license: apache-2.0
- language: ja
Leia-Swallow-7B
LEIA is a training technique for autoregressive LLMs that effectively improves their performance in languages other than English by enhancing cross-lingual knowledge transfer from English to a target language. This model is constructed by applying LEIA to Swallow, a Japanese-English bilingual LLM based on LLaMA 2. The model achieves enhanced performance on six Japanese question-answering benchmarks, as reported below.
Please refer to our paper or blog post (in Japanese) for further technical details.
- LEIA: Facilitating Cross-Lingual Knowledge Transfer in Language Models with Entity-based Data Augmentation (arxiv.org)
- LEIA: 言語間転移学習でLLMを賢くする新しい方法 ("LEIA: A new way to make LLMs smarter with cross-lingual transfer learning", in Japanese) (zenn.dev)
 
Model List
Empirical Results
The model is evaluated on the following six question-answering benchmarks:
- X-CODAH
- X-CSQA
- JCommonsenseQA
- NIILC
- JEMHopQA
- JAQKET v2
 
| Model | X-CODAH | X-CSQA | JCommonsenseQA | NIILC | JEMHopQA | JAQKET v2 | 
|---|---|---|---|---|---|---|
| Swallow | 42.0 | 41.0 | 80.3 | 59.5 | 50.8 | 86.2 | 
| LEIA | 42.7 | 42.4 | 80.6 | 60.3 | 54.7 | 86.5 | 
For further details of this experiment, please refer to our paper.
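
To make the comparison in the table easier to read, the sketch below computes the per-benchmark difference between the LEIA and Swallow rows and their mean. The scores are copied from the table; the function name `score_gains` is hypothetical, and the unweighted mean is only a convenience summary that the original authors do not report.

```python
# Scores copied from the results table above, in column order.
BENCHMARKS = ["X-CODAH", "X-CSQA", "JCommonsenseQA", "NIILC", "JEMHopQA", "JAQKET v2"]
SWALLOW = [42.0, 41.0, 80.3, 59.5, 50.8, 86.2]
LEIA = [42.7, 42.4, 80.6, 60.3, 54.7, 86.5]


def score_gains(baseline, improved):
    """Return improved minus baseline per benchmark, rounded to one decimal."""
    return [round(new - old, 1) for old, new in zip(baseline, improved)]


gains = dict(zip(BENCHMARKS, score_gains(SWALLOW, LEIA)))

# Unweighted mean over the six benchmarks (an illustrative summary only).
mean_gain = round(sum(gains.values()) / len(gains), 2)

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: {gain:+.1f}")
    print(f"mean gain: {mean_gain:+.2f}")
```

On these numbers the LEIA row is higher on all six benchmarks, with the largest difference on JEMHopQA.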
Contributors
- Ikuya Yamada (Studio Ousia, RIKEN)
- Ryokan Ri (LY Corporation, SB Intuitions)
 