Quantization made by Richard Erkhov.

Leia-Swallow-7b - GGUF

Model creator: https://huggingface.co/leia-llm/
Original model: https://huggingface.co/leia-llm/Leia-Swallow-7b/

Name	Quant method	Size
Leia-Swallow-7b.Q2_K.gguf	Q2_K	2.41GB
Leia-Swallow-7b.IQ3_XS.gguf	IQ3_XS	2.66GB
Leia-Swallow-7b.IQ3_S.gguf	IQ3_S	2.8GB
Leia-Swallow-7b.Q3_K_S.gguf	Q3_K_S	2.8GB
Leia-Swallow-7b.IQ3_M.gguf	IQ3_M	2.95GB
Leia-Swallow-7b.Q3_K.gguf	Q3_K	3.13GB
Leia-Swallow-7b.Q3_K_M.gguf	Q3_K_M	3.13GB
Leia-Swallow-7b.Q3_K_L.gguf	Q3_K_L	3.4GB
Leia-Swallow-7b.IQ4_XS.gguf	IQ4_XS	3.45GB
Leia-Swallow-7b.Q4_0.gguf	Q4_0	3.62GB
Leia-Swallow-7b.IQ4_NL.gguf	IQ4_NL	3.64GB
Leia-Swallow-7b.Q4_K_S.gguf	Q4_K_S	3.65GB
Leia-Swallow-7b.Q4_K.gguf	Q4_K	3.86GB
Leia-Swallow-7b.Q4_K_M.gguf	Q4_K_M	3.86GB
Leia-Swallow-7b.Q4_1.gguf	Q4_1	4.01GB
Leia-Swallow-7b.Q5_0.gguf	Q5_0	4.4GB
Leia-Swallow-7b.Q5_K_S.gguf	Q5_K_S	4.4GB
Leia-Swallow-7b.Q5_K.gguf	Q5_K	4.52GB
Leia-Swallow-7b.Q5_K_M.gguf	Q5_K_M	4.52GB
Leia-Swallow-7b.Q5_1.gguf	Q5_1	4.78GB
Leia-Swallow-7b.Q6_K.gguf	Q6_K	5.22GB
Leia-Swallow-7b.Q8_0.gguf	Q8_0	6.76GB
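
As a rough aid for choosing among the files above, the following sketch picks the largest quantization whose file fits a given memory budget. The sizes are copied from the table; the function names and the 1 GB headroom default are illustrative assumptions, not part of this repository, and real memory use also depends on context length and the runtime.

```python
# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 2.41, "IQ3_XS": 2.66, "IQ3_S": 2.8, "Q3_K_S": 2.8,
    "IQ3_M": 2.95, "Q3_K": 3.13, "Q3_K_M": 3.13, "Q3_K_L": 3.4,
    "IQ4_XS": 3.45, "Q4_0": 3.62, "IQ4_NL": 3.64, "Q4_K_S": 3.65,
    "Q4_K": 3.86, "Q4_K_M": 3.86, "Q4_1": 4.01, "Q5_0": 4.4,
    "Q5_K_S": 4.4, "Q5_K": 4.52, "Q5_K_M": 4.52, "Q5_1": 4.78,
    "Q6_K": 5.22, "Q8_0": 6.76,
}


def pick_quant(budget_gb, headroom_gb=1.0):
    """Return the largest quant whose file size plus headroom fits the budget.

    headroom_gb is an assumed allowance for the KV cache and runtime
    overhead; it is a guess, not a measured value. Returns None if
    nothing fits.
    """
    fitting = {
        name: size
        for name, size in QUANT_SIZES_GB.items()
        if size + headroom_gb <= budget_gb
    }
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def filename_for(quant):
    """Build the file name following the naming pattern used in the table."""
    return f"Leia-Swallow-7b.{quant}.gguf"


if __name__ == "__main__":
    choice = pick_quant(6.0)
    if choice is not None:
        print(filename_for(choice))
```

Larger files generally preserve more of the original model's quality, so taking the biggest file that fits is a reasonable default; the resulting file name can then be passed to any GGUF-compatible runtime.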

Original model description:

license: apache-2.0
language:
- ja

Leia-Swallow-7B

LEIA is a training technique for autoregressive LLMs that effectively improves their performance in languages other than English by enhancing cross-lingual knowledge transfer from English to a target language. This model is constructed by applying LEIA to Swallow, a Japanese-English bilingual LLM based on LLaMA 2. The model achieves enhanced performance on six Japanese question-answering benchmarks, as reported below.

Please refer to our paper or blog post (in Japanese) for further technical details.

Model List

Empirical Results

The model is assessed on the following six question-answering benchmarks:

X-CODAH
X-CSQA
JCommonsenseQA
NIILC
JEMHopQA
JAQKET v2

Model	X-CODAH	X-CSQA	JCommonsenseQA	NIILC	JEMHopQA	JAQKET v2
Swallow	42.0	41.0	80.3	59.5	50.8	86.2
LEIA	42.7	42.4	80.6	60.3	54.7	86.5
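
To make the comparison easier to read, the short sketch below computes the per-benchmark difference between the two rows of the table and the mean difference across all six benchmarks. The scores are copied from the table; the variable names are illustrative, and the script does not say anything about statistical significance.

```python
BENCHMARKS = ["X-CODAH", "X-CSQA", "JCommonsenseQA", "NIILC", "JEMHopQA", "JAQKET v2"]

# Scores copied from the table above.
SWALLOW = [42.0, 41.0, 80.3, 59.5, 50.8, 86.2]
LEIA = [42.7, 42.4, 80.6, 60.3, 54.7, 86.5]

# Per-benchmark gain of LEIA over Swallow, rounded to the table's precision.
gains = {
    name: round(leia - swallow, 1)
    for name, swallow, leia in zip(BENCHMARKS, SWALLOW, LEIA)
}

# Unweighted mean gain across the six benchmarks.
mean_gain = round(sum(gains.values()) / len(gains), 2)

# Benchmark with the largest gain.
best = max(gains, key=gains.get)

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name}: +{gain}")
    print(f"mean gain: +{mean_gain} (largest on {best})")
```

On these numbers the gain is positive on every benchmark, with the largest difference on JEMHopQA.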

For further details of this experiment, please refer to our paper.

Contributors

Ikuya Yamada (Studio Ousia, RIKEN)
Ryokan Ri (LY Corporation, SB Intuitions)

Format: GGUF
Model size: 7B params
Architecture: llama
