Update README.md
README.md CHANGED
@@ -12,6 +12,9 @@ language:
 tags:
 - qwen
 inference: false
+license: other
+license_name: tongyi-qianwen-license-agreement
+license_link: https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT
 ---
 
 # `rinna/nekomata-14b`
@@ -48,7 +51,7 @@ The name `nekomata` comes from the Japanese word [`猫又/ねこまた/Nekomata`
 
 `nekomata-14B` was trained on 16 nodes of Amazon EC2 trn1.32xlarge instances powered by the AWS Trainium purpose-built ML accelerator chip. The pre-training job was completed within approximately 7 days.
 
-* **
+* **Contributors**
 
 - [Tianyu Zhao](https://huggingface.co/tianyuz)
 - [Akio Kaga](https://huggingface.co/rakaga)
@@ -118,10 +121,19 @@ We compared the `Qwen` tokenizer (as used in `nekomata`) and the `llama-2` tokenizer
 
 # How to cite
 ~~~
-@misc{
-
-title={rinna/nekomata-14b},
+@misc{rinna-nekomata-14b,
+title = {rinna/nekomata-14b},
 author={Zhao, Tianyu and Kaga, Akio and Sawada, Kei}
+url = {https://huggingface.co/rinna/nekomata-14b},
+}
+
+@inproceedings{sawada2024release,
+title = {Release of Pre-Trained Models for the {J}apanese Language},
+author = {Sawada, Kei and Zhao, Tianyu and Shing, Makoto and Mitsui, Kentaro and Kaga, Akio and Hono, Yukiya and Wakatsuki, Toshiaki and Mitsuda, Koh},
+booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
+month = {5},
+year = {2024},
+url = {https://arxiv.org/abs/2404.01657},
 }
 ~~~
 ---
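As a usage note (not part of this commit): a minimal sketch of loading the model this README documents, via Hugging Face `transformers`. Assumptions not taken from the diff: that `transformers` and `torch` are installed, that bf16 weights fit the available device memory, and that `trust_remote_code=True` is required because `nekomata` follows the Qwen architecture.

~~~python
# Minimal sketch: loading rinna/nekomata-14b with Hugging Face transformers.
# Assumption: trust_remote_code=True is needed for the Qwen-based model code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rinna/nekomata-14b"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits the available GPU memory
    device_map="auto",
    trust_remote_code=True,
)

prompt = "西田幾多郎は、"  # a Japanese continuation prompt, since nekomata targets Japanese
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
~~~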