JPharmatron-7B-base
JPharmatron-7B-base is a 7B large language model designed for pharmaceutical applications and research.
It is continually pre-trained from Qwen2.5-7B on 2B tokens of Japanese data.
This model has not undergone any post-training, including instruction fine-tuning, so direct use for downstream tasks is not recommended. It has also not been validated for medical use or any other risk-sensitive application.
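For reference, here is a minimal sketch of loading the model for plain text completion with Hugging Face transformers. The repository ID `EQUES/JPharmatron-7B-base` and the example prompt are assumptions for illustration; check the Hub page for the actual ID.

```python
# Minimal sketch: plain text completion with the base model.
# NOTE: the repo ID below is an assumption; verify it on the Hugging Face Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EQUES/JPharmatron-7B-base"  # assumed repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Base model without instruction tuning: use raw completion, no chat template.
prompt = "アスピリンは"  # example Japanese prompt ("Aspirin is ...")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since the model is not instruction-tuned, prompts should be framed as text to be continued rather than as questions or instructions.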
The paper has been accepted to IJCNLP-AACL 2025; we will update the BibTeX entry below accordingly.
BibTeX:
@misc{ono2025japaneselanguagemodelnew,
  title={A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP},
  author={Shinnosuke Ono and Issey Sukeda and Takuro Fujii and Kosei Buma and Shunsuke Sasaki},
  year={2025},
  eprint={2505.16661},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.16661},
}
See our preprint: A Japanese Language Model and Three New Evaluation Benchmarks for Pharmaceutical NLP.