xpqiu committed on
Commit 4235d7d · verified · 1 Parent(s): 75cdf21

Update README.md

Files changed (1):
  1. README.md +15 -5
README.md CHANGED
@@ -61,12 +61,22 @@ Yunfan Shao, Zhichao Geng, Yitao Liu, Junqi Dai, Fei Yang, Li Zhe, Hujun Bao, Xi
  **Note: Please use BertTokenizer for the model vocabulary. DO NOT use original BartTokenizer.**
 
  ## Citation
+ Shao, Y., Geng, Z., Liu, Y. et al. CPT: a pre-trained unbalanced transformer for both Chinese language understanding and generation. Sci. China Inf. Sci. 67, 152102 (2024).
+ https://www.sciengine.com/SCIS/doi/10.1007/s11432-021-3536-5
 
  ```bibtex
- @article{shao2021cpt,
-   title={CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language Understanding and Generation},
-   author={Yunfan Shao and Zhichao Geng and Yitao Liu and Junqi Dai and Fei Yang and Li Zhe and Hujun Bao and Xipeng Qiu},
-   journal={arXiv preprint arXiv:2109.05729},
-   year={2021}
+ @Article{Shao2024a,
+   author   = {Shao, Yunfan and Geng, Zhichao and Liu, Yitao and Dai, Junqi and Yan, Hang and Yang, Fei and Li, Zhe and Bao, Hujun and Qiu, Xipeng},
+   journal  = {Science China Information Sciences},
+   title    = {CPT: a pre-trained unbalanced transformer for both Chinese language understanding and generation},
+   year     = {2024},
+   issn     = {1869-1919},
+   number   = {5},
+   pages    = {152102},
+   volume   = {67},
+   abstract = {In this paper, we take the advantage of previous pre-trained models (PTMs) and propose a novel Chinese pre-trained unbalanced transformer (CPT). Different from previous Chinese PTMs, CPT is designed to utilize the shared knowledge between natural language understanding (NLU) and natural language generation (NLG) to boost the performance. CPT consists of three parts: a shared encoder, an understanding decoder, and a generation decoder. Two specific decoders with a shared encoder are pre-trained with masked language modeling (MLM) and denoising auto-encoding (DAE) tasks, respectively. With the partially shared architecture and multi-task pre-training, CPT can (1) learn specific knowledge of both NLU or NLG tasks with two decoders and (2) be fine-tuned flexibly that fully exploits the potential of the model. Moreover, the unbalanced transformer saves the computational and storage cost, which makes CPT competitive and greatly accelerates the inference of text generation. Experimental results on a wide range of Chinese NLU and NLG tasks show the effectiveness of CPT.},
+   doi      = {10.1007/s11432-021-3536-5},
+   refid    = {Shao2024},
+   url      = {https://doi.org/10.1007/s11432-021-3536-5},
  }
  ```
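
For reference, the tokenizer note carried through unchanged in the diff above boils down to something like the following sketch. It assumes the `transformers` library and uses `fnlp/cpt-base` as a placeholder checkpoint name (an assumption; substitute the repository this README belongs to).

```python
# Minimal sketch of the note above: the CPT checkpoints ship a BERT-style
# Chinese vocabulary, so load it with BertTokenizer, NOT the original
# BartTokenizer. "fnlp/cpt-base" is an assumed repo name for illustration.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("fnlp/cpt-base")
inputs = tokenizer("欢迎使用CPT。", return_tensors="pt")
print(inputs["input_ids"])
```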