You Only Cache Once: Decoder-Decoder Architectures for Language Models
Paper
• 2405.05254 • Published
Additional paper with FAQ, code, and tips: https://github.com/microsoft/unilm/blob/master/bitnet/The-Era-of-1-bit-LLMs__Training_Tips_Code_FAQ.pdf