Does it work to quantize an existing LLM into 1.58-bit?

#1
by SilverJim - opened

Hello, I wonder: does it work to quantize an existing LLM into 1.58-bit?
Regarding 1.58-bit quantization, I think quantizing an existing LLM into 1.58-bit loses too much precision; it seems 1.58-bit quantization can only get a good result when the LLM is trained as a 1.58-bit model from the beginning.
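To make the precision concern concrete, here is a minimal sketch of BitNet b1.58-style absmean ternary quantization applied post-hoc to pretrained-style weights, assuming NumPy; the function name and the synthetic Gaussian weights are illustrative, not from any particular model. Rounding every weight to {-1, 0, +1} with a single per-tensor scale leaves a large reconstruction error, which is the precision loss described above:

```python
import numpy as np

def absmean_ternary_quantize(w, eps=1e-8):
    # BitNet b1.58-style absmean quantization: scale by the mean
    # absolute value, then round each weight to {-1, 0, +1}.
    scale = np.mean(np.abs(w)) + eps
    q = np.clip(np.round(w / scale), -1, 1)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=4096)   # stand-in for pretrained FP weights
q, scale = absmean_ternary_quantize(w)
w_hat = q * scale                    # dequantized approximation
rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
print(f"relative reconstruction error: {rel_err:.2f}")
```

The per-layer error itself is moderate, but it compounds across dozens of transformer layers, which is why quantization-aware training from the start (letting the model adapt to the ternary constraint) tends to fare better than converting an existing checkpoint.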

I'm currently battling the quality-loss and compounding-error issues. No luck yet, but once I sort it out I'll update the weights in place and update the readme and model card.