What I Read: 1-bit LLMs, 1.58 Bits

https://arxiv.org/abs/2402.17764

The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Shuming Ma, Hongyu Wang, Lingxiao Ma, Lei Wang, Wenhui Wang, Shaohan Huang, Li Dong, Ruiping Wang, Jilong Xue, Furu Wei
Submitted on 27 Feb 2024


“…we introduce a 1-bit LLM variant, namely BitNet b1.58… It matches the full-precision… Transformer LLM with the same model size and training tokens in terms of both perplexity and end-task performance, while being significantly more cost-effective…”
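The "1.58 bits" comes from the weights being ternary: every parameter takes one of three values {-1, 0, +1}, and log2(3) ≈ 1.58. Below is a minimal sketch of the absmean quantization the paper describes (scale the weight matrix by its mean absolute value, then round-and-clip each entry to {-1, 0, +1}); the NumPy function, parameter names, and usage are my own illustration, not the authors' code.

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray, eps: float = 1e-5):
    """Quantize a weight matrix to {-1, 0, +1} via absmean scaling (sketch)."""
    gamma = np.abs(w).mean()                        # average absolute weight
    w_scaled = w / (gamma + eps)                    # scale so values cluster around {-1, 0, 1}
    w_ternary = np.clip(np.round(w_scaled), -1, 1)  # round-and-clip to ternary values
    return w_ternary.astype(np.int8), gamma         # ternary weights + scale for dequantization

# Usage: multiplying the ternary matrix by gamma approximates the original weights,
# so matmuls reduce to additions/subtractions plus one rescale.
w = np.random.randn(4, 8).astype(np.float32)
w_q, scale = absmean_ternary_quantize(w)
print(np.unique(w_q))  # entries drawn only from {-1, 0, 1}
```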