https://huggingface.co/blog/bloom-megatron-deepspeed
The Technology Behind BLOOM Training
Stas Bekman
Published July 14, 2022.
“…training ever larger language models has become the norm… the hidden knowledge about how to train such models rarely gets any attention. This article aims to change this by shedding some light on the technology and engineering behind training such models…”