https://imbue.com/research/70b-infrastructure From bare metal to a 70B model: infrastructure set-up and scripts. The Imbue Team, June 25, 2024. "…we trained a 70B parameter model from scratch on our own infrastructure that outperformed…"
What I Read: Nvidia, GPU gold rush
https://blog.johnluttig.com/p/nvidia-envy-understanding-the-gpu Nvidia Envy: understanding the GPU gold rush. John Luttig, Nov 10, 2023. "In 2023, thousands of companies and countries begged Nvidia to purchase more GPUs. Can the exponential demand endure?"
What I Read: Distributed Training, Finetuning
https://sumanthrh.com/post/distributed-and-efficient-finetuning/ Everything about Distributed Training and Efficient Finetuning. Sumanth R Hegde, last updated on Oct 13, 2023. "practical guidelines and gotchas with multi-GPU and multi-node training"