https://twiecki.io/blog/2024/01/04/
An Intuitive Guide to Self-Attention in GPT: The Venetian Masquerade
Thomas Wiecki
January 4, 2024
“In AI, especially with something as intricate as self-attention, it’s easy to get lost in the
What I Read: Will Scaling Solve Robotics?
https://nishanthjkumar.com/Will-Scaling-Solve-Robotics-Perspectives-from-CoRL-2023/
Will Scaling Solve Robotics?: Perspectives from CoRL 2023
Nishanth J. Kumar
“…is training a large neural network on a very large dataset a feasible way to solve robotics?”
What I Read: Distributed Training, Finetuning
https://sumanthrh.com/post/distributed-and-efficient-finetuning/
Everything about Distributed Training and Efficient Finetuning
Sumanth R Hegde
Last updated on Oct 13, 2023
“practical guidelines and gotchas with multi-GPU and multi-node training”