https://magazine.sebastianraschka.com/p/accelerating-pytorch-model-training Accelerating PyTorch Model Training Using Mixed-Precision and Fully Sharded Data Parallelism Sebastian Raschka, PhD Jun 26, 2023 “…how to scale PyTorch model training with minimal code changes. The focus here is on
What I Read: Abilities Emerging From AI
https://www.quantamagazine.org/the-unpredictable-abilities-emerging-from-large-ai-models-20230316/ The Unpredictable Abilities Emerging From Large AI Models “Large language models like ChatGPT are now big enough that they’ve started to display startling, unpredictable behaviors.”
What I Read: Geometric Deep Learning
https://thegradient.pub/towards-geometric-deep-learning/ Towards Geometric Deep Learning Michael Bronstein 18.Feb.2023 “Geometric Deep Learning is an umbrella term for approaches considering a broad class of ML problems from the perspectives of symmetry and invariance.”
What I Read: Realtime User Actions in Recommendation
https://medium.com/pinterest-engineering/how-pinterest-leverages-realtime-user-actions-in-recommendation-to-boost-homefeed-engagement-volume-165ae2e8cde8 How Pinterest Leverages Realtime User Actions in Recommendation to Boost Homefeed Engagement Volume Xue Xia, Software Engineer, Homefeed Ranking; Neng Gu, Software Engineer, Content & User Understanding; Dhruvil Deven Badani,
What I Read: Transformers Training
https://www.borealisai.com/research-blogs/tutorial-17-transformers-iii-training/ Tutorial #17: Transformers III Training 08/06/2021 P. Xu, S. Prince “…we discuss challenges with transformer training dynamics and introduce some of the tricks that practitioners use to get transformers to converge.”