https://willwhitney.com/parallel-training-jax.html Parallelizing neural networks on one GPU with JAX, by Will Whitney: “How you can get a 100x speedup for training small neural networks by making the most of your accelerator.” “Most neural …”
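The technique the article is known for is vectorizing whole training steps across many small models with `jax.vmap`, so one GPU runs them all at once. The sketch below is my own minimal illustration of that idea (all names like `init_params` and the tiny MLP are illustrative choices, not the article's code):

```python
# Sketch: train many small MLPs in parallel on one accelerator with jax.vmap.
# The network, sizes, and names here are illustrative, not from the article.
import jax
import jax.numpy as jnp

def init_params(key, in_dim=4, hidden=8, out_dim=1):
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (in_dim, hidden)) * 0.1,
        "w2": jax.random.normal(k2, (hidden, out_dim)) * 0.1,
    }

def predict(params, x):
    return jnp.tanh(x @ params["w1"]) @ params["w2"]

def loss_fn(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

def update(params, x, y, lr=0.1):
    g = jax.grad(loss_fn)(params, x, y)
    return jax.tree_util.tree_map(lambda p, gi: p - lr * gi, params, g)

# One stacked set of parameters per model; vmap turns the per-model update
# into a single batched computation instead of a Python loop over models.
n_models = 32
keys = jax.random.split(jax.random.PRNGKey(0), n_models)
batched_params = jax.vmap(init_params)(keys)

x = jax.random.normal(jax.random.PRNGKey(1), (16, 4))
y = jnp.sum(x, axis=1, keepdims=True)

# Map over the stacked params axis only; the data batch is shared by all models.
parallel_update = jax.jit(jax.vmap(update, in_axes=(0, None, None)))
for _ in range(100):
    batched_params = parallel_update(batched_params, x, y)
```

Because the models are small, each one underutilizes the GPU on its own; batching them through `vmap` is what recovers the large speedup the article measures.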
What I Read: Parallel Bayesian Optimization
https://medium.com/riskified-technology/from-sequential-to-parallel-a-story-about-parallel-bayesian-hyperparameter-optimization-f124ed29f556 From Sequential to Parallel, a Story About Parallel Bayesian Hyperparameter Optimization, by Andres Asaravicius, Apr 26: “The goal of this process is to find the best set of parameters, and our goal was …”
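The article's subject is parallelizing Bayesian optimization; a full Gaussian-process surrogate is beyond a short sketch, so the toy below substitutes random search to show just the sequential-to-parallel pattern itself: propose a batch of candidates, evaluate them concurrently, keep the best, repeat. The objective and hyperparameter names are made up for illustration:

```python
# Sketch of batched hyperparameter search. Random search stands in for the
# Bayesian surrogate here; the point is evaluating a batch concurrently
# rather than one configuration at a time.
import random
from concurrent.futures import ThreadPoolExecutor

def objective(cfg):
    # Toy "validation loss", minimized at lr=0.1, reg=0.01 (illustrative).
    return (cfg["lr"] - 0.1) ** 2 + (cfg["reg"] - 0.01) ** 2

def propose(rng, n):
    return [{"lr": rng.uniform(0, 1), "reg": rng.uniform(0, 0.1)} for _ in range(n)]

def parallel_search(rounds=5, batch=8, seed=0):
    rng = random.Random(seed)
    best_cfg, best_loss = None, float("inf")
    with ThreadPoolExecutor() as pool:
        for _ in range(rounds):
            candidates = propose(rng, batch)
            # One round evaluates a whole batch in parallel instead of sequentially.
            for cfg, loss in zip(candidates, pool.map(objective, candidates)):
                if loss < best_loss:
                    best_cfg, best_loss = cfg, loss
    return best_cfg, best_loss
```

In real Bayesian optimization the `propose` step would come from an acquisition function over a surrogate model, which is exactly where the sequential dependence (and the difficulty of parallelizing) arises.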
What I Read: Deep Learning Recommendation Models
https://www.kdnuggets.com/2021/04/deep-learning-recommendation-models-dlrm-deep-dive.html Deep Learning Recommendation Models (DLRM): A Deep Dive, by Nishant Kumar, Data Science Professional: “This deep dive article presents the architecture and deployment issues experienced with the deep learning recommendation …”
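The DLRM layout the article walks through has a well-known shape: dense features go through a bottom MLP, categorical features through embedding lookups, all feature vectors interact via pairwise dot products, and a top MLP produces the score. A minimal NumPy rendering of that shape (all dimensions, vocab sizes, and weight initializations are illustrative, and the final sigmoid of a real CTR model is omitted):

```python
# Minimal NumPy sketch of the DLRM architecture: bottom MLP for dense
# features, embedding tables for categorical features, pairwise dot-product
# interactions, top MLP. All sizes are illustrative.
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM = 8

# One embedding table per categorical feature (vocab sizes made up).
tables = [rng.normal(0, 0.1, size=(vocab, EMB_DIM)) for vocab in (100, 50)]

def mlp(x, sizes, rng):
    # ReLU hidden layers, linear output layer.
    for out in sizes[:-1]:
        w = rng.normal(0, 0.1, size=(x.shape[-1], out))
        x = np.maximum(x @ w, 0.0)
    w = rng.normal(0, 0.1, size=(x.shape[-1], sizes[-1]))
    return x @ w

def dlrm_forward(dense, cat_ids):
    # Bottom MLP projects dense features to the embedding dimension.
    d = mlp(dense, (16, EMB_DIM), np.random.default_rng(1))
    embs = [t[i] for t, i in zip(tables, cat_ids)]        # embedding lookups
    feats = np.stack([d] + embs, axis=1)                  # (batch, n_feats, EMB_DIM)
    # Pairwise dot products between every pair of feature vectors.
    inter = np.einsum("bie,bje->bij", feats, feats)
    iu = np.triu_indices(feats.shape[1], k=1)
    inter = inter[:, iu[0], iu[1]]                        # keep upper triangle only
    top_in = np.concatenate([d, inter], axis=1)
    return mlp(top_in, (16, 1), np.random.default_rng(2)) # top MLP -> score

batch = 4
dense = rng.normal(size=(batch, 13))
cat_ids = [rng.integers(0, 100, size=batch), rng.integers(0, 50, size=batch)]
scores = dlrm_forward(dense, cat_ids)                     # shape (batch, 1)
```

The deployment issues the article covers mostly stem from the embedding tables, which at production vocab sizes dwarf the MLP weights in memory.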
What I Read: Attention with Performers
https://ai.googleblog.com/2020/10/rethinking-attention-with-performers.html Rethinking Attention with Performers, Friday, October 23, 2020, posted by Krzysztof Choromanski and Lucy Colwell, Research Scientists, Google Research: “To resolve these issues, we introduce the Performer, a Transformer architecture with …”
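The Performer's core move (FAVOR+) is approximating the softmax kernel exp(q·k) with positive random features, so attention can be computed by reassociating the matrix product and the quadratic cost in sequence length drops to linear. A small NumPy sketch of that idea, following the general recipe rather than the paper's exact implementation (feature counts, scaling, and shapes here are illustrative, and the usual 1/sqrt(d) logit scaling is folded into the inputs):

```python
# Sketch of Performer-style linear attention: replace exp(q.k) with the
# inner product of positive random features, then use associativity to
# avoid ever forming the L x L attention matrix.
import numpy as np

def softmax_attention(Q, K, V):
    A = np.exp(Q @ K.T)                                  # L x L kernel matrix
    return (A / A.sum(axis=1, keepdims=True)) @ V

def positive_features(X, W):
    # phi(x) = exp(W x - |x|^2 / 2) / sqrt(m): an unbiased, always-positive
    # random-feature map for the kernel exp(q.k).
    m = W.shape[0]
    return np.exp(X @ W.T - 0.5 * np.sum(X**2, axis=1, keepdims=True)) / np.sqrt(m)

def performer_attention(Q, K, V, n_features=1024, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_features, Q.shape[1]))        # random projections
    Qp, Kp = positive_features(Q, W), positive_features(K, W)
    # Associativity: (Qp Kp^T) V == Qp (Kp^T V); the right side costs O(L),
    # not O(L^2), in the sequence length.
    num = Qp @ (Kp.T @ V)
    den = Qp @ Kp.sum(axis=0)
    return num / den[:, None]

rng = np.random.default_rng(1)
L, d = 6, 4
Q, K = 0.5 * rng.normal(size=(L, d)), 0.5 * rng.normal(size=(L, d))
V = rng.normal(size=(L, d))
exact = softmax_attention(Q, K, V)
approx = performer_attention(Q, K, V)
```

With enough random features the approximation tracks exact softmax attention closely, which is the property that lets Performers scale to sequence lengths where the quadratic attention matrix would not fit.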