What I Read: Parallelizing neural networks, GPU, JAX

https://willwhitney.com/parallel-training-jax.html

Will Whitney
Parallelizing neural networks on one GPU with JAX

How you can get a 100x speedup for training small neural networks by making the most of your accelerator.
“Most neural network libraries these days give amazing computational performance for training large neural networks. But small networks… leave a lot of available compute unused…. I describe how to get your money’s worth by training dozens of networks at once.”
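
To make the idea in the quote concrete, here is a minimal sketch of training several small networks at once on one accelerator. It is not taken from the linked article: the use of `jax.vmap` to batch over networks, the tiny MLP, the toy sine-regression data, and every name below (`init_params`, `sgd_step`, `N_NETWORKS`, and so on) are illustrative assumptions.

```python
import jax
import jax.numpy as jnp

# Hypothetical tiny MLP; the sizes are arbitrary choices for illustration.
def init_params(key, in_dim=1, hidden=32, out_dim=1):
    k1, k2 = jax.random.split(key)
    return {
        "w1": jax.random.normal(k1, (in_dim, hidden)) * 0.5,
        "b1": jnp.zeros(hidden),
        "w2": jax.random.normal(k2, (hidden, out_dim)) * 0.5,
        "b2": jnp.zeros(out_dim),
    }

def predict(params, x):
    h = jnp.tanh(x @ params["w1"] + params["b1"])
    return h @ params["w2"] + params["b2"]

def loss_fn(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

# One plain gradient-descent step for a single network.
def sgd_step(params, x, y, lr=0.05):
    loss, grads = jax.value_and_grad(loss_fn)(params, x, y)
    new_params = jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
    return new_params, loss

N_NETWORKS = 16  # assumed count; the quote only says "dozens"

# One PRNG key per network, so each network gets its own initialization.
keys = jax.random.split(jax.random.PRNGKey(0), N_NETWORKS)
# vmap over the keys: every parameter array gains a leading axis of size N_NETWORKS.
batched_params = jax.vmap(init_params)(keys)

# Map the single-network step over the network axis of params (axis 0),
# while sharing the same data across all networks (None), then compile it.
parallel_step = jax.jit(jax.vmap(sgd_step, in_axes=(0, None, None)))

# Toy regression data (an assumption for this sketch): fit y = sin(3x).
x = jnp.linspace(-1.0, 1.0, 64).reshape(-1, 1)
y = jnp.sin(3.0 * x)

first_losses = None
for step in range(200):
    batched_params, losses = parallel_step(batched_params, x, y)
    if first_losses is None:
        first_losses = losses  # per-network losses before any update

print("mean loss before training:", float(first_losses.mean()))
print("mean loss after training: ", float(losses.mean()))
```

Each call to `parallel_step` updates all 16 networks in a single compiled computation rather than in a Python loop over networks, which is the kind of batching that can keep an accelerator busy when each individual network is small. Any speedup you actually see will depend on the hardware and the network size.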