https://deepnote.com/@martin-molnar/Gradient-Pseudo-Swap-F2u_aTbqTOisNYCH_JORwg Martin Molnar, Gradient Pseudo-Swap: “When we have layers in a neural network we want to train with gradient descent, but those layers don’t have smooth gradients that can be used,