https://deepnote.com/@martin-molnar/Gradient-Pseudo-Swap-F2u_aTbqTOisNYCH_JORwg
Martin Molnar
Gradient Pseudo-Swap
“When a neural network that we want to train with gradient descent contains layers without smooth, usable gradients, we can employ a ‘gradient pseudo-swap’…”
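The excerpt is cut off before it defines the technique, so the following is only a minimal sketch under one assumed reading: the forward pass keeps the non-smooth layer, while the backward pass swaps in the gradient of a smooth stand-in (the idea behind surrogate or straight-through gradients). This may differ from what the notebook actually does. The names `hard_step`, `surrogate_grad`, and `train`, the sigmoid surrogate, and the toy data are all hypothetical choices made for illustration.

```python
import math


def hard_step(z: float) -> float:
    """Forward-pass layer: its true gradient is zero almost everywhere."""
    return 1.0 if z > 0.0 else 0.0


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def surrogate_grad(z: float) -> float:
    """Assumed stand-in for the step's gradient: the derivative of a sigmoid."""
    s = sigmoid(z)
    return s * (1.0 - s)


def train(data, lr=0.5, epochs=200):
    """Fit y = hard_step(w * x + b) by gradient descent with a swapped gradient."""
    w, b = -1.0, 0.0  # deliberately poor starting point
    for _ in range(epochs):
        for x, target in data:
            z = w * x + b
            y = hard_step(z)               # forward: real, non-smooth layer
            err = y - target               # d/dy of 0.5 * (y - target)^2
            g = err * surrogate_grad(z)    # backward: smooth gradient swapped in
            w -= lr * g * x
            b -= lr * g
    return w, b


# Toy task (made up for this sketch): output 1 for positive inputs, else 0.
data = [(-2.0, 0.0), (-1.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
w, b = train(data)
predictions = [hard_step(w * x + b) for x, _ in data]
print(w, b, predictions)
```

With the true gradient of `hard_step`, every update would be zero and `w` would never leave its starting value. The swapped-in gradient is what lets the weight move. Once all four points are classified correctly the error term vanishes and the updates stop on their own.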