What I Read: Reformer, the Efficient Transformer

https://towardsdatascience.com/illustrating-the-reformer-393575ac6ba0?gi=34b920510f6f

Illustrating the Reformer
The efficient Transformer
Alireza Dirafzoon
Feb 4


“Recently, Google introduced the Reformer architecture, a Transformer model designed to efficiently handle processing very long sequences of data (e.g. up to 1 million words in a language processing). Execution of Reformer requires much lower memory consumption and achieves impressive performance even when running on only a single GPU.”