https://www.quantamagazine.org/risky-giant-steps-can-solve-optimization-problems-faster-20230811/ Risky Giant Steps Can Solve Optimization Problems Faster, Allison Parshall, August 11, 2023: “New results break with decades of conventional wisdom for the gradient descent algorithm.”
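The conventional wisdom the article refers to is that gradient descent should take small, uniformly sized steps; the new results show that schedules mixing in occasional very large steps can converge faster on smooth convex problems. Here is a minimal sketch of that idea on a toy 2-D quadratic; the cyclic schedule below is invented for illustration and is not the schedule from the research the article covers.

```python
# Illustrative sketch only: plain gradient descent with the textbook constant
# step size 1/L versus a cyclic schedule that throws in one risky giant step.
# The objective and the schedule constants are made up for this demo.
import numpy as np

# f(x) = 0.5 * x^T A x, a convex quadratic with eigenvalues 0.1 and 1.0 (L = 1.0)
A = np.diag([0.1, 1.0])
L = 1.0

def run(steps, schedule, x0=np.array([1.0, 1.0])):
    x = x0.copy()
    for i in range(steps):
        x = x - schedule[i % len(schedule)] * (A @ x)  # gradient of f is A x
    return 0.5 * x @ A @ x                             # final objective value

constant = [1.0 / L]                           # classic "safe" choice
cyclic = [1.5 / L, 1.5 / L, 1.5 / L, 4.9 / L]  # mostly modest steps, one giant one

print("constant step:   ", run(100, constant))
print("with giant steps:", run(100, cyclic))
```

The giant step would diverge on its own, but averaged over the cycle it shrinks the slowly converging direction much faster than the constant step does, so the cyclic run ends with a far smaller objective value.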
What I Read: Deep Learning Optimization Theory
https://towardsdatascience.com/deep-learning-optimization-theory-introduction-148b3504b20f Deep Learning Optimization Theory — Introduction, Omri: “Understanding the theory of optimization in deep learning is crucial to enable progress. This post introduces the experimental and theoretical approaches to studying it.”
What I Read: First-Principles Theory of Neural Network Generalization
https://natluk.net/a-first-principles-theory-of-neuralnetwork-generalization-the-berkeley-artificial-intelligence-research-blog/ A First-Principles Theory of Neural Network Generalization – The Berkeley Artificial Intelligence Research Blog, NatLuk Community, 25 October 2021: “Perhaps the greatest of these mysteries has been the question of generalization: …”
What I Read: Computer Scientists Discover Limits of Major Research Algorithm
https://www.quantamagazine.org/computer-scientists-discover-limits-of-major-research-algorithm-20210817/ Computer Scientists Discover Limits of Major Research Algorithm, Nick Thieme, August 17, 2021: “Many aspects of modern applied research rely on a crucial algorithm called gradient descent…. researchers have never fully …”
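The task whose worst-case cost the article examines is the one gradient descent is usually asked to perform: iterate until reaching a near-stationary point, i.e. a point where the gradient norm falls below some tolerance ε. A minimal sketch of that procedure follows; the objective function, step size, and tolerance are invented for illustration and are not from the paper the article describes.

```python
# Sketch of gradient descent run to an epsilon-stationary point, the stopping
# rule the complexity result is about. All constants here are illustrative.
import numpy as np

def gradient_descent(grad, x0, step=0.1, eps=1e-6, max_iters=100_000):
    """Iterate x <- x - step * grad(x) until ||grad(x)|| < eps."""
    x = np.asarray(x0, dtype=float)
    for i in range(max_iters):
        g = grad(x)
        if np.linalg.norm(g) < eps:   # epsilon-stationary: gradient nearly zero
            return x, i
        x = x - step * g
    return x, max_iters

# Example: f(x, y) = (x - 3)^2 + 2 * y^2, with gradient (2(x - 3), 4y)
x_star, iters = gradient_descent(lambda v: np.array([2 * (v[0] - 3), 4 * v[1]]),
                                 x0=[0.0, 5.0])
print(x_star, "after", iters, "iterations")
```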
What I Read: Why Deep Learning Works
https://moultano.wordpress.com/2020/10/18/why-deep-learning-works-even-though-it-shouldnt/ Why Deep Learning Works Even Though It Shouldn’t, Ryan Moulton (Ryan Moulton’s Articles): “Stop talking about minima…. Nobody ever trains their model remotely close to convergence…. What really needs further research …”