What I Read: Gaussian Process Extrapolation
https://probablymarcus.com/blocks/2023/03/28/gp-extrapolation-sometimes-goofy.html Gaussian Processes Extrapolate, Sometimes in Goofy Ways. Marcus Lewis, 3/28/23. “This leads to something that is undesirable in hyperparameter tuning: if ever you discover a sudden drop-off in performance…”
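To see the kind of behavior the post points at, here is a minimal sketch (my own, not from the post) using scikit-learn: an RBF-kernel GP fit to a steadily rising trend reverts to its prior mean of zero once you predict far outside the training range, which a tuner could misread as a sudden drop-off in performance.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Training data: tuning "performance" that rises steadily with x.
X_train = np.linspace(0.0, 5.0, 20).reshape(-1, 1)
y_train = 0.5 * X_train.ravel()

# Fix the length scale (optimizer=None) so the extrapolation behavior
# is easy to see; alpha adds a little jitter for numerical stability.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0),
                              optimizer=None, alpha=1e-6)
gp.fit(X_train, y_train)

X_test = np.array([[4.0], [6.0], [10.0]])  # last two are extrapolation
mean, std = gp.predict(X_test, return_std=True)
for x, m, s in zip(X_test.ravel(), mean, std):
    print(f"x = {x:4.1f}  predicted mean = {m:+.3f} +/- {s:.3f}")
# Far from the data, the posterior mean reverts to the prior mean (0)
# instead of continuing the trend: the "goofy" extrapolation.
```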
What I Read: Differentiable Trees
https://ericmjl.github.io/blog/2023/8/7/journal-club-differentiable-search-of-evolutionary-trees/ Journal Club: Differentiable Search of Evolutionary Trees. Eric J. Ma, 2023-08-07. “…how the authors take a non-differentiable problem and turn it into a differentiable problem through interconversion between mathematical data structures.”
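The post walks through the paper's specific construction; as a generic flavor of the broader trick (not the paper's method), here is a sketch of making a discrete choice differentiable by replacing a hard argmax with a temperature-controlled softmax:

```python
import numpy as np

def soft_choice(scores, values, temperature=1.0):
    """Differentiable relaxation of values[argmax(scores)]."""
    z = scores / temperature
    weights = np.exp(z - z.max())       # numerically stable softmax
    weights /= weights.sum()
    return weights @ values             # convex combination of options

scores = np.array([0.1, 2.0, 0.5])      # learnable preferences
values = np.array([10.0, 20.0, 30.0])   # payoff of each discrete option
print(soft_choice(scores, values, temperature=1.0))   # ~20.5, smooth blend
print(soft_choice(scores, values, temperature=0.01))  # -> 20.0, near-hard
```

Lowering the temperature recovers the hard choice in the limit, while any positive temperature keeps gradients flowing through the selection.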
What I Read: Giant Steps Can Solve Optimization Faster
https://www.quantamagazine.org/risky-giant-steps-can-solve-optimization-problems-faster-20230811/ Risky Giant Steps Can Solve Optimization Problems Faster. Allison Parshall, August 11, 2023. “New results break with decades of conventional wisdom for the gradient descent algorithm.”
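As a toy illustration only (my own, not the schedule from the paper the article covers): on an ill-conditioned quadratic, once ordinary steps have damped the steep direction, an occasional giant step makes large progress along the flat direction, beating the textbook-safe constant step size.

```python
import numpy as np

curvatures = np.array([1.0, 0.01])           # eigenvalues L=1, mu=0.01
f = lambda x: 0.5 * np.sum(curvatures * x**2)
grad = lambda x: curvatures * x

def run(steps, x0=np.array([1.0, 1.0])):
    x = x0.copy()
    for h in steps:
        x = x - h * grad(x)                  # plain gradient descent
    return f(x)

n = 30
constant = [1.0] * n                         # classic "safe" h = 1/L
cyclic = [1.0, 1.0, 50.0] * (n // 3)         # every third step is huge
print(f"constant steps:   f = {run(constant):.3e}")
print(f"with giant steps: f = {run(cyclic):.3e}")   # orders of magnitude lower
```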
What I Read: Tree-Structured Parzen Estimator
https://towardsdatascience.com/building-a-tree-structured-parzen-estimator-from-scratch-kind-of-20ed31770478 Building a Tree-Structured Parzen Estimator from Scratch (Kind Of): An alternative to traditional hyperparameter tuning methods. Colin Horgan, Apr 4. “Although popular, Grid and Random Search methods… are purely trial and error.”
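The core TPE loop is compact enough to sketch. A bare-bones version (my simplification, not the article's code): split completed trials into "good" and "bad" by a quantile, fit a kernel density estimate to each set, and propose the candidate that maximizes the density ratio l(x) / g(x).

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
objective = lambda x: (x - 2.0) ** 2          # pretend this is expensive

# A handful of completed trials.
xs = rng.uniform(-5, 5, size=20)
ys = objective(xs)

def tpe_suggest(xs, ys, gamma=0.25, n_candidates=100):
    cut = np.quantile(ys, gamma)              # lower loss is better
    good, bad = xs[ys <= cut], xs[ys > cut]
    l, g = gaussian_kde(good), gaussian_kde(bad)
    candidates = l.resample(n_candidates, seed=1).ravel()
    ratio = l(candidates) / np.maximum(g(candidates), 1e-12)
    return candidates[np.argmax(ratio)]       # maximizes expected improvement

x_next = tpe_suggest(xs, ys)
print(f"next trial at x = {x_next:.3f}")      # should land near x = 2
```

This density-ratio proposal is the same idea behind the default samplers in Hyperopt and Optuna.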
What I Read: Optimizing Machine Learning Training Pipelines
https://medium.com/ntropy-network/bag-of-tricks-for-optimizing-machine-learning-training-pipelines-4f8d5cd3d432 Bag of tricks for optimizing machine learning training pipelines. Arseny Kravchenko, Jan 5. “…we are constantly looking for ways to improve the efficiency of our machine learning pipelines, while keeping the…”
What I Read: Transformers Training
https://www.borealisai.com/research-blogs/tutorial-17-transformers-iii-training/ Tutorial #17: Transformers III Training. P. Xu, S. Prince, 08/06/2021. “…we discuss challenges with transformer training dynamics and introduce some of the tricks that practitioners use to get transformers to converge.”
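One of the standard tricks in this genre is learning-rate warmup. A sketch of the schedule from the original Transformer paper (Vaswani et al., 2017), which I'm using here as a representative example rather than the tutorial's exact content: linear warmup, then inverse-square-root decay.

```python
def transformer_lr(step, d_model=512, warmup_steps=4000):
    """LR schedule from "Attention Is All You Need": warmup then decay."""
    step = max(step, 1)                       # avoid 0 ** -0.5 at step 0
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

for step in (1, 1000, 4000, 40000):
    print(f"step {step:>6}: lr = {transformer_lr(step):.2e}")
# Ramps linearly to ~7e-4 at step 4000, then decays as 1/sqrt(step);
# skipping the warmup is a classic way to make transformer training diverge.
```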