https://yugeten.github.io/posts/2025/01/ppogrpo A vision researcher’s guide to some RL stuff: PPO & GRPO. Yuge (Jimmy) Shi, January 31, 2025. “This is a deep dive into Proximal Policy Optimization (PPO), which is one of…” (What I Read: RL, PPO, GRPO)
https://elevanth.org/blog/2017/11/28/build-a-better-markov-chain/ Markov Chains: Why Walk When You Can Flow? Richard McElreath, 28 November 2017. “If you are still using a Gibbs sampler, you are working too hard for too little result. Newer,…” (What I Read: Markov Chains)
https://www.kdnuggets.com/2023/03/first-open-source-implementation-deepmind-alphatensor.html First Open Source Implementation of DeepMind’s AlphaTensor. Diego Fiori, Co-founder & CTO at Nebuly, March 10, 2023. “The first open-source implementation of AlphaTensor has been released and opens the door for…” (What I Read: Open Source, AlphaTensor)
https://yang-song.github.io/blog/2021/score/ Generative Modeling by Estimating Gradients of the Data Distribution. Yang Song, May 5, 2021. “This blog post gives a detailed introduction to score-based generative models. We demonstrate that this…” (What I Read: Generative Modeling by Estimating Gradients)
https://www.pymc-labs.io/blog-posts/pymc-stan-benchmark/ MCMC for big datasets: faster sampling with JAX and the GPU. Scaling PyMC using JAX to sample on the GPU. Martin Ingram, 2021-12-22. “You’ll often hear people say…” (What I Read: MCMC for big datasets)