https://superb-makemake-3a4.notion.site/group-relative-policy-optimization-GRPO-18c41736f0fd806eb39dc35031758885
group relative policy optimization (GRPO)
Apoorv Nandan
Jan 31, 2025
“GRPO became popular primarily due to the success of deepseek r1, which used this algorithm to train reasoning capabilities into their
What I Read: optimizing softmax
https://maharshi.bearblog.dev/optimizing-softmax-cuda
Learning CUDA by optimizing softmax: A worklog
Maharshi Pandya
04 Jan, 2025
“Optimizing softmax, especially in the context of GPU programming with CUDA, presents many opportunities for learning.”
What I Read: Multi Objective Optimisation
https://blog.flipkart.tech/multi-objective-optimisation-in-suggestions-ranking-flipkart-49099b951eae?gi=04415d605535
Multi Objective Optimisation in Suggestions Ranking @ Flipkart
Pranjal Sanjanwala
Apr 19, 2024
“…we aim to provide a perfectly tailored set of suggestions for that user at that point in time.
What I Read: reliance on AI-assisted decisions
https://statmodeling.stat.columbia.edu/2024/03/06/defining-optimal-reliance-on-model-predictions-in-ai-assisted-decisions/
Defining optimal reliance on model predictions in AI-assisted decisions
Jessica Hullman
3/6/24 12:31 PM
“…AI-assisted decision task is of interest as organizations deploy predictive models to assist human decision-making in domains like
What I Read: Diffusion models, new theoretical perspective
https://www.chenyang.co/diffusion.html
Diffusion models from scratch, from a new theoretical perspective
Chenyang Yuan
2023
“This tutorial aims to introduce diffusion models from an optimization perspective…”