Andrew Fairless, Ph.D.

Data, Science, and Tinkering

What I Read: group relative policy optimization

By Andrew Fairless on May 22, 2025 / February 22, 2025

https://superb-makemake-3a4.notion.site/group-relative-policy-optimization-GRPO-18c41736f0fd806eb39dc35031758885

group relative policy optimization (GRPO)
Apoorv Nandan
Jan 31, 2025

“GRPO became popular primarily due to the success of deepseek r1, which used this algorithm to train reasoning capabilities into their base language model.”

Categories: What I Learn
Tags: large language model, loss, machine learning, neural network, optimization, policy, reinforcement learning, reward, training

Copyright © 2025 Andrew Fairless, Ph.D.. All Rights Reserved. | Simple Persona by Catch Themes
