The Interface Between Reinforcement Learning Theory and Language Model Post-Training

Akshay Krishnamurthy, Audrey Huang
March 5, 2025

“Even though existing RLHF methods…