What I Read: RLHF and Its Alternatives
https://magazine.sebastianraschka.com/p/llm-training-rlhf-and-its-alternatives LLM Training: RLHF and Its Alternatives. Sebastian Raschka, PhD. Sep 10, 2023. “RLHF is an integral part of the modern LLM training pipeline due to its ability to incorporate human preferences …”
What I Read: AIs producing own training data
https://thegradient.pub/software2-a-new-generation-of-ais-that-become-increasingly-general-by-producing-their-own-training-data/ Software²: A new generation of AIs that become increasingly general by producing their own training data. Minqi Jiang. Apr 22, 2023. “We are at the cusp of transitioning from ‘learning from data’ to …”
What I Read: human touch, LLMs
https://mewelch.substack.com/p/putting-the-human-touch-on-llms Putting the human touch on LLMs. Molly Welch. Mar 30. “Techniques like RLHF help align large language models with people’s values and preferences. Is that a good thing?”
What I Read: Teach Computers Math
https://www.quantamagazine.org/to-teach-computers-math-researchers-merge-ai-approaches-20230215/ To Teach Computers Math, Researchers Merge AI Approaches. Kevin Hartnett. February 15, 2023. “Large language models still struggle with basic reasoning tasks. Two new papers that apply machine learning to math …”
What I Read: Machines Learn, Teach Basics
https://www.quantamagazine.org/machines-learn-better-if-we-teach-them-the-basics-20230201/ Machines Learn Better if We Teach Them the Basics. Max G. Levy. February 1, 2023. “A wave of research improves reinforcement learning algorithms by pre-training them as if they were human.”