What I Read: The Hidden Infinity in Preference Learning
https://www.cs.princeton.edu/~smalladi/blog/2024/07/09/dpo-infinity The Hidden Infinity in Preference Learning, Sadhika Malladi, July 09, 2024 “I demonstrate from first principles how offline preference learning algorithms (e.g., SimPO) can benefit from length normalization, especially when training…”
What I Read: Summarization, LLMs
https://cameronrwolfe.substack.com/p/summarization-and-the-evolution-of Summarization and the Evolution of LLMs, Cameron R. Wolfe, Ph.D., Jun 03, 2024 “How research on abstractive summarization changed language models forever…”
What I Read: Will Scaling Solve Robotics?
https://nishanthjkumar.com/Will-Scaling-Solve-Robotics-Perspectives-from-CoRL-2023/ Will Scaling Solve Robotics? Perspectives from CoRL 2023, Nishanth J. Kumar “…is training a large neural network on a very large dataset a feasible way to solve robotics?”
What I Read: AI System Beats Chess Puzzles
https://www.quantamagazine.org/google-deepmind-trains-artificial-brainstorming-in-chess-ai-20231115/ AI System Beats Chess Puzzles With ‘Artificial Brainstorming’, Stephen Ornes, November 15, 2023 “By bringing together disparate approaches, machines can reach a new level of creative problem-solving.”