https://www.cs.princeton.edu/~smalladi/blog/2024/04/04/dataselection
Using LESS Data to Tune Models: Data Selection in the Era of LLMs
Mengzhou Xia and Sadhika Malladi
April 4, 2024
“We describe how data selection for modern-day LLMs differs from prior settings and how our algorithm, LESS, effectively selects relevant data to cultivate specific capabilities in models during instruction tuning.”