What I Read: Reducing Toxicity in Language Models

https://lilianweng.github.io/lil-log/2021/03/21/reducing-toxicity-in-language-models.html

Reducing Toxicity in Language Models
by Lilian Weng
Mar 21, 2021

“Large pretrained language models are trained over a sizable collection of online data. They unavoidably acquire certain toxic behavior and biases from the Internet…. Many challenges are associated with the effort to diminish various types of unsafe content…”