Your Policy Regulariser is Secretly an Adversary
DeepMind Safety Research
Mar 24
By Rob Brekelmans, Tim Genewein, Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Shane Legg, Pedro A. Ortega
“Policy regularisation can be interpreted as learning a strategy in the face of an imagined adversary, a decision-making principle that leads to robust policies.”