Tag: policy gradient

What I Read: How Generally Capable Agents Trained

https://www.lesswrong.com/posts/DreKBuMvK7fdESmSJ/how-deepmind-s-generally-capable-agents-were-trained

How DeepMind’s Generally Capable Agents Were Trained
by 1a3orn
20th Aug 2021

“One of DeepMind’s latest papers… explains how DeepMind produced agents that can successfully play games as complex as hide-and-seek