| 10867242 | Selecting actions to be performed by a reinforcement learning agent using tree search | Thore Graepel, Shih-Chieh Huang, Arthur Clement Guez, Laurent Sifre, Ilya Sutskever +1 more | 2020-12-15 |
| 10860926 | Meta-gradient updates for training return functions for reinforcement learning systems | Zhongwen Xu, Hado Philip van Hasselt | 2020-12-08 |
| 10776692 | Continuous control with deep reinforcement learning | Timothy Paul Lillicrap, Jonathan James Hunt, Alexander Pritzel, Nicolas Manfred Otto Heess, Tom Erez +2 more | 2020-09-15 |
| 10733501 | Environment prediction using reinforcement learning | Tom Schaul, Matteo Hessel, Hado Philip van Hasselt | 2020-08-04 |
| 10650310 | Training neural networks using a prioritized experience memory | Tom Schaul, John Quan | 2020-05-12 |
| 10628733 | Selecting reinforcement learning actions using goals and observations | Tom Schaul, Daniel George Horgan, Karol Gregor | 2020-04-21 |