| 12299574 | Distributed training using actor-critic reinforcement learning with off-policy correction factors | Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu +5 more | 2025-05-13 |
| 11977983 | Noisy neural network layers with noise parameters | Mohammad Gheshlaghi Azar, Meire Fortunato, Bilal Piot, Olivier Claude Pietquin, Jacob Lee Menick +2 more | 2024-05-07 |
| 11868894 | Distributed training using actor-critic reinforcement learning with off-policy correction factors | Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu +5 more | 2024-01-09 |
| 11727264 | Reinforcement learning using pseudo-counts | Marc Gendron-Bellemare, Srinivasan Sriram | 2023-08-15 |
| 11604997 | Training action selection neural networks using leave-one-out-updates | Marc Gendron-Bellemare, Mohammad Gheshlaghi Azar, Audrunas Gruslys | 2023-03-14 |
| 11593646 | Distributed training using actor-critic reinforcement learning with off-policy correction factors | Hubert Josef Soyer, Lasse Espeholt, Karen Simonyan, Yotam Doron, Vlad Firoiu +5 more | 2023-02-28 |
| 11256990 | Memory-efficient backpropagation through time | Marc Lanctot, Audrunas Gruslys, Ivo Danihelka | 2022-02-22 |
| 10936949 | Training machine learning models using task selection policies to increase learning progress | Marc Gendron-Bellemare, Jacob Lee Menick, Alexander Benjamin Graves, Koray Kavukcuoglu | 2021-03-02 |
| 10839293 | Noisy neural network layers with noise parameters | Mohammad Gheshlaghi Azar, Meire Fortunato, Bilal Piot, Olivier Claude Pietquin, Jacob Lee Menick +2 more | 2020-11-17 |