Invention Grant
- Patent Title: Training action selection neural networks using off-policy actor critic reinforcement learning
- Application No.: US16402687
- Application Date: 2019-05-03
- Publication No.: US10706352B2
- Publication Date: 2020-07-07
- Inventor: Ziyu Wang, Nicolas Manfred Otto Heess, Victor Constant Bapst
- Applicant: DeepMind Technologies Limited
- Applicant Address: GB London
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Current Assignee Address: GB London
- Agency: Fish & Richardson P.C.
- Main IPC: G06N3/04
- IPC: G06N3/04; G06N3/08; G06N3/00

Abstract:
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network. One of the methods includes maintaining a replay memory that stores trajectories generated as a result of interaction of an agent with an environment; and training an action selection neural network having policy parameters on the trajectories in the replay memory, wherein training the action selection neural network comprises: sampling a trajectory from the replay memory; and adjusting current values of the policy parameters by training the action selection neural network on the trajectory using an off-policy actor critic reinforcement learning technique.
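The loop described in the abstract (store trajectories, sample one, update the policy off-policy) can be illustrated with a minimal sketch. The abstract does not state the update rule, so the code below substitutes a generic importance-weighted actor-critic step on a tabular softmax policy; it is not the claimed method. All names (`ReplayMemory`, `off_policy_actor_critic_update`, `theta`, `q`, `clip`) and the toy one-state environment are assumptions made for illustration.

```python
import random
from collections import deque

import numpy as np


class ReplayMemory:
    """Fixed-capacity store of whole trajectories (hypothetical helper)."""

    def __init__(self, capacity):
        self.trajectories = deque(maxlen=capacity)

    def add(self, trajectory):
        self.trajectories.append(trajectory)

    def sample(self, rng):
        # Draw one stored trajectory uniformly at random.
        return rng.choice(list(self.trajectories))


def softmax(logits):
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()


def off_policy_actor_critic_update(theta, q, trajectory,
                                   lr=0.1, gamma=0.99, clip=10.0):
    """Update policy logits `theta` and critic table `q` in place.

    Each step is (state, action, reward, behaviour_prob, next_state, done),
    where behaviour_prob is the probability the data-generating policy
    assigned to the action. This tuple layout is an assumption.
    """
    for s, a, r, mu, s_next, done in trajectory:
        pi = softmax(theta[s])
        # Truncated importance weight corrects for the stale behaviour policy.
        rho = min(clip, pi[a] / mu)
        # Critic: bootstrap from the expected value under the current policy.
        v_next = 0.0 if done else float(softmax(theta[s_next]) @ q[s_next])
        q[s, a] += lr * (r + gamma * v_next - q[s, a])
        # Actor: importance-weighted policy gradient with a value baseline.
        advantage = q[s, a] - float(pi @ q[s])
        grad_log_pi = -pi
        grad_log_pi[a] += 1.0
        theta[s] += lr * rho * advantage * grad_log_pi
    return theta, q


# Toy setting (assumed): one state, two actions, action 1 pays reward 1.
# Both trajectories were generated by a uniform behaviour policy (mu = 0.5).
memory = ReplayMemory(capacity=100)
memory.add([(0, 0, 0.0, 0.5, 0, True)])
memory.add([(0, 1, 1.0, 0.5, 0, True)])

rng = random.Random(0)
theta = np.zeros((1, 2))  # policy parameters (logits per state)
q = np.zeros((1, 2))      # critic estimates Q(s, a)
for _ in range(200):
    off_policy_actor_critic_update(theta, q, memory.sample(rng))

policy = softmax(theta[0])
```

In this toy run the learned `policy` should shift probability toward action 1, since it is the only rewarded action. A neural-network version would replace the tables with network outputs and the in-place updates with gradient steps, but the sample-then-update structure stays the same.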
Public/Granted literature
- US20190258918A1 TRAINING ACTION SELECTION NEURAL NETWORKS (Public/Granted day: 2019-08-22)