Invention Grant
- Patent Title: Training action selection neural networks using a differentiable credit function
- Application No.: US16615042
- Application Date: 2018-05-22
- Publication No.: US11651208B2
- Publication Date: 2023-05-16
- Inventor: Zhongwen Xu, Hado Phillip van Hasselt, Joseph Varughese Modayil, Andre da Motta Salles Barreto, David Silver
- Applicant: DEEPMIND TECHNOLOGIES LIMITED
- Applicant Address: GB London
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Current Assignee Address: GB London
- Agency: Fish & Richardson P.C.
- International Application: PCT/EP2018/063279 2018.05.22
- International Announcement: WO2018/211139A 2018.11.22
- Date entered country: 2019-11-19
- Main IPC: G06N3/08
- IPC: G06N3/08; G06N3/04

Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning. A reinforcement learning neural network selects actions to be performed by an agent interacting with an environment to perform a task in an attempt to achieve a specified result. The reinforcement learning neural network has at least one input to receive an input observation characterizing a state of the environment and at least one output for determining an action to be performed by the agent in response to the input observation. The system includes a reward function network coupled to the reinforcement learning neural network. The reward function network has an input to receive reward data characterizing a reward provided by one or more states of the environment and is configured to determine a reward function to provide one or more target values for training the reinforcement learning neural network.
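The coupling described in the abstract, where a reward function turns reward data into target values that then train the learning network, can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not taken from the patent: the class names `RewardFunction` and `LinearValueNet`, the use of a discounted return with a single parameter `gamma` as the reward function, and the reduction of the action selection network to a one-weight value estimator.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RewardFunction:
    """Hypothetical stand-in for the reward function network.

    Maps reward data to target values. Here it is a plain discounted
    return with one parameter, gamma (an illustrative assumption).
    """
    gamma: float = 0.9

    def targets(self, rewards: List[float], bootstrap_value: float) -> List[float]:
        # Work backwards: g_t = r_t + gamma * g_{t+1}, seeded with the bootstrap value.
        g = bootstrap_value
        out = []
        for r in reversed(rewards):
            g = r + self.gamma * g
            out.append(g)
        return out[::-1]


@dataclass
class LinearValueNet:
    """Hypothetical stand-in for the learning network: value = w * observation."""
    w: float = 0.0

    def value(self, obs: float) -> float:
        return self.w * obs

    def train_step(self, obs: float, target: float, lr: float = 0.1) -> None:
        # One gradient step on 0.5 * (target - value)^2 with respect to w.
        self.w += lr * (target - self.value(obs)) * obs


# Toy trajectory: three steps, each with observation 1.0 and reward 1.0.
reward_fn = RewardFunction(gamma=0.5)
net = LinearValueNet()
observations = [1.0, 1.0, 1.0]
rewards = [1.0, 1.0, 1.0]

# The reward function supplies the targets; the network is trained toward them.
targets = reward_fn.targets(rewards, bootstrap_value=0.0)
for obs, tgt in zip(observations, targets):
    net.train_step(obs, tgt)
```

In this sketch the targets depend on `gamma`, so changing the reward function changes what the network is trained toward. How the patented system parameterises or adapts its reward function is not stated in the abstract and is not modelled here.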
Public/Granted literature
- US20200175364A1 TRAINING ACTION SELECTION NEURAL NETWORKS USING A DIFFERENTIABLE CREDIT FUNCTION Public/Granted day: 2020-06-04