Invention Grant
- Patent Title: Selecting action slates using reinforcement learning
- Application No.: US15367094
- Application Date: 2016-12-01
- Publication No.: US10699187B2
- Publication Date: 2020-06-30
- Inventor: Peter Goran Sunehag
- Applicant: DeepMind Technologies Limited
- Assignee: DeepMind Technologies Limited
- Current Assignee: DeepMind Technologies Limited
- Agency: Fish & Richardson P.C.
- Main IPC: G06F16/903
- IPC: G06F16/903 ; G06N3/08 ; G06F16/9032

Abstract:
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting action slates using reinforcement learning. One of the methods includes receiving an observation characterizing a current state of an environment; selecting an action slate by processing the observation and a plurality of candidate action slates using a deep neural network, wherein each candidate action slate comprises a respective plurality of actions from the set of actions, and wherein the deep neural network is configured to, for each of the candidate action slates, process the observation and the actions in the candidate action slate to generate a slate Q value for the candidate action slate that is an estimate of a long-term reward resulting from the candidate action slate being provided to the action selector in response to the observation; and providing the selected action slate to an action selector in response to the observation.
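The selection step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the tiny fixed-weight network stands in for the deep neural network, and the names `make_slate_q_network`, `select_slate`, the feature vectors, and the choice to enumerate all ordered pairs as candidate slates are assumptions made only for illustration.

```python
import itertools
import math
import random


def make_slate_q_network(obs_dim, action_dim, slate_size, hidden=8, seed=0):
    """Build a toy stand-in for the deep neural network (hypothetical).

    A one-hidden-layer MLP with fixed random weights; a real system would
    train these weights with reinforcement learning.
    """
    rng = random.Random(seed)
    in_dim = obs_dim + slate_size * action_dim
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]

    def slate_q(observation, slate):
        # Concatenate the observation with the features of every action
        # in the slate, then map the joint input to one scalar Q value.
        x = list(observation) + [f for action in slate for f in action]
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return sum(w * hi for w, hi in zip(w2, h))

    return slate_q


def select_slate(observation, candidate_slates, slate_q):
    """Return the candidate slate with the highest slate Q value."""
    return max(candidate_slates, key=lambda slate: slate_q(observation, slate))


# Hypothetical data: four actions with 2-d features, a 3-d observation.
actions = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1), (0.3, 0.7)]
observation = (1.0, -0.5, 0.2)

# Assumption: candidates are all ordered slates of two distinct actions.
candidates = list(itertools.permutations(actions, 2))

slate_q = make_slate_q_network(obs_dim=3, action_dim=2, slate_size=2)
best = select_slate(observation, candidates, slate_q)
print(best, slate_q(observation, best))
```

In this sketch the selected slate `best` is what would be handed to the action selector. Exhaustive enumeration is only workable for small action sets; it is used here to keep the example short.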
Public/Granted literature
- US20170154261A1 SELECTING ACTION SLATES USING REINFORCEMENT LEARNING Public/Granted day: 2017-06-01