Selecting action slates using reinforcement learning

Invention Grant

US10699187B2 Selecting action slates using reinforcement learning (Active)

Patent Title: Selecting action slates using reinforcement learning
Application No.: US15367094

Application Date: 2016-12-01
Publication No.: US10699187B2

Publication Date: 2020-06-30
Inventor: Peter Goran Sunehag
Applicant: DeepMind Technologies Limited
Assignee: DeepMind Technologies Limited
Current Assignee: DeepMind Technologies Limited
Agency: Fish & Richardson P.C.
Main IPC: G06F16/903
IPC: G06F16/903 ; G06N3/08 ; G06F16/9032


Abstract:

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting action slates using reinforcement learning. One of the methods includes receiving an observation characterizing a current state of an environment; selecting an action slate by processing the observation and a plurality of candidate action slates using a deep neural network, wherein each candidate action slate comprises a respective plurality of actions from a set of actions, and wherein the deep neural network is configured to, for each of the candidate action slates, process the observation and the actions in the candidate action slate to generate a slate Q value for the candidate action slate that is an estimate of a long-term reward resulting from the candidate action slate being provided to an action selector in response to the observation; and providing the selected action slate to the action selector in response to the observation.
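
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the toy one-hidden-layer network with fixed random weights stands in for the deep neural network, candidate slates are assumed to be all ordered slates of a fixed size, and the names `make_toy_q_network`, `select_slate`, `action_features`, and `slate_size` are hypothetical. The abstract does not say how candidate slates are generated or that the highest-scoring slate is always chosen; both are assumptions made here for illustration.

```python
import itertools
import math
import random


def make_toy_q_network(obs_dim, action_dim, slate_size, hidden=8, seed=0):
    """Build a hypothetical stand-in for the deep neural network.

    Returns a function mapping (observation, slate actions) to a scalar
    slate Q value. The weights are random and untrained.
    """
    rng = random.Random(seed)
    in_dim = obs_dim + slate_size * action_dim
    w1 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]

    def slate_q(observation, slate_actions):
        # Concatenate the observation with the features of each slate action.
        x = list(observation)
        for action in slate_actions:
            x.extend(action)
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
        return sum(w * hi for w, hi in zip(w2, h))

    return slate_q


def select_slate(observation, action_features, slate_size, slate_q):
    """Score every candidate slate and return the highest-scoring one.

    Assumption: candidates are all ordered slates of `slate_size` distinct
    actions, and selection is greedy with respect to the slate Q value.
    """
    best_slate, best_q = None, -math.inf
    indices = range(len(action_features))
    for slate in itertools.permutations(indices, slate_size):
        q = slate_q(observation, [action_features[i] for i in slate])
        if q > best_q:
            best_slate, best_q = slate, q
    return best_slate, best_q


# Illustrative inputs (made up): a 3-dimensional observation and four
# actions, each described by a 2-dimensional feature vector.
observation = [0.2, -0.5, 0.9]
action_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.3]]

slate_q = make_toy_q_network(obs_dim=3, action_dim=2, slate_size=2)
selected_slate, selected_q = select_slate(observation, action_features, 2, slate_q)
# `selected_slate` is what would be provided to the action selector.
```

Exhaustive enumeration is only practical for small action sets; with many actions, a system would need some way to restrict the candidate slates that are scored, which this sketch does not attempt.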

Public/Granted literature

US20170154261A1 SELECTING ACTION SLATES USING REINFORCEMENT LEARNING Public/Granted day: 2017-06-01

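IPC classification:

G	Physics
G06	Computing; calculating or counting
G06F	Electric digital data processing (computer systems based on specific computational models: G06N)
G06F16/00	Information retrieval; database structures therefor; file system structures therefor
G06F16/90	. Details of database functions independent of the retrieved data types
G06F16/903	.. Querying (retrieval from the web: G06F 16/953)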