Recording medium, reinforcement learning method, and reinforcement learning apparatus

Invention Grant

US11645574B2 Recording medium, reinforcement learning method, and reinforcement learning apparatus (in force)

Patent Title: Recording medium, reinforcement learning method, and reinforcement learning apparatus
Application No.: US16130482

Application Date: 2018-09-13
Publication No.: US11645574B2

Publication Date: 2023-05-09
Inventor: Tomotake Sasaki, Eiji Uchibe, Kenji Doya, Hirokazu Anai, Hitoshi Yanami, Hidenao Iwane
Applicant: FUJITSU LIMITED, Okinawa Institute of Science and Technology School Corporation
Applicant Address: JP Kawasaki
Assignee: FUJITSU LIMITED, Kawasaki, Japan; OKINAWA INSTITUTE OF SCIENCE AND TECHNOLOGY SCHOOL CORPORATION
Current Assignee: FUJITSU LIMITED, Kawasaki, Japan; OKINAWA INSTITUTE OF SCIENCE AND TECHNOLOGY SCHOOL CORPORATION
Current Assignee Address: JP Kawasaki; JP Okinawa
Agency: Staas & Halsey LLP
Priority: JP 2017177970, 2017-09-15
Main IPC: G06N3/00
IPC: G06N3/00; G06N20/00; H04L41/0816; G06N3/006; H04L41/16; H04L43/08

Abstract:

A non-transitory, computer-readable recording medium stores therein a reinforcement learning program that uses a value function and causes a computer to execute a process comprising: estimating first coefficients of the value function, which is represented in a quadratic form of inputs at times earlier than a present time and outputs at the present time and at the earlier times, the first coefficients being estimated based on the inputs at the earlier times, the outputs at the present time and at the earlier times, and costs or rewards that correspond to the inputs at the earlier times; and determining second coefficients that define a control law, based on the value function that uses the estimated first coefficients, and determining input values at times after estimation of the first coefficients.
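
The two steps named in the abstract (fit the coefficients of a quadratic value function from recorded inputs, outputs, and costs, then derive the coefficients of a control law from that fit) can be illustrated with a small sketch. This is not the patented procedure. It is the classic least-squares policy iteration for linear-quadratic problems, reduced to a scalar plant, and it uses only the present output and present input rather than a window of past inputs and outputs. The plant parameters `a` and `b`, the cost weights `q` and `r`, the exploration signal, and all function names are assumptions made for illustration.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def features(y, u):
    # The quadratic form h_yy*y^2 + 2*h_yu*y*u + h_uu*u^2 is linear in
    # the unknown coefficients (h_yy, h_yu, h_uu).
    return [y * y, 2.0 * y * u, u * u]

def estimate_value_coefficients(samples, gain):
    """Least-squares fit of the value-function ("first") coefficients.

    Each sample is (y, u, cost, y_next). The fit enforces
    Q(y, u) - Q(y_next, -gain * y_next) = cost over all samples.
    """
    AtA = [[0.0] * 3 for _ in range(3)]
    Atc = [0.0] * 3
    for y, u, cost, y_next in samples:
        phi = features(y, u)
        phi_next = features(y_next, -gain * y_next)
        row = [p - p_next for p, p_next in zip(phi, phi_next)]
        for i in range(3):
            Atc[i] += row[i] * cost
            for j in range(3):
                AtA[i][j] += row[i] * row[j]
    return solve_linear(AtA, Atc)

def improve_gain(h):
    # Control-law ("second") coefficient: minimising the quadratic form
    # over u gives u = -(h_yu / h_uu) * y.
    _, h_yu, h_uu = h
    return h_yu / h_uu

def learn_gain(a=0.9, b=0.5, q=1.0, r=1.0, iterations=8, steps=40):
    """Alternate estimation and control-law updates on a simulated plant."""
    gain, y, t = 0.0, 1.0, 0
    for _ in range(iterations):
        samples = []
        for _ in range(steps):
            # Deterministic exploration signal (hypothetical choice).
            explore = 0.5 * math.sin(1.3 * t) + 0.3 * math.sin(2.9 * t)
            u = -gain * y + explore
            cost = q * y * y + r * u * u
            y_next = a * y + b * u  # simulator only; a and b are never given to the learner
            samples.append((y, u, cost, y_next))
            y, t = y_next, t + 1
        gain = improve_gain(estimate_value_coefficients(samples, gain))
    return gain
```

Calling `learn_gain()` returns a feedback gain learned purely from the recorded `(y, u, cost, y_next)` tuples. In this noise-free scalar setting it should approach the gain given by the discrete-time Riccati equation for the same `a`, `b`, `q`, and `r`.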

Published/granted literature

US20190087751A1 RECORDING MEDIUM, REINFORCEMENT LEARNING METHOD, AND REINFORCEMENT LEARNING APPARATUS Publication date: 2019-03-21

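IPC classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N3/00	Computer systems based on biological models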