Invention Grant
- Patent Title: Recording medium, reinforcement learning method, and reinforcement learning apparatus
- Application No.: US16130482
- Application Date: 2018-09-13
- Publication No.: US11645574B2
- Publication Date: 2023-05-09
- Inventor: Tomotake Sasaki, Eiji Uchibe, Kenji Doya, Hirokazu Anai, Hitoshi Yanami, Hidenao Iwane
- Applicant: FUJITSU LIMITED, Okinawa Institute of Science and Technology School Corporation
- Applicant Address: JP Kawasaki
- Assignee: FUJITSU LIMITED, KAWASAKI, JAPAN; OKINAWA INSTITUTE OF SCIENCE AND TECHNOLOGY SCHOOL CORPORATION
- Current Assignee: FUJITSU LIMITED, KAWASAKI, JAPAN; OKINAWA INSTITUTE OF SCIENCE AND TECHNOLOGY SCHOOL CORPORATION
- Current Assignee Address: JP Kawasaki; JP Okinawa
- Agency: Staas & Halsey LLP
- Priority: JP 2017177970 (2017-09-15)
- Main IPC: G06N3/00
- IPC: G06N3/00; G06N20/00; H04L41/0816; G06N3/006; H04L41/16; H04L43/08

Abstract:
A non-transitory, computer-readable recording medium stores therein a reinforcement learning program that uses a value function and causes a computer to execute a process comprising: estimating first coefficients of the value function represented in a quadratic form of inputs at times in the past relative to a present time and outputs at the present time and the times in the past, the first coefficients being estimated based on the inputs at the times in the past, the outputs at the present time and the times in the past, and costs or rewards that correspond to the inputs at the times in the past; and determining second coefficients that define a control law, based on the value function that uses the estimated first coefficients, and determining input values at times after estimation of the first coefficients.
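The two steps named in the abstract can be illustrated with a minimal sketch. It assumes, hypothetically, a history vector z = (y_t, y_{t-1}, u_{t-1}) and a scalar present input u_t as the decision variable. It fits the quadratic-form coefficients by ordinary least squares against directly supplied values, which is a simplification and not the cost- or reward-based estimation the patent claims. All names (`fit_quadratic_form`, `greedy_gain`, `H_true`) are illustrative and do not come from the patent.

```python
import numpy as np


def quadratic_features(v):
    """Return the products v_i * v_j for all index pairs with i <= j."""
    i, j = np.triu_indices(len(v))
    return v[i] * v[j]


def fit_quadratic_form(samples, targets):
    """Least-squares estimate of a symmetric H with target ~= v^T H v."""
    Phi = np.array([quadratic_features(v) for v in samples])
    theta, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    n = samples.shape[1]
    H = np.zeros((n, n))
    i, j = np.triu_indices(n)
    H[i, j] = theta
    # Each off-diagonal coefficient equals H_ij + H_ji, so split it evenly.
    return (H + H.T) / 2.0


def greedy_gain(H, n_z):
    """Gain F such that u = -F z minimizes [z; u]^T H [z; u] over u.

    Assumes the input block H_uu is positive definite.
    """
    H_uz = H[n_z:, :n_z]
    H_uu = H[n_z:, n_z:]
    return np.linalg.solve(H_uu, H_uz)


# Synthetic check: values are generated from a known positive definite H_true
# (an assumption made for illustration, not data from the patent).
rng = np.random.default_rng(0)
n_z, n_u = 3, 1  # z = (y_t, y_{t-1}, u_{t-1}), u = u_t
A = rng.normal(size=(n_z + n_u, n_z + n_u))
H_true = A @ A.T + np.eye(n_z + n_u)
samples = rng.normal(size=(200, n_z + n_u))
targets = np.einsum("ki,ij,kj->k", samples, H_true, samples)

H_est = fit_quadratic_form(samples, targets)  # "first coefficients"
F = greedy_gain(H_est, n_z)                   # "second coefficients"
```

In this sketch `F` plays the role of the second coefficients: the control law u_t = -F z_t minimizes the fitted quadratic form, provided the input block of `H_est` is positive definite.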
Public/Granted literature
- US20190087751A1 RECORDING MEDIUM, REINFORCEMENT LEARNING METHOD, AND REINFORCEMENT LEARNING APPARATUS (Public/Granted day: 2019-03-21)
Information query
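IPC classification:
G | Physics |
G06 | Computing; calculating or counting |
G06N | Computer systems based on specific computational models |
G06N3/00 | Computer systems based on biological models |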