Reinforcement learning method, recording medium, and reinforcement learning system

Invention Grant

US11543789B2 Reinforcement learning method, recording medium, and reinforcement learning system (in force)

Patent Title: Reinforcement learning method, recording medium, and reinforcement learning system
Application No.: US16797515

Application Date: 2020-02-21
Publication No.: US11543789B2

Publication Date: 2023-01-03
Inventor: Yoshihiro Okawa, Tomotake Sasaki, Hidenao Iwane, Hitoshi Yanami
Applicant: FUJITSU LIMITED
Applicant Address: JP Kawasaki
Assignee: FUJITSU LIMITED
Current Assignee: FUJITSU LIMITED
Current Assignee Address: JP Kawasaki
Agency: Xsensus LLP
Priority: JP2019-039032 (2019-03-04)
Main IPC: G05B19/04
IPC: G05B19/04; G05B19/042; G06N7/00; G06N20/00

Abstract:

A reinforcement learning method executed by a computer includes calculating a degree of risk for a state of a controlled object at a current time point with respect to a constraint condition related to the state of the controlled object, the degree of risk being calculated based on a predicted value of the state of the controlled object at a future time point, the predicted value being obtained from model information defining a relationship between the state of the controlled object and a control input to the controlled object; and determining the control input to the controlled object at the current time point, from a range defined according to the calculated degree of risk so that the range becomes narrower as the calculated degree of risk increases.
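The two steps in the abstract (a risk score derived from a model-based prediction, then a control-input range that narrows as the risk grows) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the scalar linear model x[k+1] = a*x[k] + b*u[k], the single upper-limit constraint, the risk formula, the linear shrinking of the range, and all function and parameter names.

```python
import random


def predict_next_state(state, control, a=0.9, b=0.5):
    """Hypothetical model information: x[k+1] = a * x[k] + b * u[k]."""
    return a * state + b * control


def degree_of_risk(state, nominal_control, upper_limit):
    """Assumed risk score in [0, 1] for the constraint x <= upper_limit.

    The score grows as the predicted next state approaches the limit and
    saturates at 1.0 once the prediction reaches or exceeds it.
    Assumes upper_limit > 0.
    """
    predicted = predict_next_state(state, nominal_control)
    margin = upper_limit - predicted
    return min(1.0, max(0.0, 1.0 - margin / upper_limit))


def search_range_width(risk, max_width=2.0):
    """Assumed rule: the range shrinks linearly as the risk increases."""
    return max_width * (1.0 - risk)


def choose_control(state, nominal_control, upper_limit, rng):
    """Pick a control input from a range centred on the nominal input."""
    risk = degree_of_risk(state, nominal_control, upper_limit)
    half = search_range_width(risk) / 2.0
    control = rng.uniform(nominal_control - half, nominal_control + half)
    return control, risk


if __name__ == "__main__":
    rng = random.Random(0)
    for state in (1.0, 5.0, 9.0):
        control, risk = choose_control(state, 0.0, 10.0, rng)
        print(f"state={state} risk={risk:.2f} control={control:.3f}")
```

In this sketch, exploration is wide while the predicted state is far from the limit and collapses to the nominal input when the prediction reaches it. A probabilistic risk measure or a different shrinking rule would fit the abstract equally well.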

Published/granted documents

US20200285208A1 Reinforcement learning method, recording medium, and reinforcement learning system
Publication date: 2020-09-10

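IPC Classification:

G	Physics
G05	Controlling; regulating
G05B	Control or regulating systems in general; functional elements of such systems; monitoring or testing arrangements for such systems or elements (fluid-pressure actuators or systems acting by means of fluids in general: F15B; valves per se: F16K; characterised by mechanical features only: G05G; sensitive elements: see the appropriate subclasses, e.g. G12B, subclasses of G01, H01; correcting units: see the appropriate subclasses, e.g. H02K)
G05B19/00	Programme-control systems (specific applications: see the relevant places, e.g. A47L15/46; clocks with attached or built-in means for operating any device at preselected time intervals: G04C23/00; record carriers for recording or reading digital information: G06K; information storage: G11; time or time-programme switches that automatically terminate their operation after the programme is completed: H01H43/00)
G05B19/02	. electric
G05B19/04	.. Programme control other than numerical control, i.e. in sequence controllers or logic controllers (G05B19/418 takes precedence; numerical control: G05B19/18)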