Online learning and vehicle control method based on reinforcement learning without active exploration

Invention Grant

US10065654B2 Online learning and vehicle control method based on reinforcement learning without active exploration (Active)

Patent Title: Online learning and vehicle control method based on reinforcement learning without active exploration
Application No.: US15205558

Application Date: 2016-07-08
Publication No.: US10065654B2

Publication Date: 2018-09-04
Inventor: Tomoki Nishi
Applicant: Toyota Motor Engineering & Manufacturing North America, Inc.
Applicant Address: US TX Plano
Assignee: Toyota Motor Engineering & Manufacturing North America, Inc.
Current Assignee: Toyota Motor Engineering & Manufacturing North America, Inc.
Current Assignee Address: US TX Plano
Agency: Darrow Mustafa PC
Agent: Christopher G. Darrow
Main IPC: G05D1/00
IPC: G05D1/00 ; B60W50/06 ; G05B13/04 ; G05B13/02 ; B60W50/00

Abstract:

A computer-implemented method of adaptively controlling an autonomous operation of a vehicle is provided. The method includes steps of (a) in a critic network in a computing system configured to autonomously control the vehicle, determining, using samples of passively collected data and a state cost, an estimated average cost, and an approximated cost-to-go function that produces a minimum value for a cost-to-go of the vehicle when applied by an actor network; and (b) in an actor network in the computing system and operatively coupled to the critic network, determining a control input to apply to the vehicle that produces the minimum value for the cost-to-go, wherein the actor network is configured to determine the control input by estimating a noise level using the average cost, a cost-to-go determined from the approximated cost-to-go function, a control dynamics for a current state of the vehicle, and the passively collected data.
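The critic/actor split described in the abstract can be illustrated with a loose sketch. This is not the patented method: the abstract gives no equations, so the sketch borrows the linearly solvable MDP formulation, in which Z(x) = exp(-V(x)) for a cost-to-go V and the linearized Bellman relation is Z_avg * Z(x) = exp(-q(x)) * E[Z(x')] under passive dynamics. The radial-basis features, the gradient-step critic update, the clipping bounds, the control law u = -rho * B^T grad V, and every name below (`Critic`, `control_input`, `rho`, `B`) are assumptions made for illustration. The noise-level estimate `rho` is passed in as a plain number; how the actor estimates it is not sketched.

```python
import numpy as np


def rbf(x, centers, width):
    """Gaussian radial-basis features of state x, shape (n_centers,)."""
    diff = x[None, :] - centers
    return np.exp(-0.5 * np.sum(diff ** 2, axis=1) / width ** 2)


def rbf_grad(x, centers, width):
    """Gradient of each feature with respect to x, shape (n_centers, dim)."""
    diff = x[None, :] - centers
    return -(diff / width ** 2) * rbf(x, centers, width)[:, None]


class Critic:
    """Hypothetical critic: Z(x) = w . phi(x), plus a scalar Z_avg."""

    def __init__(self, centers, width=1.0, lr=0.05):
        self.centers, self.width, self.lr = centers, width, lr
        self.w = np.ones(len(centers))  # weights of the Z approximation
        self.z_avg = 1.0                # Z_avg = exp(-average cost)

    def z(self, x):
        return float(self.w @ rbf(x, self.centers, self.width))

    def average_cost(self):
        return float(-np.log(self.z_avg))

    def update(self, x, x_next, state_cost):
        """One gradient step on the squared linearized-Bellman residual
        for a single passively collected transition (x, x_next)."""
        phi = rbf(x, self.centers, self.width)
        phi_next = rbf(x_next, self.centers, self.width)
        decay = np.exp(-state_cost)
        residual = self.z_avg * (self.w @ phi) - decay * (self.w @ phi_next)
        grad_w = residual * (self.z_avg * phi - decay * phi_next)
        grad_avg = residual * (self.w @ phi)
        # Assumed safeguards: keep Z positive and the average cost non-negative.
        self.w = np.maximum(self.w - self.lr * grad_w, 1e-6)
        self.z_avg = float(np.clip(self.z_avg - self.lr * grad_avg, 1e-6, 1.0))
        return float(residual)


def control_input(critic, x, B, rho):
    """Assumed control law u = -rho * B^T grad V(x), with V = -log Z.

    B is the control-dynamics matrix at x, shape (dim, n_controls);
    rho is a stand-in for the estimated noise level.
    """
    grad_z = critic.w @ rbf_grad(x, critic.centers, critic.width)
    grad_v = -grad_z / critic.z(x)
    return -rho * (B.T @ grad_v)
```

In use, the critic would be fed logged transitions one at a time, for example `critic.update(x, x_next, q)` for each recorded pair and its state cost `q`, after which `control_input(critic, x, B, rho)` returns a control vector for the current state. No exploratory action is taken at any point, which is the sense of "without active exploration" in the title.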

Public/Granted literature

US20180009445A1 ONLINE LEARNING AND VEHICLE CONTROL METHOD BASED ON REINFORCEMENT LEARNING WITHOUT ACTIVE EXPLORATION Public/Granted day: 2018-01-11

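IPC Classification:

G	Physics
G05	Controlling; Regulating
G05D	Systems for controlling or regulating non-electric variables (continuous casting of metals, see B22D11/16; valves per se, see F16K; sensing non-electric variables, see the relevant subclasses of G01; regulating electric or magnetic variables, see G05F)
G05D1/00	Control of position, course, altitude, or attitude of land, water, air, or space vehicles, e.g. automatic pilot (radio navigation systems or analogous systems using other waves, see G01S)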