- Patent Title: Learning method and learning device for supporting reinforcement learning by using human driving data as training data to thereby perform personalized path planning
- Application No.: US16740135
- Application Date: 2020-01-10
- Publication No.: US11074480B2
- Publication Date: 2021-07-27
- Inventor: Kye-Hyeon Kim, Yongjoong Kim, Hak-Kyoung Kim, Woonhyun Nam, SukHoon Boo, Myungchul Sung, Dongsoo Shin, Donghun Yeo, Wooju Ryu, Myeong-Chun Lee, Hyungsoo Lee, Taewoong Jang, Kyungjoong Jeong, Hongmo Je, Hojin Cho
- Applicant: Stradvision, Inc.
- Applicant Address: KR Pohang-si
- Assignee: Stradvision, Inc.
- Current Assignee: Stradvision, Inc.
- Current Assignee Address: KR Pohang-si
- Agency: Kaplan Breyer Schwarz, LLP
- Main IPC: G06K9/62
- IPC: G06K9/62; G05D1/00; G05D1/02; G06T15/20; G06T17/05

Abstract:
A learning method is provided for acquiring at least one personalized reward function, used for performing a Reinforcement Learning (RL) algorithm, corresponding to a personalized optimal policy for a subject driver. The method includes steps of: (a) a learning device performing a process of instructing an adjustment reward network to generate first adjustment rewards by referring to information on actual actions and actual circumstance vectors in driving trajectories, a process of instructing a common reward module to generate first common rewards by referring to the actual actions and the actual circumstance vectors, and a process of instructing an estimation network to generate actual prospective values by referring to the actual circumstance vectors; and (b) the learning device instructing a first loss layer to generate an adjustment reward loss and to perform backpropagation to learn parameters of the adjustment reward network.
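
The data flow in steps (a) and (b) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the patent: the linear form of the adjustment reward network and the estimation network, the rule inside `common_reward`, the squared residual used as the loss, and all function names, the learning rate, and the discount factor. It only shows how a personalized reward (common reward plus adjustment reward) and prospective values could feed a loss whose gradient updates the adjustment reward parameters alone.

```python
import random

def common_reward(state, action):
    # Hypothetical shared rule (assumption): penalize large control inputs.
    return -abs(action)

def adjustment_reward(weights, state, action):
    # Assumed linear "adjustment reward network" over [state..., action].
    features = list(state) + [action]
    return sum(w * f for w, f in zip(weights, features))

def estimated_value(value_weights, state):
    # Assumed linear "estimation network" giving a prospective value.
    return sum(v * s for v, s in zip(value_weights, state))

def train_step(weights, value_weights, trajectory, lr=0.01, gamma=0.9):
    """One gradient step on the adjustment weights; returns (new_weights, loss).

    The loss is an assumed mean squared residual between the personalized
    reward plus the discounted next value and the current value.
    The loss returned is the one measured before the update.
    """
    n = len(trajectory) - 1
    grads = [0.0] * len(weights)
    loss = 0.0
    for t in range(n):
        state, action = trajectory[t]
        next_state, _ = trajectory[t + 1]
        personalized = (common_reward(state, action)
                        + adjustment_reward(weights, state, action))
        residual = (personalized
                    + gamma * estimated_value(value_weights, next_state)
                    - estimated_value(value_weights, state))
        loss += residual ** 2 / n
        # Gradient of the squared residual w.r.t. the linear weights.
        for i, f in enumerate(list(state) + [action]):
            grads[i] += 2.0 * residual * f / n
    new_weights = [w - lr * g for w, g in zip(weights, grads)]
    return new_weights, loss

# Synthetic stand-in for one driving trajectory: (circumstance vector, action).
random.seed(0)
trajectory = [((random.uniform(-1, 1), random.uniform(-1, 1)),
               random.uniform(-1, 1)) for _ in range(20)]

weights = [0.0, 0.0, 0.0]      # adjustment reward parameters (learned)
value_weights = [0.5, -0.3]    # estimation network parameters (held fixed here)
losses = []
for _ in range(200):
    weights, loss = train_step(weights, value_weights, trajectory)
    losses.append(loss)
```

In this sketch only `weights` is updated, mirroring step (b), where backpropagation learns the parameters of the adjustment reward network; the common reward rule and the estimation network are left unchanged.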