Reinforcement learning method for driver incentives: generative adversarial network for driver-system interactions

Invention Grant

US11861643B2 Reinforcement learning method for driver incentives: generative adversarial network for driver-system interactions (Status: In force)

Patent Title: Reinforcement learning method for driver incentives: generative adversarial network for driver-system interactions
Application No.: US17618864

Application Date: 2019-06-14
Publication No.: US11861643B2

Publication Date: 2024-01-02
Inventor: Wenjie Shang, Qingyang Li, Zhiwei Qin, Yiping Meng, Yang Yu, Jieping Ye
Applicant: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
Applicant Address: CN Beijing
Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
Current Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
Current Assignee Address: CN Beijing
Agency: METIS IP LLC
International Application: PCT/CN2019/091255 (2019-06-14)
International Publication: WO2020/248223A (2020-12-17)
National Phase Entry Date: 2021-12-13
Main IPC: G06N3/08
IPC: G06N3/08; G06Q50/30; G06Q30/0211; G06Q30/0208; G06Q30/0207

Abstract:

A system and method for determining a policy to prevent fading drivers are described. The system and method create virtual trajectories of incentives, such as coupons offered to drivers in a transportation hailing system, together with the corresponding states of the drivers in response to those incentives. A joint policy simulator is created from an incentive policy, a confounding incentive policy, and an incentive object policy to generate the simulated actions of drivers in response to different incentives. The rewards of the simulated driver actions are determined by a discriminator. The incentive policy for preventing fading drivers is optimized by reinforcement learning based on the virtual trajectories generated by the joint policy simulator and the discriminator.
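
The data flow named in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the one-dimensional driver "activity" state, the numeric constants, and the hand-written discriminator are all assumptions made for illustration. The sketch only shows how three policies might be composed into one simulator whose virtual trajectories are scored by a discriminator; the reinforcement learning step that optimizes the incentive policy is deliberately left out.

```python
import math
import random

# Every name, feature, and constant here is hypothetical and chosen only
# to illustrate the data flow; none of it comes from the patent text.

def incentive_policy(state, theta):
    """Platform side: map driver activity in [0, 1] to a coupon value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-theta * (1.0 - state)))

def confounding_policy(coupon, rng):
    """Unobserved factors, modeled here (as an assumption) as Gaussian noise."""
    return coupon + rng.gauss(0.0, 0.05)

def incentive_object_policy(state, effective_coupon):
    """Driver side: next activity level, clipped to [0, 1]."""
    nxt = 0.9 * state + 0.3 * effective_coupon
    return max(0.0, min(1.0, nxt))

def discriminator(state, next_state):
    """Stand-in for a trained discriminator: a score in (0, 1].

    A real discriminator would be learned from logged data; this toy version
    simply assumes that smooth transitions look more realistic.
    """
    return math.exp(-abs(next_state - state))

def rollout(theta, horizon=10, seed=0):
    """Generate one virtual trajectory of (state, coupon, next_state, reward)."""
    rng = random.Random(seed)
    state = 0.5
    trajectory = []
    for _ in range(horizon):
        coupon = incentive_policy(state, theta)
        effective = confounding_policy(coupon, rng)
        next_state = incentive_object_policy(state, effective)
        reward = discriminator(state, next_state)
        trajectory.append((state, coupon, next_state, reward))
        state = next_state
    return trajectory

if __name__ == "__main__":
    for step in rollout(theta=1.0, horizon=3):
        print(step)
```

In a fuller version, the trajectories returned by the hypothetical `rollout` would feed a reinforcement learning algorithm that updates the incentive policy parameters, and the discriminator would be a model trained against historical driver data rather than a fixed formula.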

Published/Granted Documents

US20220261833A1 Reinforcement Learning Method For Driver Incentives: Generative Adversarial Network For Driver-System Interactions. Publication date: 2022-08-18

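IPC Classification:

G	Physics
G06	Computing; calculating or counting
G06N	Computer systems based on specific computational models
G06N3/00	Computer systems based on biological models
G06N3/02	. Using neural network models
G06N3/08	.. Learning methods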