Invention Grant
- Patent Title: Online machine learning with immediate rewards when real rewards are delayed
- Application No.: US17098829
- Application Date: 2020-11-16
- Publication No.: US12056584B2
- Publication Date: 2024-08-06
- Inventor: Oznur Alkan, Djallel Bouneffouf, Bei Chen, Elizabeth Daly
- Applicant: International Business Machines Corporation
- Applicant Address: US NY Armonk
- Assignee: International Business Machines Corporation
- Current Assignee: International Business Machines Corporation
- Current Assignee Address: US NY Armonk
- Agency: Scully, Scott, Murphy & Presser, P.C.
- Agent: Yuanmin Cai
- Main IPC: G06G7/48
- IPC: G06G7/48; G06F16/951; G06F18/21; G06F18/214; G06N20/00; G16H50/20

Abstract:
An online machine learning model such as an autonomous agent predicts an action. A processor associated with, or running, the online machine learning model observes an environment for an interval of time for a real reward associated with the action. Responsive to determining that the real reward is not received within the interval of time, the processor determines based on a criterion whether to allocate an immediate reward received within the interval of time to the online machine learning model, where the immediate reward is an approximation of the real reward. Responsive to determining that the immediate reward is to be allocated, the processor allocates the immediate reward to the online machine learning model. The online machine learning model further learns or retrains itself based on the immediate reward.
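The control flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the running-mean learner, the names `RunningMeanAgent`, `choose_reward`, and `step`, the `observe` callback, and the threshold-style criterion in the usage note are all hypothetical, since the abstract does not say which model or criterion is used. Here `None` stands for a reward that did not arrive within the observation interval.

```python
from typing import Callable, Optional, Tuple


class RunningMeanAgent:
    """Toy online learner (assumed): one running-mean reward estimate per action."""

    def __init__(self, n_actions: int) -> None:
        self.counts = [0] * n_actions
        self.values = [0.0] * n_actions

    def predict(self) -> int:
        # Greedy choice; ties resolve to the lowest action index.
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, action: int, reward: float) -> None:
        # Incremental mean update, so the model learns one reward at a time.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]


def choose_reward(
    real_reward: Optional[float],
    immediate_reward: Optional[float],
    criterion: Callable[[float], bool],
) -> Optional[float]:
    """Pick the reward to allocate after the observation interval ends."""
    if real_reward is not None:
        # The real reward arrived in time, so it is used directly.
        return real_reward
    if immediate_reward is not None and criterion(immediate_reward):
        # Real reward is delayed; the criterion accepts the approximation.
        return immediate_reward
    # Nothing to allocate for this interval.
    return None


def step(
    agent: RunningMeanAgent,
    observe: Callable[[int], Tuple[Optional[float], Optional[float]]],
    criterion: Callable[[float], bool],
) -> Tuple[int, Optional[float]]:
    """One predict / observe / allocate / learn cycle."""
    action = agent.predict()
    # observe is a hypothetical callback returning (real, immediate) rewards
    # seen during the interval for the chosen action.
    real_reward, immediate_reward = observe(action)
    reward = choose_reward(real_reward, immediate_reward, criterion)
    if reward is not None:
        agent.update(action, reward)
    return action, reward
```

For example, `step(RunningMeanAgent(2), lambda a: (None, 0.5), lambda r: r >= 0.1)` models an interval in which the real reward is delayed and an immediate reward of 0.5 passes an assumed threshold criterion, so the agent is updated with 0.5.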
Public/Granted literature
- US20220156637A1 ONLINE MACHINE LEARNING WITH IMMEDIATE REWARDS WHEN REAL REWARDS ARE DELAYED, Public/Granted day: 2022-05-19