AUTOMATED POLICY FUNCTION ADJUSTMENT USING REINFORCEMENT LEARNING ALGORITHM

    Publication No.: US20230298080A1

    Publication date: 2023-09-21

    Application No.: US18108916

    Application date: 2023-02-13

    CPC classification number: G06Q30/0617 G06N3/092

    Abstract: An online system may receive, from a content provider, a content presentation campaign that includes one or more objectives. The online system may define a set of one or more policy functions that automatically controls the content presentation campaign. A policy function may control one or more criteria in bidding content slots. The online system may monitor a realized outcome of the content presentation campaign. The online system may apply a reinforcement learning algorithm in adjusting the set of policy functions. The reinforcement learning algorithm adjusts one or more parameters in the set of policy functions to reduce a difference between the realized outcome and the desired outcome set by the content provider. The online system generates an adjusted set of policy functions and uses the adjusted set of policy functions in bidding content slots to present one or more content items provided by the content provider.
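
The adjustment loop in this abstract can be illustrated with a minimal sketch. It assumes a single policy parameter (a bid multiplier), a made-up simulated campaign, and an epsilon-greedy bandit as the reinforcement learning algorithm; none of these choices come from the patent, and the names `simulate_campaign` and `tune_bid_multiplier` are hypothetical.

```python
import random

def simulate_campaign(bid_multiplier, rng):
    """Hypothetical environment: realized spend grows with the bid multiplier, plus noise."""
    return 100.0 * bid_multiplier + rng.gauss(0.0, 2.0)

def tune_bid_multiplier(desired_spend, candidates, rounds=500, epsilon=0.1, seed=0):
    """Epsilon-greedy search for the multiplier whose realized outcome is closest to the target."""
    rng = random.Random(seed)
    value = {m: 0.0 for m in candidates}  # running mean reward per multiplier
    count = {m: 0 for m in candidates}    # number of times each multiplier was tried
    for _ in range(rounds):
        untried = [m for m in candidates if count[m] == 0]
        if untried:
            choice = untried[0]                  # try every candidate once first
        elif rng.random() < epsilon:
            choice = rng.choice(candidates)      # explore
        else:
            choice = max(candidates, key=lambda m: value[m])  # exploit
        realized = simulate_campaign(choice, rng)
        # Reward is the negative gap between realized and desired outcome,
        # so maximizing reward reduces that difference.
        reward = -abs(realized - desired_spend)
        count[choice] += 1
        value[choice] += (reward - value[choice]) / count[choice]
    return max(candidates, key=lambda m: value[m])

best = tune_bid_multiplier(desired_spend=150.0, candidates=[0.5, 1.0, 1.5, 2.0])
```

The reward definition is the only part taken from the abstract's wording; a production system would presumably adjust several policy functions at once and learn from live auction outcomes rather than a simulator.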

    TRAINING A MODEL TO PREDICT LIKELIHOODS OF USERS PERFORMING AN ACTION AFTER BEING PRESENTED WITH A CONTENT ITEM

    Publication No.: US20220398605A1

    Publication date: 2022-12-15

    Application No.: US17343026

    Application date: 2021-06-09

    Abstract: An online concierge system trains a user interaction model to predict a probability of a user performing an interaction after one or more content items are displayed to the user. This provides a measure of an effect of displaying content items to the user on the user performing one or more interactions. The user interaction model is trained from displaying content items to certain users of the online concierge system and withholding display of the content items to other users of the online concierge system. To train the user interaction model, the user interaction model is applied to labeled examples identifying a user and value based on interactions the user performed after one or more content items were displayed to the user and interactions the user performed when one or more content items were not used.
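
The display-versus-withhold design described here can be sketched with a simple estimator. The example below is not the trained user interaction model from the abstract; it swaps in a plain difference in interaction rates between the two groups, computed per user segment. The function name `estimate_uplift` and the tuple layout are assumptions made for illustration.

```python
from collections import defaultdict

def estimate_uplift(examples):
    """Estimate the effect of display per segment.

    examples: iterable of (segment, was_shown, interacted) tuples, where
    was_shown is True for users who saw the content item and False for
    users from whom it was withheld.
    """
    # For each segment, keep [interaction count, user count] for both groups.
    totals = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for segment, was_shown, interacted in examples:
        cell = totals[segment][bool(was_shown)]
        cell[0] += int(interacted)
        cell[1] += 1

    uplift = {}
    for segment, groups in totals.items():
        shown_hits, shown_n = groups[True]
        held_hits, held_n = groups[False]
        if shown_n == 0 or held_n == 0:
            continue  # no comparison is possible without both groups
        uplift[segment] = shown_hits / shown_n - held_hits / held_n
    return uplift
```

A positive value suggests that displaying the content item raised the interaction rate in that segment; a learned model would generalize the same contrast to individual users and their features.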

    TRAINING A MODEL TO PREDICT LIKELIHOODS OF USERS PERFORMING AN ACTION AFTER BEING PRESENTED WITH A CONTENT ITEM

    Publication No.: US20250095007A1

    Publication date: 2025-03-20

    Application No.: US18967681

    Application date: 2024-12-04

    Applicant: Maplebear Inc.

    Abstract: An online concierge system trains a user interaction model to predict a probability of a user performing an interaction after one or more content items are displayed to the user. This provides a measure of an effect of displaying content items to the user on the user performing one or more interactions. The user interaction model is trained from displaying content items to certain users of the online concierge system and withholding display of the content items to other users of the online concierge system. To train the user interaction model, the user interaction model is applied to labeled examples identifying a user and value based on interactions the user performed after one or more content items were displayed to the user and interactions the user performed when one or more content items were not used.

    Training a model to predict likelihoods of users performing an action after being presented with a content item

    Publication No.: US11593819B2

    Publication date: 2023-02-28

    Application No.: US17343026

    Application date: 2021-06-09

    Abstract: An online concierge system trains a user interaction model to predict a probability of a user performing an interaction after one or more content items are displayed to the user. This provides a measure of an effect of displaying content items to the user on the user performing one or more interactions. The user interaction model is trained from displaying content items to certain users of the online concierge system and withholding display of the content items to other users of the online concierge system. To train the user interaction model, the user interaction model is applied to labeled examples identifying a user and value based on interactions the user performed after one or more content items were displayed to the user and interactions the user performed when one or more content items were not used.

    GENERATING USER-SPECIFIC INCENTIVES BASED ON PREVIOUS ACTIVITY USING MACHINE-LEARNED LARGE LANGUAGE MODELS (LLMS)

    Publication No.: US20250139657A1

    Publication date: 2025-05-01

    Application No.: US18932041

    Application date: 2024-10-30

    Applicant: Maplebear Inc.

    Abstract: An online system accesses user behavior data and incentive data collected for a user prior to a current time period. The online system trains a behavior prediction model to receive user behavior data for a user and an incentive and output an incentive score using the collected user behavior data. The online system receives one or more candidate incentives generated by an incentive generation model based on the accessed user behavior data and incentive data. The online system applies each candidate incentive to the behavior prediction model to generate an incentive prediction describing a degree of user interaction of the particular user with the online system responsive to offering the candidate incentive to the user. The online system offers one or more candidate incentives to the user based on the determined incentive predictions.
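
The scoring and selection step in this abstract can be sketched as follows. The sketch assumes the candidate incentives have already been generated and that the behavior prediction model is available as a callable; `select_incentives`, `toy_predict`, and the dictionary fields are hypothetical stand-ins, not anything defined by the patent.

```python
def select_incentives(user_behavior, candidates, predict, top_k=1):
    """Score each candidate incentive with the prediction model and keep the best ones."""
    scored = [(predict(user_behavior, candidate), candidate) for candidate in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest predicted interaction first
    return [candidate for _, candidate in scored[:top_k]]

def toy_predict(user_behavior, incentive):
    """Stand-in for a trained behavior prediction model (illustrative only)."""
    affinity = user_behavior["category_affinity"].get(incentive["category"], 0.0)
    return affinity * incentive["discount"]

behavior = {"category_affinity": {"produce": 0.9, "snacks": 0.2}}
candidates = [
    {"category": "snacks", "discount": 0.3},
    {"category": "produce", "discount": 0.1},
    {"category": "produce", "discount": 0.2},
]
offers = select_incentives(behavior, candidates, toy_predict, top_k=2)
```

Keeping the predictor as a parameter separates candidate generation, which the abstract assigns to an incentive generation model, from the ranking step shown here.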

    Attributing Loss of Engagement with an Online System Using Temporal Partitioning of Training Data for a Churn Prediction Model

    Publication No.: US20250061350A1

    Publication date: 2025-02-20

    Application No.: US18233828

    Application date: 2023-08-14

    Applicant: Maplebear Inc.

    Abstract: An online system trains a churn prediction model to attribute a churn event to one or more causal events. The churn prediction model receives customer features and online system features as inputs. Various causal events that occur affect one or more online system features. To avoid biasing the churn prediction model using input features that are related to possible causal events, the online system determines customer features and online system features based on customer interactions occurring in different time intervals. The customer features are determined from interactions in a time interval that is earlier than a time interval from which interactions are used to determine online system features. Such time segmenting decorrelates the features input to the model from the events, reducing potential bias from the causal events on the churn prediction model.
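
The temporal partitioning described here can be illustrated with a short sketch. It assumes each interaction is a dictionary with a timestamp, an order value, and a late-delivery flag, and that a single split time separates the two intervals; the field names, the chosen features, and the function `partition_features` are assumptions, not details from the patent.

```python
from datetime import datetime

def partition_features(interactions, split_time):
    """Build customer features from the earlier interval and system features from the later one."""
    earlier = [e for e in interactions if e["time"] < split_time]
    later = [e for e in interactions if e["time"] >= split_time]

    # Customer features use only interactions that precede the split.
    customer_features = {
        "order_count": len(earlier),
        "avg_order_value": (
            sum(e["order_value"] for e in earlier) / len(earlier) if earlier else 0.0
        ),
    }
    # Online system features use only interactions on or after the split.
    system_features = {
        "late_delivery_rate": (
            sum(e["late_delivery"] for e in later) / len(later) if later else 0.0
        ),
    }
    return customer_features, system_features
```

Because the two feature sets never share an interaction, an event in the later interval (a run of late deliveries, for example) cannot leak into the customer features, which is the decorrelation the abstract aims for.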
