Counterfactual Policy Evaluation of Model Performance

    Publication No.: US20240220859A1

    Publication date: 2024-07-04

    Application No.: US18393349

    Filing date: 2023-12-21

    Applicant: Maplebear Inc.

    CPC classification number: G06N20/00 G06Q30/01

    Abstract: An online system uses an offline iterative clustering process to evaluate the performance of a set of content selection frameworks. In each iteration, the online system clusters the testing example data into a set of clusters, computes a set of framework scores for each generated cluster, and computes an improvement score for each cluster based on that cluster's framework scores. To determine whether to perform another iteration, the online system computes an aggregated improvement score from the improvement scores of the clusters. If the aggregated improvement score does not meet a threshold, the online system performs another iteration. When the iterative process finishes, the online system outputs the improvement scores of the most recent iteration.
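
The loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: the abstract does not say how examples are clustered, how scores are defined, or how clustering changes between iterations. The sketch assumes equal-size bucketing on a single numeric feature, mean logged reward as the framework score, "best framework minus baseline" as the improvement score, a size-weighted mean as the aggregate, and one extra cluster per iteration. All function and field names are hypothetical.

```python
from statistics import mean


def cluster_examples(examples, k):
    """Hypothetical clustering step: sort by one feature, cut into buckets."""
    ordered = sorted(examples, key=lambda e: e["feature"])
    size = -(-len(ordered) // k)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def framework_scores(cluster, frameworks):
    """Assumed framework score: mean logged reward per framework."""
    return {f: mean(e["reward"][f] for e in cluster) for f in frameworks}


def improvement_score(scores, baseline):
    """Assumed improvement: best framework in the cluster vs. the baseline."""
    return max(scores.values()) - scores[baseline]


def evaluate(examples, frameworks, baseline, threshold, max_iters=5):
    k = 2  # assumed starting number of clusters
    for _ in range(max_iters):
        clusters = cluster_examples(examples, k)
        improvements = [
            improvement_score(framework_scores(c, frameworks), baseline)
            for c in clusters
        ]
        total = sum(len(c) for c in clusters)
        # Assumed aggregate: mean improvement weighted by cluster size.
        aggregated = sum(len(c) * s for c, s in zip(clusters, improvements)) / total
        if aggregated >= threshold:
            break  # threshold met, stop iterating
        k += 1  # assumption: retry with a finer clustering
    # Improvement scores of the most recent iteration.
    return improvements, aggregated


examples = [
    {"feature": float(i),
     "reward": {"A": 1.0 if i < 4 else 0.0, "B": 0.0 if i < 4 else 1.0}}
    for i in range(8)
]
improvements, aggregated = evaluate(examples, ["A", "B"], "A", threshold=0.4)
```

Here framework "B" only helps the upper half of the examples, so the first iteration already yields one cluster with no improvement and one with a large improvement over baseline "A".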

    AUTOMATED POLICY FUNCTION ADJUSTMENT USING REINFORCEMENT LEARNING ALGORITHM

    Publication No.: US20230298080A1

    Publication date: 2023-09-21

    Application No.: US18108916

    Filing date: 2023-02-13

    CPC classification number: G06Q30/0617 G06N3/092

    Abstract: An online system may receive, from a content provider, a content presentation campaign that includes one or more objectives. The online system may define a set of one or more policy functions that automatically control the content presentation campaign. A policy function may control one or more criteria for bidding on content slots. The online system may monitor a realized outcome of the content presentation campaign and apply a reinforcement learning algorithm to adjust the set of policy functions. The reinforcement learning algorithm adjusts one or more parameters in the set of policy functions to reduce the difference between the realized outcome and the desired outcome set by the content provider. The online system generates an adjusted set of policy functions and uses it in bidding on content slots to present one or more content items provided by the content provider.
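
The feedback loop in this abstract (monitor the realized outcome, then adjust policy parameters to shrink the gap to the desired outcome) might look something like the sketch below. The abstract does not name a specific reinforcement learning algorithm, so this uses a very simple random-perturbation policy search as a stand-in. The single `bid` parameter, the simulated campaign, and the reward definition are all assumptions made for illustration.

```python
import random


def simulate_campaign(bid, rng):
    """Hypothetical environment: realized outcome grows with the bid, plus noise."""
    return 100.0 * bid + rng.gauss(0.0, 1.0)


def reward(realized, desired):
    """Assumed reward: negative gap between realized and desired outcome."""
    return -abs(realized - desired)


def adjust_policy(bid, desired, steps=300, step_size=0.2, seed=0):
    """Perturb the policy parameter and keep changes that raise the reward."""
    rng = random.Random(seed)
    best_reward = reward(simulate_campaign(bid, rng), desired)
    for _ in range(steps):
        candidate = max(0.0, bid + rng.uniform(-step_size, step_size))
        observed = reward(simulate_campaign(candidate, rng), desired)
        if observed > best_reward:
            bid, best_reward = candidate, observed
    return bid


initial_bid = 1.0
desired_outcome = 500.0
adjusted_bid = adjust_policy(initial_bid, desired_outcome)
```

In this toy setup the expected outcome is `100 * bid`, so the adjusted bid should drift toward 5.0 as the search accepts perturbations that bring the observed outcome closer to the target.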

    TREATMENT LIFT SCORE AGGREGATION FOR NEW TREATMENT TYPES

    Publication No.: US20230368236A1

    Publication date: 2023-11-16

    Application No.: US17744526

    Filing date: 2022-05-13

    CPC classification number: G06Q30/0211 G06Q30/0239 G06Q30/0617

    Abstract: An online concierge system uses a new treatment engine to score users for applying treatments of a new treatment type. The new treatment engine uses treatment models to generate treatment lift scores for the user. The new treatment engine applies an aggregation function model to the treatment lift scores to generate an aggregated lift score for the user. If the aggregated lift score exceeds a threshold, the new treatment engine applies a treatment of the new treatment type to the user. The new treatment engine trains the aggregation function model based on training examples used to train the treatment models. For a training example associated with a particular treatment type, the new treatment engine uses a target lift score generated by the treatment model for the treatment type to evaluate the performance of the aggregation function model, and to update the aggregation function model accordingly.
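
As a rough illustration of the scoring and training flow in this abstract, the sketch below assumes the aggregation function model is a plain weighted sum of the treatment lift scores, trained by gradient descent on squared error. It also assumes that, for a training example tied to one treatment type, that type's own lift score is masked from the input and used as the target. None of these choices are stated in the abstract, and all names are hypothetical.

```python
def aggregate(weights, lifts):
    """Assumed aggregation model: weighted sum of treatment lift scores."""
    return sum(w * x for w, x in zip(weights, lifts))


def train_step(weights, lifts, treatment_idx, lr=0.1):
    """One gradient step toward the target lift of the example's treatment type."""
    target = lifts[treatment_idx]
    # Assumption: hide the target treatment's own score from the input.
    masked = [0.0 if i == treatment_idx else x for i, x in enumerate(lifts)]
    error = aggregate(weights, masked) - target
    # Gradient of 0.5 * error**2 with respect to each weight is error * input.
    return [w - lr * error * x for w, x in zip(weights, masked)]


def should_apply_new_treatment(weights, lifts, threshold):
    """Apply the new treatment type only if the aggregated lift exceeds the threshold."""
    return aggregate(weights, lifts) > threshold


lifts = [0.2, 0.4, 0.6]        # hypothetical lift scores from three treatment models
weights = [0.0, 0.0, 0.0]
updated = train_step(weights, lifts, treatment_idx=0)
decision = should_apply_new_treatment([0.5, 0.5, 0.0], lifts, threshold=0.25)
```

A single step moves the weights for the unmasked treatment types in the direction that reduces the gap to the target lift score; repeated over many training examples, this would fit the weights used at scoring time.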
