Reinforcement learning for guaranteed delivery of supplemental content
Abstract:
In some embodiments, a method receives a request for supplemental content to be provided in association with main content. The method selects an instance of supplemental content based on a long-term reward metric and a short-term reward metric. The long-term reward metric is based on feedback from delivery of a plurality of instances of supplemental content and a delivery status for a delivery constraint of one instance of supplemental content. The short-term reward metric is based on feedback from delivery of the one instance of supplemental content. The selected instance of supplemental content is provided to a client device.
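The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two reward metrics are computed or combined, so everything concrete here is an assumption made for illustration: the `Candidate` fields, the `delivery_deficit` formula, the linear blend controlled by `weight`, and the function names are all hypothetical, and the reward values are taken as already estimated rather than learned.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One instance of supplemental content (all fields are hypothetical)."""
    name: str
    short_term_reward: float  # estimated from feedback on this instance alone
    long_term_reward: float   # estimated from feedback across many instances
    delivered: int            # deliveries made so far
    target: int               # delivery constraint (guaranteed deliveries)


def delivery_deficit(c: Candidate) -> float:
    """Fraction of the delivery constraint still unmet, clamped to [0, 1]."""
    if c.target <= 0:
        return 0.0
    return min(1.0, max(0.0, 1.0 - c.delivered / c.target))


def score(c: Candidate, weight: float = 0.5) -> float:
    """Blend the two metrics; the linear blend is an assumed design choice."""
    # Assumption: delivery status scales the long-term metric upward
    # for instances that are behind on their delivery constraint.
    long_term = c.long_term_reward * (1.0 + delivery_deficit(c))
    return weight * long_term + (1.0 - weight) * c.short_term_reward


def select(candidates: list[Candidate], weight: float = 0.5) -> Candidate:
    """Return the candidate with the highest blended score."""
    return max(candidates, key=lambda c: score(c, weight))


if __name__ == "__main__":
    candidates = [
        Candidate("A", short_term_reward=0.9, long_term_reward=0.2,
                  delivered=100, target=100),
        Candidate("B", short_term_reward=0.4, long_term_reward=0.5,
                  delivered=20, target=100),
    ]
    print(select(candidates).name)
```

In this sketch, an instance that is behind on its delivery constraint can be chosen over one with stronger immediate feedback, which is the trade-off the two metrics are meant to capture. Setting `weight` to 0 reduces the choice to the short-term metric alone.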