States simulator for reinforcement learning models
Abstract:
A method, an apparatus and a product for generating a dataset for a reinforcement learning model. The method comprises: obtaining a plurality of different subsets of a set of features; for each subset of features, determining a policy using a Markov Decision Process, thereby obtaining a plurality of policies; obtaining a state comprising a valuation of each feature of the set of features; applying the plurality of policies on the state, thereby obtaining a plurality of suggested actions for the state, based on different projections of the state onto the different subsets of features; determining, for the state, one or more actions and corresponding scores thereof based on the plurality of suggested actions; and training a reinforcement learning model using the state and the one or more actions and corresponding scores thereof.
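The steps of the abstract can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and does not come from the source: the three binary features, the two actions, the toy dynamics in `toy_step`, the use of value iteration to solve each projected Markov Decision Process, and the scoring of an action by the fraction of policies that suggest it. The final training step is not shown; the sketch stops at the labelled record that such a model could be trained on.

```python
from collections import Counter
from itertools import combinations, product

FEATURES = ("a", "b", "c")  # hypothetical binary features
ACTIONS = (0, 1)            # hypothetical action set
GAMMA = 0.9                 # discount factor (assumed)


def project(state, subset):
    """Project a full state (feature -> value) onto a subset of features."""
    return tuple(state[f] for f in subset)


def toy_step(proj_state, action):
    """Hypothetical deterministic dynamics on a projected state.

    Action 1 flips the first projected feature, action 0 leaves it alone.
    The reward is the number of features set to 1 in the next state.
    """
    nxt = (proj_state[0] ^ action,) + proj_state[1:]
    return nxt, float(sum(nxt))


def solve_mdp(subset, sweeps=50):
    """Value iteration over the projected state space; returns a greedy policy."""
    states = list(product((0, 1), repeat=len(subset)))
    values = {s: 0.0 for s in states}

    def q(s, a):
        nxt, reward = toy_step(s, a)
        return reward + GAMMA * values[nxt]

    for _ in range(sweeps):
        values = {s: max(q(s, a) for a in ACTIONS) for s in states}
    return {s: max(ACTIONS, key=lambda a: q(s, a)) for s in states}


def label_state(state, policies):
    """Apply every policy to its projection of the state and score the votes."""
    suggested = [policy[project(state, subset)]
                 for subset, policy in policies.items()]
    counts = Counter(suggested)
    # Assumed scoring rule: share of policies that suggested each action.
    scores = {action: n / len(suggested) for action, n in counts.items()}
    return suggested, scores


# One policy per two-feature subset: ("a","b"), ("a","c"), ("b","c").
subsets = list(combinations(FEATURES, 2))
policies = {subset: solve_mdp(subset) for subset in subsets}

state = {"a": 0, "b": 1, "c": 0}
suggested, scores = label_state(state, policies)
record = {"state": state, "scores": scores}  # one row of the generated dataset
print(suggested, scores)
```

In this toy setting each policy sees only its own projection of the state, so the policies for `("a", "b")` and `("a", "c")` suggest flipping `a`, while the policy for `("b", "c")` sees `b` already set and suggests doing nothing. The resulting scores reflect that disagreement rather than a single authoritative action.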