Online learning and vehicle control method based on reinforcement learning without active exploration
Abstract:
A computer-implemented method of adaptively controlling an autonomous operation of a vehicle is provided. The method includes steps of (a) in a critic network in a computing system configured to autonomously control the vehicle, determining, using samples of passively collected data and a state cost, an estimated average cost, and an approximated cost-to-go function that produces a minimum value for a cost-to-go of the vehicle when applied by an actor network; and (b) in an actor network in the computing system and operatively coupled to the critic network, determining a control input to apply to the vehicle that produces the minimum value for the cost-to-go, wherein the actor network is configured to determine the control input by estimating a noise level using the average cost, a cost-to-go determined from the approximated cost-to-go function, a control dynamics for a current state of the vehicle, and the passively collected data.
Information query
Patent Agency Ranking
0/0