Learning Vector Policy Fields for Continuous Control
My work on reinforcement learning networks has recently gained momentum, alongside the Value Iteration Networks from Aviv Tamar and the QMDP-Net from Peter Karkus.
I introduced Deep Vector Policy Fields (DVPF), an extension of this framework to the problem of continuous control of a real-world quadcopter. At the core of DVPF is the idea of representing a policy as a vector field while interpolating over states and actions. I then observed a connection between this interpolation of states and actions and the probability models that reinforcement learning networks learn.
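To give a feel for the vector-field view of a policy, here is a minimal sketch, not the actual DVPF implementation: a policy stored as one action vector per cell of a discretized 2D state grid, where a query at a continuous state bilinearly interpolates the action vectors of the four surrounding cells. The class and parameter names (`VectorPolicyField`, `grid_size`, etc.) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

class VectorPolicyField:
    """Toy vector policy field over a [0, 1]^2 state space (illustrative only)."""

    def __init__(self, grid_size=10, action_dim=2, seed=0):
        rng = np.random.default_rng(seed)
        # One action vector per grid cell: shape (grid_size, grid_size, action_dim).
        # A real method would learn these; here they are random placeholders.
        self.field = rng.standard_normal((grid_size, grid_size, action_dim))
        self.grid_size = grid_size

    def action(self, state):
        """Return an interpolated action for a continuous state in [0, 1]^2."""
        x, y = np.clip(state, 0.0, 1.0) * (self.grid_size - 1)
        x0, y0 = int(np.floor(x)), int(np.floor(y))
        x1 = min(x0 + 1, self.grid_size - 1)
        y1 = min(y0 + 1, self.grid_size - 1)
        tx, ty = x - x0, y - y0
        # Bilinear blend of the four neighbouring action vectors.
        return ((1 - tx) * (1 - ty) * self.field[x0, y0]
                + tx * (1 - ty) * self.field[x1, y0]
                + (1 - tx) * ty * self.field[x0, y1]
                + tx * ty * self.field[x1, y1])

policy = VectorPolicyField()
a = policy.action(np.array([0.37, 0.81]))
print(a.shape)  # (2,)
```

The interpolation is what makes the discrete grid usable for continuous control: nearby states produce smoothly varying actions instead of jumping between cell values.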
Here's a cool video of some of the results of DVPF on both a real and simulated quadcopter:
The details of the framework are in the paper below.
tanmay_deepvectorpolicyfields.pdf