Abstract: Many robotics problems are naturally formulated such that the extrinsic rewards to the agent are either sparse or missing altogether. These problem...
This week I learned about Exploration and Intrinsic Motivation.
This post extends my learning about Actor-Critic algorithms to the off-policy setting.
This post uses Deep Q-Networks to introduce off-policy algorithms.
This post introduces Actor-Critic algorithms as an extension of basic policy gradient algorithms such as REINFORCE.