[RL] DAC: The Double Actor-Critic Architecture for Learning Options

Date: November 22, 2022

The option framework is reformulated as two parallel augmented MDPs. Under this formulation, any policy optimization algorithm can be applied off the shelf to learn the intra-option policies, the termination conditions, and the master policy over options. Applying actor-critic (AC) algorithms to each augmented MDP yields the Double Actor-Critic (DAC) architecture. Combined with the PPO algorithm, DAC is evaluated empirically on challenging robot simulation tasks.
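To make the "two augmented MDPs" idea concrete, here is a minimal sketch, assuming the standard call-and-return execution model for options. The high-level decision picks an option from the pair (previous option, current state), and the low-level decision picks an action from the pair (current state, current option). All names (`high_policy`, `low_policy`, `rollout`) and the toy integer-state dynamics are hypothetical illustrations, not the authors' implementation, and no learning is performed.

```python
import random


def high_policy(prev_option, state, termination_prob, master_policy, rng):
    """High-level decision: choose o_t given (o_{t-1}, s_t).

    The previous option is kept unless it terminates; on termination
    (or at the first step) the master policy selects a new option.
    """
    if prev_option is not None:
        # rng.random() lies in [0, 1), so a termination probability of 0.0
        # always keeps the option and 1.0 always terminates it.
        if rng.random() >= termination_prob(prev_option, state):
            return prev_option
    return master_policy(state, rng)


def low_policy(state, option, intra_option_policies, rng):
    """Low-level decision: choose a_t given (s_t, o_t)."""
    return intra_option_policies[option](state, rng)


def rollout(num_steps, termination_prob, master_policy,
            intra_option_policies, seed=0):
    """Run a toy episode and return a list of (state, option, action)."""
    rng = random.Random(seed)
    state, option, trajectory = 0, None, []
    for _ in range(num_steps):
        option = high_policy(option, state, termination_prob,
                             master_policy, rng)
        action = low_policy(state, option, intra_option_policies, rng)
        trajectory.append((state, option, action))
        state += action  # toy deterministic dynamics (an assumption)
    return trajectory


if __name__ == "__main__":
    # Two hypothetical options: option 0 moves right, option 1 moves left.
    intra = {0: lambda s, rng: 1, 1: lambda s, rng: -1}
    master = lambda s, rng: 0 if s <= 0 else 1
    for step in rollout(4, lambda o, s: 1.0, master, intra):
        print(step)
```

In this reading, each of the two decision functions is an ordinary policy over its own augmented state, which is why a standard actor-critic learner could in principle be attached to each one separately.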

More information here


Jiarun Liu