Iterative-DualRL: An Optimal Discriminator Weighted Imitation Perspective for Reinforcement Learning

¹UT Austin, ²UMass Amherst, ³Meta AI
* Equal contribution

Under construction.