Reinforcement learning for quadruped robots using Isaac Gym

The selection of low-level controllers is nontrivial and task-dependent. This paper proposes a network architecture that learns to combine multiple low-level controllers. Results show better performance than a traditional hybrid controller, but worse than the MPC implementation, on the velocity tracking task.
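
To make the idea of "learning to combine multiple low-level controllers" concrete, here is a minimal sketch of one possible combination scheme: a small gating layer maps the observation to softmax weights and blends the controllers' actions. This is an assumption for illustration, not the paper's actual architecture, which the summary above does not specify. The names `ControllerMixer`, `trot_controller`, and `stand_controller`, the 12-dimensional action, and the 4-dimensional observation are all hypothetical, and the gating weights are random rather than trained.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ControllerMixer:
    """Hypothetical gating layer: observation -> blend weights over controllers.

    Assumption: a single linear layer followed by softmax. In an RL setup
    these parameters would be trained (e.g. with a policy-gradient method);
    here they are only randomly initialized.
    """

    def __init__(self, obs_dim, num_controllers, seed=0):
        rng = random.Random(seed)
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(obs_dim)]
                  for _ in range(num_controllers)]
        self.b = [0.0] * num_controllers

    def weights(self, obs):
        # One logit per controller, normalized so the weights sum to 1.
        logits = [sum(w * o for w, o in zip(row, obs)) + b
                  for row, b in zip(self.W, self.b)]
        return softmax(logits)

    def act(self, obs, controllers):
        # Query every low-level controller, then take the weighted sum
        # of their actions, dimension by dimension.
        w = self.weights(obs)
        actions = [c(obs) for c in controllers]
        dim = len(actions[0])
        return [sum(w[k] * actions[k][i] for k in range(len(controllers)))
                for i in range(dim)]


# Placeholder low-level controllers (hypothetical): each maps an
# observation to 12 joint targets, one per joint of a quadruped.
def trot_controller(obs):
    return [0.1] * 12


def stand_controller(obs):
    return [0.0] * 12


mixer = ControllerMixer(obs_dim=4, num_controllers=2)
obs = [0.5, 0.0, 0.0, 0.1]  # hypothetical observation vector
action = mixer.act(obs, [trot_controller, stand_controller])
```

Because the weights form a convex combination, each blended joint target stays between the values proposed by the individual controllers. A hard-switching hybrid controller corresponds to the special case where one weight is 1 and the rest are 0.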