# SkillBlender RL
We provide an implementation of [SkillBlender](https://github.com/Humanoid-SkillBlender/SkillBlender) in our framework.
- RL algorithm: `PPO` from [rsl_rl](https://github.com/leggedrobotics/rsl_rl) `v1.0.2`
- RL framework: `hierarchical RL`
- Simulator: `IsaacGym`
## Installation
```bash
pip install -e roboverse_learn/rl/rsl_rl
```
## Training
- IsaacGym:
```bash
python3 roboverse_learn/skillblender_rl/train.py --task "skillblender:Walking" --sim "isaacgym" --num_envs 1024 --robot "h1_wrist" --use_wandb
```
After training the `skillblender:Walking` or `skillblender:Stepping` task for a few minutes, you should see results like the following. Note that you should always use `h1_wrist` instead of the plain `h1`, so that the wrist links exist.
**To speed up training, click the IsaacGym viewer and press V to stop rendering.**
## Play
After training for a few minutes, you can run the following play script:
```bash
python3 roboverse_learn/skillblender_rl/play.py --task skillblender:Reaching --sim isaacgym --robot h1_wrist --load_run 2025_0628_232507 --checkpoint 15000
```
You should see rollout videos like these:
- Skillblender:Reaching
- Skillblender:Walking
## Checkpoints
We also provide [checkpoints](https://huggingface.co/RoboVerseOrg/ckeckpionts/blob/main/skillblender_reaching_ckpt.pt) trained with the RoboVerse humanoid infrastructure. To use one with `roboverse_learn/skillblender_rl/play.py`,
rename the file to `model_xxx.pt` and move it into the appropriate run directory, which should have the following layout:
```
outputs/
└── skillblender/
└── h1_wrist_reaching/ # Task name
└── 2025_0628_232507/ # Timestamped experiment folder
├── reaching_cfg.py # Config snapshot (copied from metasim/cfg/tasks/skillblender)
├── model_0.pt # Checkpoint at iteration 0
├── model_500.pt # Checkpoint at iteration 500
└── ...
```
When playing, pass the experiment folder name via `--load_run` and the checkpoint iteration via `--checkpoint`, as in the play command above.
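For example, to place the downloaded reaching checkpoint where the play command above expects it (the run folder and the iteration number `15000` are simply the values used in the example command; adjust them to your own setup):

```python
from pathlib import Path
import shutil

# Run folder and iteration number match the example play command above; adjust as needed.
run_dir = Path("outputs/skillblender/h1_wrist_reaching/2025_0628_232507")
run_dir.mkdir(parents=True, exist_ok=True)

# Rename the downloaded checkpoint so play.py can load it with --checkpoint 15000.
shutil.move("skillblender_reaching_ckpt.pt", str(run_dir / "model_15000.pt"))
```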
## Task list
> 4 Goal-Conditional Skills
- [x] Walking
- [x] Squatting
- [x] Stepping
- [x] Reaching
> 8 Loco-Manipulation Tasks
- [x] FarReach
- [x] ButtonPress
- [x] CabinetClose
- [x] FootballShoot
- [x] BoxPush
- [x] PackageLift
- [x] BoxTransfer
- [x] PackageCarry
## Supported robots
- [x] h1
- [x] g1
- [ ] h1_2
## Todos
- [x] domain randomization
- [x] pushing robot
- [ ] sim2sim
## How to add a new task
1. **Create your wrapper module**
   - Add a new wrapper file `abc_wrapper.py` under `roboverse_learn/skillblender_rl/env_wrappers`.
   - Add a config file `abc_cfg.py` under `metasim/cfg/tasks/skillblender`.
   - Define your reward functions in `reward_fun_cfg.py`, and check whether the currently available states and variables are enough for reward computation (see the sketch after this item).
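As a rough illustration, a reward term might look like the sketch below. The function name, signature, and the `extra` keys (`wrist_pos`, `wrist_pos_target`) are hypothetical; mirror the signature of the existing functions in `reward_fun_cfg.py`.

```python
import torch

def reward_wrist_pos(envstate, robot_name, cfg):
    """Hypothetical reward term: track a target wrist position (all names illustrative)."""
    wrist_pos = envstate[robot_name].extra["wrist_pos"]        # parsed by the wrapper
    target = envstate[robot_name].extra["wrist_pos_target"]    # parsed by the wrapper
    dist = torch.norm(wrist_pos - target, dim=-1)              # per-env distance
    return torch.exp(-4.0 * dist)                              # bounded reward in (0, 1]
```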
2. If the existing states are not enough, add variables or buffers by overriding `_init_buffers()`:
```python
def _init_buffers(self):
    super()._init_buffers()
    # define your extra variables or buffers here
    self.xxx = xxx
```
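For instance, a task that needs a per-environment goal position could allocate it as a buffer. The `goal_pos` name is made up for illustration, and `self.num_envs` / `self.device` are assumed to be set by the base wrapper.

```python
import torch

def _init_buffers(self):
    super()._init_buffers()
    # hypothetical per-env 3D goal position, later consumed by a custom reward
    self.goal_pos = torch.zeros(self.num_envs, 3, device=self.device)
```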
3. Parse your new states for reward computation if necessary:
```python
def _parse_new_states(self, envstate):
    """Parse the newly added states into the env state."""
    envstate[robot_name].extra["xxx"] = self.xxx

def _parse_state_for_reward(self, envstate):
    super()._parse_state_for_reward(envstate)
    self._parse_new_states(envstate)
```
4. Implement `_compute_observation()`
   - Fill `obs` and `privileged_obs` (see the sketch below).
   - Modify `_post_physics_step()` to reset the variables you defined when environments are reset via `reset_env_idx`.
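A rough sketch of `_compute_observation()` is shown below. The attribute names (`commands`, `dof_pos`, `dof_vel`, `actions`, `base_lin_vel`, `obs_buf`, `privileged_obs_buf`) follow the legged_gym convention and are assumptions here; take the real names and observation terms from an existing wrapper under `roboverse_learn/skillblender_rl/env_wrappers`.

```python
import torch

def _compute_observation(self, envstates):
    # Attribute names are assumptions borrowed from legged_gym-style wrappers;
    # the actual observation terms depend on your task and the base wrapper.
    self.obs_buf = torch.cat(
        (
            self.commands,  # task goal / command
            self.dof_pos,   # joint positions
            self.dof_vel,   # joint velocities
            self.actions,   # previous actions
        ),
        dim=-1,
    )
    # Privileged observations may additionally include ground-truth states
    # that are only available in simulation.
    self.privileged_obs_buf = torch.cat((self.obs_buf, self.base_lin_vel), dim=-1)
```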
5. Add a config for your task under `metasim/cfg/tasks/skillblender`.
## References and Acknowledgements
Our SkillBlender implementation is based on and inspired by the following projects:
- [SkillBlender](https://github.com/Humanoid-SkillBlender/SkillBlender)
- [Legged_gym](https://github.com/leggedrobotics/legged_gym)
- [rsl_rl](https://github.com/leggedrobotics/rsl_rl)
- [HumanoidVerse](https://github.com/LeCAR-Lab/HumanoidVerse/tree/master)