SuperSuit

SuperSuit makes common environment preprocessing easy to apply to OpenAI Gym and PettingZoo environments. A notable use case is applying the large set of wrappers needed to learn Atari to both Gym and PettingZoo Atari environments without any code changes. The library has been popular with users new to RL and has been an essential internal tool for the PettingZoo team.
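
As a rough illustration of that use case, the sketch below applies the same Atari preprocessing chain to a single-agent Gym environment and to a multi-agent PettingZoo environment. The specific environment ids and version suffixes (SpaceInvadersNoFrameskip-v4, space_invaders_v2) are assumptions and depend on the Gym/PettingZoo versions installed:

import gym
import supersuit as ss
from pettingzoo.atari import space_invaders_v2  # version suffix is an assumption

def atari_preprocess(env):
    # standard Atari preprocessing: max over consecutive frames, sticky actions,
    # frame skipping, downscaling to 84x84, and stacking the last 4 frames
    env = ss.max_observation_v0(env, 2)
    env = ss.sticky_actions_v0(env, repeat_action_probability=0.25)
    env = ss.frame_skip_v0(env, 4)
    env = ss.resize_v0(env, 84, 84)
    env = ss.frame_stack_v1(env, 4)
    return env

# the same wrapper chain works on both APIs without code changes
gym_env = atari_preprocess(gym.make("SpaceInvadersNoFrameskip-v4"))
pettingzoo_env = atari_preprocess(space_invaders_v2.parallel_env())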

My contribution involved taking a poorly designed god class for observation transformation and turning it into its current modular form, implementing all the necessary code and tests, and helping write the documentation.

I also designed and implemented the vector environment transformation in SuperSuit documented here, which allows multi-agent environments to be trained with Stable Baselines, a simple and popular RL framework. This trick (and other SuperSuit wrappers) powers the popular Towards Data Science PettingZoo tutorial. The code from that tutorial is shown below:

from stable_baselines3 import PPO
from pettingzoo.butterfly import pistonball_v4
import supersuit as ss

env = pistonball_v4.parallel_env(
    n_pistons=20, local_ratio=0, time_penalty=-0.1, continuous=True,
    random_drop=True, random_rotate=True, ball_mass=0.75,
    ball_friction=0.3, ball_elasticity=1.5, max_cycles=125)
# basic usage of SuperSuit: transforms the environment's observations
# to fit Stable Baselines' requirements and make them easier to learn from.
# See the tutorial for more details
env = ss.color_reduction_v0(env, mode='B')
env = ss.resize_v0(env, x_size=84, y_size=84)
env = ss.frame_stack_v1(env, 3)
# multi-agent vectorization trick
env = ss.pettingzoo_env_to_vec_env_v0(env)
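# (the parallel multi-agent env is exposed as a single-agent vector env,
# with each agent appearing as one sub-environment, so Stable Baselines
# can treat it as an ordinary vectorized training target)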
# multiprocessing: run 8 copies of the vectorized env across 4 CPUs
env = ss.concat_vec_envs_v0(env, 8, num_cpus=4, base_class='stable_baselines3')

model = PPO(
    "CnnPolicy", env, verbose=3, gamma=0.99, n_steps=125, ent_coef=0.01,
    learning_rate=0.00025, vf_coef=0.5, max_grad_norm=0.5, gae_lambda=0.95,
    n_epochs=4, clip_range=0.2, clip_range_vf=1)
model.learn(total_timesteps=2000000)
model.save("policy")
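
After training, the saved policy can be reloaded and watched in the environment. Below is a minimal sketch along the lines of the tutorial's evaluation loop, assuming the same PettingZoo and SuperSuit versions as above; note that the preprocessing chain has to match the one used for training:

from stable_baselines3 import PPO
from pettingzoo.butterfly import pistonball_v4
import supersuit as ss

model = PPO.load("policy")

# the evaluation env needs the same observation preprocessing as training
env = pistonball_v4.env()
env = ss.color_reduction_v0(env, mode='B')
env = ss.resize_v0(env, x_size=84, y_size=84)
env = ss.frame_stack_v1(env, 3)

env.reset()
for agent in env.agent_iter():
    obs, reward, done, info = env.last()
    act = model.predict(obs, deterministic=True)[0] if not done else None
    env.step(act)
    env.render()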