v0.4.5

PaParaZz1 released this 13 Dec 17:40


API Change

change the default examples for adding a new env from extending BaseEnv to using DingEnvWrapper
rename final_eval_reward to eval_episode_return in all related code (including envs and evaluators)
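
For downstream code that has to run against releases on both sides of the rename, a small compatibility helper may be useful. The following is only a sketch: it assumes the value is carried in a dict-like info object, and the helper name get_episode_return is hypothetical and not part of DI-engine. Only the two key names come from the change listed above.

```python
from typing import Any, Mapping

# Key names taken from the API change above.
NEW_KEY = "eval_episode_return"
OLD_KEY = "final_eval_reward"


def get_episode_return(info: Mapping[str, Any]) -> Any:
    """Read the episode return from a dict-like info object.

    Hypothetical helper (not a DI-engine API): prefers the new key and
    falls back to the old one for code written before the rename.
    """
    if NEW_KEY in info:
        return info[NEW_KEY]
    if OLD_KEY in info:
        return info[OLD_KEY]
    raise KeyError(f"neither {NEW_KEY!r} nor {OLD_KEY!r} found in info")


if __name__ == "__main__":
    print(get_episode_return({"eval_episode_return": 200.0}))  # new-style info
    print(get_episode_return({"final_eval_reward": 195.0}))    # old-style info
```

Preferring the new key means that an info object carrying both names resolves to the post-rename value.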

Env

add beergame supply chain optimization env (#512)
add env gym_pybullet_drones (#526)
rename eval reward to episode return (#536)

Algorithm

add policy gradient algo implementation (#544)
add MADDPG algo implementation (#550)
add IMPALA continuous algo implementation (#551)
add MADQN algo implementation (#540)

Enhancement

add new task IMPALA-type distributed training scheme (#321)
add load and save method for replaybuffer (#542)
add more DingEnvWrapper examples (#525)
add support for visualizing more evaluator info (#538)
add traceback log for subprocess env manager (#534)

Fix

fix halfcheetah td3 config file (#537)
fix mujoco action_clip args compatibility bug (#535)
fix atari a2c config entry bug
fix drex unittest compatibility bug

Style

add Roadmap issue of DI-engine (#548)
update related project link and new env doc

New Project

PPOxFamily: PPO x Family DRL Tutorial Course
ACE: [AAAI 2023] Official PyTorch implementation of paper "ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency".

Contributors: @PaParaZz1 @sailxjx @zjowowen @hiha3456 @Weiyuhong-1998 @kxzxvbk @song2181 @zerlinwang
