v0.4.5

@PaParaZz1 PaParaZz1 released this 13 Dec 17:40
· 234 commits to main since this release

API Change

  1. switch the default examples for adding a new env from extending BaseEnv to using DingEnvWrapper
  2. rename final_eval_reward to eval_episode_return in all related code (including envs and evaluators)
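
Downstream code that still reads the old key may need a small migration step. The sketch below shows one way to do that, assuming the value is carried in a plain dict. The helper `migrate_info` and the sample dict are hypothetical illustrations and are not part of DI-engine.

```python
from typing import Any, Dict

OLD_KEY = "final_eval_reward"    # key name used before v0.4.5
NEW_KEY = "eval_episode_return"  # key name used from v0.4.5 onward


def migrate_info(info: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of ``info`` with the old key renamed to the new one.

    Hypothetical helper for illustration only; not a DI-engine API.
    """
    migrated = dict(info)  # shallow copy so the caller's dict is untouched
    if OLD_KEY in migrated:
        value = migrated.pop(OLD_KEY)
        # If both keys are present, keep the existing new-style value.
        migrated.setdefault(NEW_KEY, value)
    return migrated


# Sample data is made up for the example.
old_info = {"final_eval_reward": 195.0, "episode_length": 200}
new_info = migrate_info(old_info)
print(new_info)
```

The helper copies the dict rather than mutating it, so logs or checkpoints that still hold the old key remain readable during a transition.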

Env

  1. add beergame supply chain optimization env (#512)
  2. add env gym_pybullet_drones (#526)
  3. rename eval reward to episode return (#536)

Algorithm

  1. add policy gradient algo implementation (#544)
  2. add MADDPG algo implementation (#550)
  3. add IMPALA continuous algo implementation (#551)
  4. add MADQN algo implementation (#540)

Enhancement

  1. add new task IMPALA-type distributed training scheme (#321)
  2. add load and save methods for replay buffer (#542)
  3. add more DingEnvWrapper examples (#525)
  4. add support for visualizing more evaluator info (#538)
  5. add traceback log for subprocess env manager (#534)

Fix

  1. fix halfcheetah td3 config file (#537)
  2. fix mujoco action_clip args compatibility bug (#535)
  3. fix atari a2c config entry bug
  4. fix drex unittest compatibility bug

Style

  1. add Roadmap issue of DI-engine (#548)
  2. update related project link and new env doc

New Project

  1. PPOxFamily: PPO x Family DRL Tutorial Course
  2. ACE: [AAAI 2023] Official PyTorch implementation of paper "ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency".

Contributors: @PaParaZz1 @sailxjx @zjowowen @hiha3456 @Weiyuhong-1998 @kxzxvbk @song2181 @zerlinwang