Alpha-Omok

This is a project of the Reinforcement Learning KR group.

AlphaZero is a reinforcement learning algorithm that effectively combines MCTS (Monte-Carlo Tree Search) with Actor-Critic. The Alpha-Omok team wanted to apply the AlphaZero algorithm to the famous board game Omok (Gomoku). Omok is a traditional game that uses the same board as Go, so we thought it would be a suitable game for applying AlphaZero. For now, the algorithm is implemented in PyTorch. A TensorFlow version will be released soon!

All the environments are implemented with pygame, so you need to install pygame to run the code in this repository!
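To make the combination of tree search and a policy/value network more concrete, here is a minimal sketch of the PUCT selection rule described in the AlphaGo Zero paper listed under Reference. It is not taken from this repository: the function names, the dictionary layout of `children`, and the default `c_puct=5.0` are all assumptions made for illustration.

```python
import math


def puct_score(q_value, prior, parent_visits, child_visits, c_puct=5.0):
    """Return Q(s, a) + U(s, a) for one child.

    U grows with the network prior and shrinks as the child is visited more.
    c_puct is an assumed exploration constant, not a value from this repo.
    """
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + exploration


def select_action(children, c_puct=5.0):
    """Pick the action with the highest PUCT score.

    `children` maps an action to a dict with keys 'q', 'prior', 'visits'
    (a hypothetical layout chosen for this example).
    """
    parent_visits = sum(child["visits"] for child in children.values())
    return max(
        children,
        key=lambda a: puct_score(
            children[a]["q"],
            children[a]["prior"],
            parent_visits,
            children[a]["visits"],
            c_puct,
        ),
    )


# Three candidate moves on a board, keyed by (row, col).
children = {
    (4, 4): {"q": 0.10, "prior": 0.60, "visits": 10},
    (4, 5): {"q": 0.30, "prior": 0.10, "visits": 2},
    (3, 3): {"q": 0.00, "prior": 0.30, "visits": 0},
}
print(select_action(children))
```

In this example the unvisited move with a decent prior gets the largest exploration bonus, which is how the prior steers the search toward moves the network likes before any value estimates exist.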


Training Result

Play Demo (Agent win) Play Demo (Agent win)

Project objective

There are 4 objectives to achieve in this project:

  1. MCTS on Tic-Tac-Toe
  2. MCTS on Omok
  3. AlphaZero on Omok
  4. Upload AlphaZero on web

Documents


Description of the Folders

1_tictactoe_MCTS

TicTacToe Image

This folder is for implementing MCTS in Tic-Tac-Toe. If you want to study MCTS only, please check the files in this folder.

The files in the folder are described below. (Files in bold text are implementation code.)

  • env: Tic-Tac-Toe environment code (made with pygame)
  • mcts_guide: MCTS doesn't play the game; it only recommends how to play.
  • mcts_vs: The user can play against the MCTS algorithm.
  • utils: Utility functions for implementing the algorithm.
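For readers who want to see the idea before opening the files, the following is a small, self-contained sketch of plain MCTS (UCT selection with random rollouts) that recommends a move on a Tic-Tac-Toe board. It does not use the repository's `env` or `utils`; the board encoding (a 9-character string), the class and function names, and the exploration constant `c=1.4` are assumptions for this example only.

```python
import math
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if a line is complete, 'draw' if full, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return "draw" if " " not in board else None


def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == " "]


def play(board, move, player):
    return board[:move] + player + board[move + 1:]


def other(player):
    return "O" if player == "X" else "X"


class Node:
    def __init__(self, board, player, parent=None, move=None):
        self.board = board      # 9-character string, ' ' means empty
        self.player = player    # player to move at this node
        self.parent = parent
        self.move = move        # move that led from the parent to this node
        self.children = []
        self.untried = legal_moves(board) if winner(board) is None else []
        self.visits = 0
        self.wins = 0.0         # credited to the player who moved into this node

    def best_child(self, c=1.4):
        # UCT: average result plus an exploration bonus for rarely visited children.
        return max(self.children, key=lambda ch: ch.wins / ch.visits
                   + c * math.sqrt(math.log(self.visits) / ch.visits))


def rollout(board, player):
    """Play uniformly random moves until the game ends; return the result."""
    result = winner(board)
    while result is None:
        board = play(board, random.choice(legal_moves(board)), player)
        player = other(player)
        result = winner(board)
    return result


def mcts(board, player, n_simulations=2000):
    """Recommend a move for `player`. Assumes the game is not already over."""
    root = Node(board, player)
    for _ in range(n_simulations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = node.best_child()
        # 2. Expansion: add one untried move as a new child.
        if node.untried:
            move = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(play(node.board, move, node.player),
                         other(node.player), node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        result = rollout(node.board, node.player)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            mover = other(node.player)  # player who moved into this node
            if result == mover:
                node.wins += 1.0
            elif result == "draw":
                node.wins += 0.5
            node = node.parent
    # Recommend the most visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move


# X has two in a row on the top line; cell 2 completes it.
print(mcts("XX OO    ", "X"))
```

Returning the most visited root move, rather than the one with the best average result, is a common choice because visit counts are less noisy than win rates for rarely tried moves.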

2_AlphaOmok

mini omok Image

This folder is for implementing the AlphaZero algorithm in the Omok environment. There are two versions of Omok (env_small: 9x9, env_regular: 15x15). The image above is a sample of the 9x9 Omok game.
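Both board sizes share the same winning condition, so a single check can serve 9x9 and 15x15 boards. The sketch below is an illustration only, not the code in `env_small` or `env_regular`: it assumes the board is a square list of lists with 0 for empty and 1 or 2 for stones, and it treats five or more stones in a row as a win, which may differ from the rule the environments actually apply.

```python
def count_in_direction(board, row, col, d_row, d_col):
    """Count stones matching board[row][col] in one direction, excluding (row, col)."""
    size = len(board)
    stone = board[row][col]
    count = 0
    r, c = row + d_row, col + d_col
    while 0 <= r < size and 0 <= c < size and board[r][c] == stone:
        count += 1
        r += d_row
        c += d_col
    return count


def is_winning_move(board, row, col, win_length=5):
    """Return True if the stone at (row, col) is part of a line of win_length or more.

    Assumption: overlines (six or more) also count as a win.
    """
    if board[row][col] == 0:
        return False
    # Horizontal, vertical, and the two diagonals.
    for d_row, d_col in ((0, 1), (1, 0), (1, 1), (1, -1)):
        total = (1
                 + count_in_direction(board, row, col, d_row, d_col)
                 + count_in_direction(board, row, col, -d_row, -d_col))
        if total >= win_length:
            return True
    return False


# A 9x9 board with five stones of player 1 in row 4, columns 2..6.
board = [[0] * 9 for _ in range(9)]
for col in range(2, 7):
    board[4][col] = 1
print(is_winning_move(board, 4, 4))  # True
```

Checking only the lines through the most recent stone keeps the cost independent of board size, which matters when the check runs inside every MCTS simulation.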

The files in the folder are described below. (Files in bold text are implementation code.)

  • eval_main: Code for evaluating the algorithm on both a local PC and the web
  • main: Main training code for AlphaZero
  • model: Network model (PyTorch)
  • agents: Agent and MCTS algorithm
  • utils: Utility functions for implementing the algorithm
  • WebAPI: Implementation of the web API
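One step that connects the MCTS agent to training in AlphaZero-style methods is turning root visit counts into a move distribution, which then serves as the policy target. The sketch below follows the formula in the second paper listed under Reference, where the probability of a move is proportional to its visit count raised to 1/temperature. Whether `main` and `agents` do exactly this is not stated here, and the function name and signature are hypothetical.

```python
def visit_policy(visit_counts, temperature=1.0):
    """Convert MCTS visit counts into probabilities.

    Each probability is proportional to N(a) ** (1 / temperature).
    A temperature of 1 keeps the visit proportions; a small temperature
    concentrates almost all probability on the most visited move.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [n ** (1.0 / temperature) for n in visit_counts]
    total = sum(scaled)
    if total == 0:
        raise ValueError("at least one move must have been visited")
    return [s / total for s in scaled]


counts = [10, 30, 60]
print(visit_policy(counts))                   # proportions of the raw counts
print(visit_policy(counts, temperature=0.1))  # sharply peaked on the last move
```

A higher temperature early in a self-play game encourages varied openings, while a low temperature later makes the agent play its strongest move almost deterministically.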

Sample Image of Web Demo

simple board example


Future Work

  • Apply parallel computation to improve computation speed
  • Make a TensorFlow version of the code
  • Train the agent to play the 15x15 Omok game
  • Apply the Renju rule

Reference

  1. Mastering the Game of Go with Deep Neural Networks and Tree Search
  2. Mastering the Game of Go without Human Knowledge

AlphaOmok Team

mini omok team

Kyushik Min

Jungdae Kim

Taeyoung Kim

Woongwon Lee