yata0 / Mahjong
☆9Updated 2 years ago
Alternatives and similar repositories for Mahjong:
Users who are interested in Mahjong are comparing it to the libraries listed below
- ☆20Updated 2 years ago
- An asynchronous/parallel implementation of the AlphaGo Zero algorithm for Gomoku☆196Updated 5 years ago
- Variational oracle guiding for reinforcement learning☆11Updated 2 years ago
- C++/Python Fight the Landlord with pybind11 (reinforcement learning AI for Dou Dizhu), accepted to AIIDE-2020☆159Updated 3 years ago
- Scalable implementation of Neural Fictitious Self-Play☆75Updated 6 years ago
- A novel parallel UCT algorithm with linear speedup and negligible performance loss.☆115Updated 3 years ago
- ☆40Updated 2 years ago
- ☆12Updated 2 years ago
- Advantage actor-critic reinforcement learning for OpenAI Gym CartPole☆63Updated 7 years ago
- Chinese Standard Mahjong Competition hosted by AILab in Peking University.☆97Updated 2 years ago
- An unofficial implementation of AlphaHoldem. 1v1 nl-holdem AI.☆80Updated last year
- A tiny re-implementation of AlphaGo Zero (in Gomoku)☆73Updated 6 years ago
- Counterfactual regret minimization algorithm for Kuhn poker☆169Updated 6 years ago
- Keeping track of RL experiments☆160Updated 2 years ago
- This code is based on the implementation of http://www.cs.cmu.edu/afs/cs/Web/People/sandholm/potential-aware_imperfect-recall.aaai14.pdf,…☆34Updated 6 years ago
- ☆47Updated last year
- ☆97Updated 4 years ago
- Source code for the paper "Divergence-Augmented Policy Optimization"☆37Updated 5 years ago
- Python Fan calculator for Chinese Standard Mahjong☆17Updated 3 weeks ago
- A PyTorch implementation of Distributed Proximal Policy Optimization (DPPO).☆62Updated 6 years ago
- mcc_second_guandan☆72Updated 2 years ago
- Translation and understanding of the Pop-art paper.☆17Updated 5 years ago
- PyTorch implementation of Distributed Proximal Policy Optimization: https://arxiv.org/abs/1707.02286☆181Updated 6 years ago
- ☆32Updated 4 years ago
- ☆140Updated 2 months ago
- Repo for the Greedy when Sure and Conservative when Uncertain about the Opponents (GSCU)☆19Updated 2 years ago
- An implementation of AlphaStar☆195Updated last year
- Simple implementation of the regret matching algorithm for computing the RPS Nash equilibrium via self-play☆25Updated 6 years ago
- ☆9Updated 2 years ago
- An implementation of DQfD (Deep Q-learning from Demonstrations), proposed by DeepMind: Learning from Demonstrations for Real World Reinforcement Le…☆132Updated 7 years ago