maksimKorzh / tictactoe-mtcs
☆18Nov 10, 2020Updated 5 years ago
Alternatives and similar repositories for tictactoe-mtcs
Users who are interested in tictactoe-mtcs are comparing it to the libraries listed below.
- Lecture notes for a course on Decision and Game Theory for undergraduates studying AI☆13Dec 14, 2018Updated 7 years ago
- GAIL learning to imitate PPO playing CartPole.☆12May 27, 2021Updated 4 years ago
- Actor-Sharer-Learner training framework for off-policy DRL algorithms☆22Dec 29, 2024Updated last year
- Results reproductions & comparisons between OpenSpiel implementations, associated paper & originating works☆18Mar 2, 2021Updated 4 years ago
- CFR implementation of a poker bot.☆12Feb 17, 2023Updated 2 years ago
- ☆21Dec 22, 2020Updated 5 years ago
- Implementations of a large collection of reinforcement learning algorithms.☆28Nov 30, 2023Updated 2 years ago
- Tutorial: Writing R and Python Packages with Multithreaded C++ Code using BLAS, AVX2/AVX512, OpenMP, C++11 Threads and Cuda GPU accelerat…☆13Nov 27, 2022Updated 3 years ago
- A collection of different PyTorch wrappers for training neural networks and reinforcement algorithms☆13Dec 15, 2022Updated 3 years ago
- Codebase for the paper "How Crucial is Transformer in Decision Transformer?". Containing experiments on different pendulum tasks and code…☆28Mar 24, 2023Updated 2 years ago
- ☆13Dec 13, 2024Updated last year
- Wuhan metro route planning based on Dijkstra's algorithm☆10Jul 1, 2022Updated 3 years ago
- Some microbenchmarks and design docs before commencement☆12Feb 1, 2021Updated 5 years ago
- Mahjong game code built on the RLCard platform, including rule-based and Dueling DQN-based agent models.☆32Apr 25, 2022Updated 3 years ago
- 🚀 Train your own VLA "large model" end to end: a 26M-parameter vision multimodal VLM trained from scratch in just 1 hour!☆26Oct 16, 2025Updated 4 months ago
- ☆10Oct 11, 2022Updated 3 years ago
- A QA system based on k8s-specific knowledge, built on ChatGLM2-6B and served by Ray.☆10Sep 14, 2023Updated 2 years ago
- Gym environment for playing Wordle with RL agents☆42Feb 8, 2022Updated 4 years ago
- Swarm learning algorithm☆11Jun 2, 2021Updated 4 years ago
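- 2019 Fall - Game theory and Multi-agent RL term project☆11Dec 13, 2019Updated 6 years ago
- nd009-cn-advanced-p5, for the Udacity CN MLND P5 project☆14Jun 27, 2022Updated 3 years ago
- An open-source math large-model project that explores whether large models are capable of mathematical creativity and what potential they have in frontier mathematical research.☆17May 16, 2025Updated 9 months ago
- Uses the Q-learning reinforcement learning algorithm to plan 3D printing paths, reducing nozzle turns and start-stops to improve printing efficiency.☆12Jun 30, 2021Updated 4 years ago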
- Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow - TensorFlow Im…☆13Feb 2, 2019Updated 7 years ago
- An implementation of the AlphaZero algorithm for adversarial games to be used with the machine learning framework of your choice☆12Aug 30, 2020Updated 5 years ago
- Robust Reinforcement Learning Benchmark☆12Sep 22, 2024Updated last year
- MLflow App Using React, Hooks, RabbitMQ, FastAPI Server, Celery, Microservices☆11Sep 25, 2022Updated 3 years ago
- ☆12Mar 6, 2023Updated 2 years ago
- A sample library for using Sphinx to generate a document.☆10May 24, 2025Updated 8 months ago
- Poker hand evaluation for Go☆12Feb 7, 2014Updated 12 years ago
- Gym wrapper for pysc2☆10Sep 16, 2022Updated 3 years ago
- CFR-based Texas Hold'em AI☆11Jan 30, 2021Updated 5 years ago
- Third-place submission to the NeurIPS MineRL competition 2019☆10Mar 24, 2023Updated 2 years ago
- Official implementation of Recurrent Action Transformer with Memory, an offline RL agent with memory mechanisms. https://sites.google.com…☆18Nov 23, 2025Updated 2 months ago
- ☆11Apr 26, 2025Updated 9 months ago
- PPO with multi-head/autoregressive action outputs☆45Mar 4, 2021Updated 4 years ago
- Python code for retrieving data from the Pinnacle Sports site and loading it into a database☆11May 25, 2024Updated last year
- Simple PyTorch profiler that combines DeepSpeed Flops Profiler and TorchInfo☆11Feb 12, 2023Updated 3 years ago
- Implementation of Elo rating for large competitions☆10Nov 25, 2016Updated 9 years ago