AlignmentResearch / go_attack
☆89 · Updated 9 months ago
Alternatives and similar repositories for go_attack
Users who are interested in go_attack are comparing it to the repositories listed below.
- [IEEE ToG] MiniZero: An AlphaZero and MuZero Training Framework ☆106 · Updated 2 months ago
- Supplementary Data for Evolving Reinforcement Learning Algorithms ☆46 · Updated 4 years ago
- AlphaZero in JAX ☆78 · Updated last year
- Intrinsic Motivation from Artificial Intelligence Feedback ☆131 · Updated last year
- Scaling scaling laws with board games. ☆53 · Updated 2 years ago
- (NeurIPS 2023) ChessGPT - Bridging Policy Learning and Language Modeling ☆128 · Updated last year
- A novel parallel UCT algorithm with linear speedup and negligible performance loss. ☆122 · Updated 4 years ago
- ☆52 · Updated 2 years ago
- fast + parallel AlphaZero in JAX ☆103 · Updated 9 months ago
- Library for running a Monte Carlo tree search, either traditionally or with expert policies ☆126 · Updated last year
- Genetic programming using LLMs ☆48 · Updated 7 months ago
- Efficient baselines for autocurricula in JAX. ☆196 · Updated last year
- ☆23 · Updated last year
- MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection ☆46 · Updated 2 years ago
- ☆15 · Updated last year
- ☆57 · Updated last year
- Solving the Rubik's cube with deep reinforcement learning and Monte Carlo tree search ☆104 · Updated 6 years ago
- Code for the paper "Understanding RL Vision" ☆50 · Updated 2 years ago
- ☆31 · Updated 3 years ago
- A clean implementation of MuZero and AlphaZero following the AlphaZero General framework. Train and Pit both algorithms against each othe… ☆162 · Updated 4 years ago
- Pytorch Implementation of MuZero Unplugged for gym environment. This algorithm is capable of supporting a wide range of action and observ… ☆33 · Updated 3 months ago
- A fast, generalized, and modified implementation of Deepmind's distinguished AlphaZero in PyTorch. ☆80 · Updated 10 months ago
- A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environm… ☆41 · Updated 3 years ago
- CleanRL's implementation of DeepMind's Podracer Sebulba Architecture for Distributed DRL ☆116 · Updated last year
- Learn online intrinsic rewards from LLM feedback ☆43 · Updated 10 months ago
- An environment for learning formal mathematical reasoning from scratch ☆71 · Updated last year
- An implementation of MuZero in JAX. ☆57 · Updated 2 years ago
- ☆193 · Updated 2 years ago
- Meta-Learning for Compositionality (MLC) for modeling human behavior ☆143 · Updated last year
- Official repository of the spotlight ICML 2025 paper, PokeChamp: an Expert-level Minimax Language Agent. ☆109 · Updated 3 weeks ago