mishgon / alphastrassen
Reproduction of the AlphaTensor paper for 2x2 matrices
☆17 Updated last year
Alternatives and similar repositories for alphastrassen
Users who are interested in alphastrassen are comparing it to the repositories listed below.
- A distributed GPU-centric experience replay system for large AI models. ☆18 Updated last year
- QuaRL is an open-source framework for systematically studying the effect of applying quantization to reinforcement learning algorithms. ☆68 Updated 2 years ago
- A novel parallel UCT algorithm with linear speedup and negligible performance loss. ☆119 Updated 4 years ago
- ☆29 Updated 2 years ago
- A Really Scalable RL Framework to 10k+ CPUs ☆33 Updated last year
- ICLR 2021: "Monte-Carlo Planning and Learning with Language Action Value Estimates" ☆33 Updated last year
- ☆18 Updated 2 years ago
- Comprehensive Implementation of Proximal Policy Optimization ☆10 Updated 3 years ago
- Parallel Monte Carlo Tree Search, see README.md for more detailed usage and information. ☆46 Updated 4 years ago
- ☆24 Updated 2 years ago
- Demonstrating the usage of FGYM: A Toolkit for benchmarking FPGA-accelerated Reinforcement Learning ☆13 Updated 3 years ago
- MineRL DDPG Agent to Obtain Diamond in Minecraft ☆14 Updated 5 years ago
- ☆32 Updated 9 months ago
- ☆41 Updated 3 years ago
- Generalized Proximal Policy Optimization with Sample Reuse (GePPO) ☆24 Updated last year
- Code of the paper: Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function ☆13 Updated 2 years ago
- Efficient Exploration through Bayesian Deep-Q Networks. ☆18 Updated 3 years ago
- Code for "Offline Meta-Reinforcement Learning with Advantage Weighting" [ICML 2021] ☆47 Updated 2 years ago
- Official implementation of NeurIPS22 paper “Multi-agent Dynamic Algorithm Configuration” ☆25 Updated 2 years ago
- Gated Transformer Model for Computer Vision ☆23 Updated 3 years ago
- A Deep-Reinforcement-Learning-Based Scheduler for FPGA HLS ☆14 Updated 4 years ago
- Must-read papers on Reinforcement Learning (RL) ☆50 Updated 4 years ago
- ☆30 Updated 2 years ago
- Codebase for "Uni[MASK]: Unified Inference in Sequential Decision Problems" ☆55 Updated 11 months ago
- A number of agents (PPO, MuZero) with a Perceiver-based NN architecture that can be trained to achieve goals in nethack/minihack environm… ☆41 Updated 2 years ago
- Author's implementation of ReBRAC, a minimalist improvement upon TD3+BC ☆14 Updated last year
- An unofficial implementation for online decision transformer ☆40 Updated 2 years ago
- An implementation of MuZero in JAX. ☆56 Updated 2 years ago
- This is the source code of RPG (Reward-Randomized Policy Gradient) ☆42 Updated 2 years ago
- Minimal implementation of multi-agent reinforcement learning algorithms ☆55 Updated 3 years ago