RobertTLange / gym-hanoi
A Towers of Hanoi environment in OpenAI Gym Style
★13 · Updated 5 years ago
Alternatives and similar repositories for gym-hanoi:
Users who are interested in gym-hanoi are comparing it to the libraries listed below:
- A curated list of papers presented in the "Flexible Learning Reading Group" @ TU Berlin. Join us! · ★27 · Updated 4 years ago
- Baselines for gymnax · ★66 · Updated last year
- This repository contains code for the method and experiments of the paper "Learning with AMIGo: Adversarially Motivated Intrinsic Goals". · ★61 · Updated last year
- Progress, Notes, Summaries and a lot of Questions on Machine Learning · ★55 · Updated 5 years ago
- Continual Reinforcement Learning in 3D Non-stationary Environments · ★37 · Updated 5 years ago
- **Sferes2 module** A unifying modular framework for Quality-Diversity algorithms · ★22 · Updated 4 years ago
- ★53 · Updated 4 months ago
- ★85 · Updated 8 months ago
- ★37 · Updated 8 months ago
- ★56 · Updated 2 years ago
- Supplementary Data for Evolving Reinforcement Learning Algorithms · ★46 · Updated 4 years ago
- TorchingUp provides minimal implementations of common Reinforcement Learning algorithms written in PyTorch. It is meant to complement Ope… · ★47 · Updated 2 years ago
- JAX code for the paper "Control-Oriented Model-Based Reinforcement Learning with Implicit Differentiation" · ★43 · Updated 3 years ago
- A Jax/Stax implementation of the general meta learning paper: Oh, J., Hessel, M., Czarnecki, W.M., Xu, Z., van Hasselt, H.P., Singh, S. a… · ★21 · Updated 4 years ago
- ★31 · Updated 6 years ago
- Generalised UDRL · ★37 · Updated 2 years ago
- Official data and code for our paper Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning · ★48 · Updated 3 years ago
- Using RLLib and PycoLab to explore intelligent cooperative behavior in sequential social dilemmas · ★50 · Updated 2 years ago
- Accompanying code for "Learning and Planning in Average-Reward Markov Decision Processes" · ★14 · Updated 4 years ago
- Invariant Causal Prediction for Block MDPs · ★44 · Updated 4 years ago
- On the pitfalls of measuring emergent communication · ★34 · Updated 6 years ago
- ★28 · Updated 2 years ago
- Tutorials on learning and using successor representations. · ★52 · Updated 5 years ago
- JAX implementations of core Deep RL algorithms · ★79 · Updated 2 years ago
- krazy grid world · ★25 · Updated 5 years ago
- PyTorch Package For Quasimetric Learning · ★41 · Updated 4 months ago
- This code implements Prioritized Level Replay, a method for sampling training levels for reinforcement learning agents that exploits the … · ★83 · Updated 3 years ago
- A collection of meta-learning algorithms in Jax · ★22 · Updated 2 years ago
- Jax implementation of Proximal Policy Optimization (PPO) specifically tuned for Procgen, with benchmarked results and saved model weights… · ★53 · Updated 2 years ago
- using information theory to encourage agents to cooperate and compete · ★19 · Updated 6 years ago