understanding-search / maze-transformer
This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks.
☆32 Updated this week
Alternatives and similar repositories for maze-transformer
Users who are interested in maze-transformer are comparing it to the libraries listed below.
- Interpreting how transformers simulate agents performing RL tasks ☆88 Updated 2 years ago
- A TinyStories LM with SAEs and transcoders ☆13 Updated 6 months ago
- ☆14 Updated last year
- ☆39 Updated last month
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆102 Updated last month
- An Open-Ended Agentic Simulator ☆52 Updated last year
- unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆79 Updated 3 years ago
- Benchmarking RL for POMDPs in Pure JAX [Code for "Structured State Space Models for In-Context Reinforcement Learning" (NeurIPS 2023)] ☆110 Updated last year
- see github.com/understanding-search/maze-transformer ☆10 Updated last year
- ☆53 Updated last year
- Official codebase for "Sampling For Learnability", published at NeurIPS 2024 ☆18 Updated last week
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025). ☆69 Updated 10 months ago
- Simple JAX Graphics Library. ☆36 Updated 11 months ago
- Code for the paper "Learning Temporal Distances: Contrastive Successor Features Can Provide a Metric Structure for Decision-Making" ☆28 Updated last year
- Scaling scaling laws with board games. ☆53 Updated 2 years ago
- Official Implementation of `An Optimisation Framework for Unsupervised Environment Design` from RLC 2025 ☆17 Updated 2 months ago
- TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods. ☆21 Updated last month
- maze datasets for investigating OOD behavior of ML systems ☆64 Updated last week
- ☆73 Updated last year
- Scalable Opponent Shaping Experiments in JAX ☆24 Updated last year
- Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy ☆21 Updated last year
- Code for "Unsupervised Zero-Shot RL via Functional Reward Representations" ☆57 Updated last year
- Code for Discovered Policy Optimisation (NeurIPS 2022) ☆12 Updated 2 years ago
- Code for Powderworld: A Platform for Understanding Generalization via Rich Task Distributions ☆69 Updated last year
- Official codebase for "The Generalization Gap in Offline Reinforcement Learning" accepted to ICLR 2024 ☆28 Updated last year
- ☆56 Updated 11 months ago
- Comparison between GFlowNets & Maximum Entropy RL ☆19 Updated last year
- ☆27 Updated 2 years ago
- Sparse Autoencoder Training Library ☆55 Updated 6 months ago
- VC-FB and MC-FB algorithms from "Zero-Shot Reinforcement Learning from Low Quality Data" (NeurIPS 2024) ☆20 Updated 9 months ago