understanding-search / maze-dataset
maze datasets for investigating OOD behavior of ML systems
☆25 Updated this week
Alternatives and similar repositories for maze-dataset:
Users interested in maze-dataset are comparing it to the libraries listed below:
- ☆79 Updated 7 months ago
- Implements the Messenger environment and EMMA model. ☆23 Updated last year
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks. ☆26 Updated 5 months ago
- ☆80 Updated 6 months ago
- ☆51 Updated 8 months ago
- ☆20 Updated 2 years ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆121 Updated 3 months ago
- Code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity" ☆40 Updated last year
- A Concept-Centric Framework for Intelligent Agents ☆12 Updated last month
- ☆53 Updated 3 months ago
- A curated paper list on neural symbolic and probabilistic logic. ☆120 Updated last year
- PyTorch Package For Quasimetric Learning ☆41 Updated 3 months ago
- Learning to Modulate pre-trained Models in RL (Decision Transformer, LoRA, Fine-tuning) ☆54 Updated 4 months ago
- The official repository for our paper "Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks". We… ☆46 Updated last year
- ☆12 Updated 3 years ago
- Interpreting how transformers simulate agents performing RL tasks ☆77 Updated last year
- ☆28 Updated 2 months ago
- ☆21 Updated 4 months ago
- Phy-Q: A Testbed for Physical Reasoning ☆43 Updated 6 months ago
- Code for LaMPP: Language Models as Probabilistic Priors for Perception and Action ☆35 Updated last year
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆26 Updated 5 months ago
- ☆19 Updated 3 years ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆115 Updated 5 months ago
- Rewarded soups official implementation ☆55 Updated last year
- Official code for "Can Wikipedia Help Offline Reinforcement Learning?" by Machel Reid, Yutaro Yamada and Shixiang Shane Gu ☆102 Updated 2 years ago
- Code and data for the paper "Understanding Hidden Context in Preference Learning: Consequences for RLHF" ☆28 Updated last year
- Official codebase for "The Generalization Gap in Offline Reinforcement Learning" accepted to ICLR 2024 ☆28 Updated 6 months ago
- A library for efficient patching and automatic circuit discovery. ☆53 Updated this week
- ☆26 Updated last year
- A simple and scalable agent for training adaptive policies with sequence-based RL ☆111 Updated this week