understanding-search / maze-dataset
maze datasets for investigating OOD behavior of ML systems
☆37 Updated this week
Alternatives and similar repositories for maze-dataset:
Users who are interested in maze-dataset are comparing it to the libraries listed below.
- ☆84 Updated 8 months ago
- ☆31 Updated 3 months ago
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks. ☆27 Updated 7 months ago
- This is code for most of the experiments in the paper Understanding the Effects of RLHF on LLM Generalisation and Diversity ☆40 Updated last year
- Rewarded soups official implementation ☆55 Updated last year
- Implements the Messenger environment and EMMA model. ☆23 Updated last year
- ☆58 Updated 9 months ago
- Bootstrapping ARC ☆105 Updated 4 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆148 Updated 4 months ago
- ☆81 Updated 7 months ago
- ☆90 Updated last month
- Official pytorch implementation of "Interpreting the Second-Order Effects of Neurons in CLIP" ☆34 Updated 4 months ago
- ☆21 Updated 6 months ago
- BASALT Benchmark datasets, evaluation code and agent training example. ☆20 Updated last year
- Interpretable text embeddings by asking LLMs yes/no questions (NeurIPS 2024) ☆37 Updated 4 months ago
- Official PyTorch Implementation of the Longhorn Deep State Space Model ☆50 Updated 3 months ago
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆153 Updated 11 months ago
- A Concept-Centric Framework for Intelligent Agents ☆14 Updated last week
- Efficient World Models with Context-Aware Tokenization. ICML 2024 ☆94 Updated 6 months ago
- Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL ☆48 Updated 5 months ago
- Code for Contrastive Preference Learning (CPL) ☆162 Updated 4 months ago
- ☆30 Updated 2 months ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆54 Updated last month
- VC-FB and MC-FB algorithms from "Zero-Shot Reinforcement Learning from Low Quality Data" (NeurIPS 2024) ☆13 Updated 2 months ago
- ☆34 Updated 11 months ago
- Codebase for PRISE: Learning Temporal Action Abstractions as a Sequence Compression Problem ☆22 Updated 8 months ago
- ☆30 Updated last year
- JAX reimplementation of the DeepMind paper "Genie: Generative Interactive Environments" ☆57 Updated 2 months ago
- [ICLR 2025] Unintentional Unalignment: Likelihood Displacement in Direct Preference Optimization ☆23 Updated 2 months ago
- ☆37 Updated last year