understanding-search / maze-dataset
maze datasets for investigating OOD behavior of ML systems
☆54 · Updated last month
Alternatives and similar repositories for maze-dataset
Users that are interested in maze-dataset are comparing it to the libraries listed below
- ☆101 · Updated last year
- This is code for most of the experiments in the paper Understanding the Effects of RLHF on LLM Generalisation and Diversity ☆46 · Updated last year
- Code for Contrastive Preference Learning (CPL) ☆175 · Updated 10 months ago
- Rewarded soups official implementation ☆60 · Updated last year
- ☆106 · Updated 7 months ago
- Reinforcement Learning via Regressing Relative Rewards ☆36 · Updated 9 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆169 · Updated 4 months ago
- ☆63 · Updated 6 months ago
- ☆70 · Updated last year
- Benchmarking Agentic LLM and VLM Reasoning On Games ☆193 · Updated last month
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆123 · Updated last year
- ☆186 · Updated last year
- Paper collections of the continuous effort start from World Models. ☆184 · Updated last year
- This is the official repository for the "Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP" paper acce… ☆22 · Updated last year
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆118 · Updated last year
- Official pytorch implementation of "Interpreting the Second-Order Effects of Neurons in CLIP" ☆39 · Updated 10 months ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆118 · Updated 5 months ago
- ☆138 · Updated 2 months ago
- ☆131 · Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆190 · Updated 5 months ago
- SmartPlay is a benchmark for Large Language Models (LLMs). Uses a variety of games to test various important LLM capabilities as agents. … ☆140 · Updated last year
- ☆34 · Updated 6 months ago
- ☆33 · Updated 8 months ago
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks. ☆31 · Updated last year
- A library for efficient patching and automatic circuit discovery. ☆76 · Updated 2 months ago
- [NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective ☆35 · Updated last week
- ☆69 · Updated 10 months ago
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆20 · Updated 5 months ago
- SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters (ICLR 2025) ☆15 · Updated last month
- Reasoning with Language Model is Planning with World Model ☆171 · Updated 2 years ago