understanding-search / maze-dataset
maze datasets for investigating OOD behavior of ML systems
☆67 Updated last month
Alternatives and similar repositories for maze-dataset
Users who are interested in maze-dataset are comparing it to the libraries listed below.
- Rewarded soups official implementation ☆62 Updated 2 years ago
- Code for Contrastive Preference Learning (CPL) ☆177 Updated last year
- ☆108 Updated last year
- This is code for most of the experiments in the paper "Understanding the Effects of RLHF on LLM Generalisation and Diversity" ☆47 Updated last year
- A collection of papers from the ongoing line of work that started with World Models. ☆191 Updated last year
- ☆112 Updated 10 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆184 Updated 6 months ago
- ☆79 Updated last year
- ☆133 Updated last year
- ☆54 Updated last year
- This repo is built to facilitate the training and analysis of autoregressive transformers on maze-solving tasks. ☆32 Updated last month
- ☆65 Updated 9 months ago
- Code for the ICML 2024 paper "Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment" ☆78 Updated 6 months ago
- Function Vectors in Large Language Models (ICLR 2024) ☆187 Updated 8 months ago
- Reinforcement Learning via Regressing Relative Rewards ☆38 Updated last year
- Benchmarking Agentic LLM and VLM Reasoning On Games ☆217 Updated 2 weeks ago
- Code for "Reasoning to Learn from Latent Thoughts" ☆123 Updated 8 months ago
- A library for efficient patching and automatic circuit discovery. ☆82 Updated 4 months ago
- ☆185 Updated last year
- This repo contains code for our NeurIPS 2023 spotlight paper: Evaluating and Inducing Personality in Pre-trained Language Models ☆56 Updated 2 years ago
- ☆76 Updated last year
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆26 Updated 2 months ago
- ☆35 Updated 9 months ago
- ☆133 Updated last year
- Official PyTorch implementation of "Interpreting the Second-Order Effects of Neurons in CLIP" ☆42 Updated last year
- Bootstrapping ARC ☆153 Updated last year
- Research Code for "ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL" ☆198 Updated 8 months ago
- ☆52 Updated 8 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆125 Updated last year
- A brief and partial summary of RLHF algorithms. ☆139 Updated 9 months ago