AlignmentResearch / learned-planner
Interpreting Learned Search and Planning: Reverse-engineering recurrent convolutional networks (DRC) that play Sokoban
☆17 Updated 7 months ago
Alternatives and similar repositories for learned-planner
Users who are interested in learned-planner are comparing it to the libraries listed below
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆66 Updated 11 months ago
- A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks ☆36 Updated last year
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format ☆27 Updated 2 years ago
- Measuring the situational awareness of language models ☆40 Updated last year
- A framework for pitting LLMs against each other in an evolving library of games ⚔ ☆35 Updated 9 months ago
- Repository for the paper Stream of Search: Learning to Search in Language ☆153 Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆61 Updated last year
- ☆56 Updated last year
- Intrinsic Motivation from Artificial Intelligence Feedback ☆134 Updated 2 years ago
- ☆123 Updated 11 months ago
- Minimum Description Length probing for neural network representations ☆20 Updated last year
- [ICML 2024] Official code release accompanying the paper "diff History for Neural Language Agents" (Piterbarg, Pinto, Fergus) ☆20 Updated last year
- Code and data for the paper "Why think step by step? Reasoning emerges from the locality of experience" ☆62 Updated 10 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models ☆40 Updated last year
- Learning from preferences is a common paradigm for fine-tuning language models. Yet, many algorithmic design decisions come into play. Ou… ☆32 Updated last year
- Sparse Autoencoder Training Library ☆56 Updated 9 months ago
- ⚓️ Interactive playground for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper. ☆17 Updated last month
- Clean RL implementation using MLX ☆34 Updated last year
- A Gymnasium-based Environment of the Abstraction and Reasoning Corpus (ARC) ☆69 Updated last year
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025). ☆73 Updated last year
- Fluid Language Model Benchmarking ☆26 Updated 4 months ago
- Automated Capability Discovery via Foundation Model Self-Exploration ☆65 Updated 11 months ago
- Simple GRPO scripts and configurations. ☆59 Updated last year
- [ICLR 2025] Code for the paper "Implicit Search via Discrete Diffusion: A Study on Chess" ☆36 Updated 11 months ago
- ☆39 Updated 9 months ago
- Repo to reproduce the First-Explore paper results