AlignmentResearch / learned-planner
Interpreting Learned Search and Planning: Reverse-engineering recurrent convolutional networks (DRC) that play Sokoban
☆15Updated 5 months ago
Alternatives and similar repositories for learned-planner
Users who are interested in learned-planner are comparing it to the libraries listed below.
- Measuring the situational awareness of language models☆39Updated last year
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models☆65Updated 9 months ago
- Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format☆27Updated 2 years ago
- ☆55Updated last year
- Code repo for MathAgent☆17Updated 2 years ago
- Repository for the paper Stream of Search: Learning to Search in Language☆152Updated 10 months ago
- ☆22Updated 2 years ago
- ☆125Updated 9 months ago
- ⚓️ Interactive playground for the "Thought Anchors: Which LLM Reasoning Steps Matter?" paper.☆17Updated 4 months ago
- Implementation of SelfExtend from the paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning" in PyTorch and Zeta☆13Updated last year
- Minimum Description Length probing for neural network representations☆20Updated 10 months ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆34Updated 8 months ago
- Systematic evaluation framework that automatically rates overthinking behavior in large language models.☆94Updated 7 months ago
- ☆55Updated last year
- LILO: Library Induction with Language Observations☆88Updated last year
- Can Language Models Solve Olympiad Programming?☆123Updated 11 months ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval☆51Updated last year
- A benchmark for evaluating learning agents based on just language feedback☆92Updated 6 months ago
- EMNLP 2024 "Re-reading improves reasoning in large language models". Simply repeating the question to get bidirectional understanding for…☆27Updated last year
- CodeUltraFeedback: aligning large language models to coding preferences (TOSEM 2025)☆73Updated last year
- ☆28Updated 8 months ago
- Memoria is a human-inspired memory architecture for neural networks.☆79Updated last year
- A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks☆36Updated last year
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"☆66Updated last year
- Open-source Human Feedback Library☆11Updated 2 years ago
- ☆44Updated 5 months ago
- Minimal implementation of the Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models paper (arXiv 2401.01335)☆29Updated last year
- An open-source replication of the strawberry method that leverages Monte Carlo Search with PPO and/or DPO☆29Updated last week
- ☆29Updated last year
- ☆13Updated 2 weeks ago