kvfrans / lmpo
☆83 Updated last week
Alternatives and similar repositories for lmpo
Users that are interested in lmpo are comparing it to the libraries listed below
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models ☆60 Updated 4 months ago
- Minimal but scalable implementation of large language models in JAX ☆35 Updated last week
- Learn online intrinsic rewards from LLM feedback ☆41 Updated 6 months ago
- Code and Configs for Asynchronous RLHF: Faster and More Efficient RL for Language Models ☆59 Updated 2 months ago
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆91 Updated 4 months ago
- ☆90 Updated this week
- Benchmarking Agentic LLM and VLM Reasoning On Games ☆166 Updated 2 months ago
- Simple repository for training small reasoning models ☆33 Updated 5 months ago
- Cost aware hyperparameter tuning algorithm ☆162 Updated last year
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆147 Updated 2 weeks ago
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025). ☆60 Updated 6 months ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆103 Updated this week
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆87 Updated 2 weeks ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training ☆129 Updated last year
- Library for text-to-text regression, applicable to any input string representation and allows pretraining and fine-tuning over multiple r… ☆86 Updated this week
- XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning (ICLR 2025) ☆75 Updated 5 months ago
- ☆98 Updated last year
- Efficient World Models with Context-Aware Tokenization. ICML 2024 ☆105 Updated 9 months ago
- Implementation of the new SOTA for model based RL, from the paper "Improving Transformer World Models for Data-Efficient RL", in Pytorch ☆128 Updated 2 months ago
- Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL ☆56 Updated 9 months ago
- ☆127 Updated last year
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆129 Updated this week
- ☆45 Updated last year
- Official repository of the spotlight ICML 2025 paper, PokeChamp: an Expert-level Minimax Language Agent. ☆69 Updated this week
- The official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508) ☆35 Updated last week
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning ☆103 Updated last week
- Official Repo for InSTA: Towards Internet-Scale Training For Agents ☆48 Updated last week
- Intrinsic Motivation from Artificial Intelligence Feedback ☆129 Updated last year
- Archon provides a modular framework for combining different inference-time techniques and LMs with just a JSON config file. ☆173 Updated 4 months ago
- This code accompanies the paper "Leveraging Skills from Unlabeled Prior Data for Efficient Online Exploration." ☆28 Updated this week