kvfrans / lmpo
☆122Updated this week
Alternatives and similar repositories for lmpo
Users who are interested in lmpo are comparing it to the libraries listed below
- A simple, performant and scalable JAX-based world modeling codebase☆109Updated 3 weeks ago
- Minimal but scalable implementation of large language models in JAX☆35Updated 2 months ago
- ☆106Updated 3 weeks ago
- Cost aware hyperparameter tuning algorithm☆173Updated last year
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆111Updated last month
- RLP: Reinforcement as a Pretraining Objective☆200Updated last month
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"☆255Updated last week
- XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning (ICLR 2025)☆81Updated 9 months ago
- rl from zero pretrain, can it be done? yes.☆280Updated last month
- Latent Program Network (from the "Searching Latent Program Spaces" paper)☆105Updated last month
- Dion optimizer algorithm☆384Updated this week
- Learn online intrinsic rewards from LLM feedback☆44Updated 11 months ago
- Flax (Jax) implementation of DeepSeek-R1-Distill-Qwen-1.5B with weights ported from Hugging Face.☆25Updated 9 months ago
- ☆200Updated 3 months ago
- NanoGPT-speedrunning for the poor T4 enjoyers☆72Updated 6 months ago
- Supporting code for the blog post on modular manifolds.☆102Updated last month
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025).☆69Updated 10 months ago
- Synchronized Curriculum Learning for RL Agents☆114Updated 2 weeks ago
- Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok …☆27Updated 2 weeks ago
- Minimal yet performant LLM examples in pure JAX