kvfrans / lmpo
☆131Updated 3 weeks ago
Alternatives and similar repositories for lmpo
Users who are interested in lmpo are comparing it to the libraries listed below:
- A simple, performant and scalable JAX-based world modeling codebase.☆119Updated 2 months ago
- Minimal but scalable implementation of large language models in JAX☆35Updated last month
- RLP: Reinforcement as a Pretraining Objective☆218Updated 2 months ago
- XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning (ICLR 2025)☆81Updated 10 months ago
- ☆116Updated 3 weeks ago
- Flax (Jax) implementation of DeepSeek-R1-Distill-Qwen-1.5B with weights ported from Hugging Face.☆26Updated 10 months ago
- Synchronized Curriculum Learning for RL Agents☆116Updated last month
- Benchmarking Agentic LLM and VLM Reasoning On Games☆219Updated 3 weeks ago
- ☆122Updated 6 months ago
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025).☆72Updated last year
- 📄Small Batch Size Training for Language Models☆69Updated 2 months ago
- Cost aware hyperparameter tuning algorithm☆176Updated last year
- NanoGPT-speedrunning for the poor T4 enjoyers☆73Updated 8 months ago
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models☆65Updated 10 months ago
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning"☆279Updated last month
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag…☆122Updated 2 months ago
- Implementation of the new SOTA for model based RL, from the paper "Improving Transformer World Models for Data-Efficient RL", in Pytorch☆148Updated 7 months ago
- Supporting code for the blog post on modular manifolds.☆108Updated 3 months ago
- rl from zero pretrain, can it be done? yes.☆282Updated 3 months ago
- Learn online intrinsic rewards from LLM feedback☆45Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper)☆107Updated last month
- Efficient World Models with Context-Aware Tokenization. ICML 2024☆115Updated last year
- minimal Energy-based transformer☆42Updated 3 weeks ago
- ☆211Updated 4 months ago
- Efficient baselines for autocurricula in JAX.☆205Updated last year
- ☆287Updated last year
- Minimal yet performant LLM examples in pure JAX☆223Updated 3 weeks ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs.☆181Updated 6 months ago
- Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training☆132Updated last year
- ☆82Updated last year