kvfrans / lmpo
☆134 · Updated last month
Alternatives and similar repositories for lmpo
Users interested in lmpo are comparing it to the libraries listed below.
- A simple, performant and scalable JAX-based world modeling codebase. ☆122 · Updated this week
- RLP: Reinforcement as a Pretraining Objective ☆223 · Updated 3 months ago
- Benchmarking Agentic LLM and VLM Reasoning On Games ☆225 · Updated last month
- Minimal but scalable implementation of large language models in JAX ☆35 · Updated last month
- Learn online intrinsic rewards from LLM feedback ☆45 · Updated last year
- XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning (ICLR 2025) ☆81 · Updated 11 months ago
- Synchronized Curriculum Learning for RL Agents ☆118 · Updated 2 months ago
- Cost-aware hyperparameter tuning algorithm ☆177 · Updated last year
- Latent Program Network (from the "Searching Latent Program Spaces" paper) ☆107 · Updated last month
- This repo contains the source code for the paper "Evolution Strategies at Scale: LLM Fine-Tuning Beyond Reinforcement Learning" ☆285 · Updated last month
- OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code (ICLR 2025). ☆72 · Updated last year
- Flax (Jax) implementation of DeepSeek-R1-Distill-Qwen-1.5B with weights ported from Hugging Face. ☆26 · Updated 11 months ago
- ☆116 · Updated last week
- NanoGPT-speedrunning for the poor T4 enjoyers ☆73 · Updated 8 months ago
- 📄 Small Batch Size Training for Language Models ☆79 · Updated 3 months ago
- The Automated LLM Speedrunning Benchmark measures how well LLM agents can reproduce previous innovations and discover new ones in languag… ☆127 · Updated 3 months ago
- Efficient World Models with Context-Aware Tokenization (ICML 2024) ☆115 · Updated last year
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO) ☆140 · Updated 8 months ago
- Minimal yet performant LLM examples in pure JAX ☆230 · Updated last week
- ☆123 · Updated 7 months ago
- Implementation of the new SOTA for model-based RL, from the paper "Improving Transformer World Models for Data-Efficient RL", in PyTorch ☆150 · Updated 8 months ago
- The simplest, fastest repository for training/finetuning medium-sized GPTs. ☆185 · Updated 6 months ago
- A scalable asynchronous reinforcement learning implementation with in-flight weight updates. ☆346 · Updated 3 weeks ago
- Custom Triton kernels for training Karpathy's nanoGPT. ☆19 · Updated last year
- RL from zero pretrain: can it be done? Yes. ☆286 · Updated 3 months ago
- Minimal (400 LOC) implementation of maximum (multi-node, FSDP) GPT training ☆132 · Updated last year
- Minimal Energy-Based Transformer ☆42 · Updated last month
- ☆213 · Updated 2 weeks ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning. ☆124 · Updated 2 months ago
- Intrinsic Motivation from Artificial Intelligence Feedback ☆134 · Updated 2 years ago