kvfrans / lmpo
☆136Dec 9, 2025Updated 2 months ago
Alternatives and similar repositories for lmpo
Users that are interested in lmpo are comparing it to the libraries listed below
- Minimal Transformer base in JAX. A single backbone for language modelling, diffusion, classification, etc...☆14May 28, 2025Updated 8 months ago
- Minimal but scalable implementation of large language models in JAX☆35Nov 28, 2025Updated 2 months ago
- Code to accompany the paper "The Information Geometry of Unsupervised Reinforcement Learning"☆20Oct 6, 2021Updated 4 years ago
- A tiny reinforcement learning codebase for continuous control, built on top of JAX.☆15Mar 28, 2023Updated 2 years ago
- ☆13Jul 2, 2025Updated 7 months ago
- Offline RL algorithms implemented in Stable Baselines3 (PyTorch)☆10Dec 7, 2021Updated 4 years ago
- Linking of legal documents to other legal documents.☆14Jun 2, 2022Updated 3 years ago
- Code for Tackling Long-Horizon Tasks with Model-based Offline Reinforcement Learning☆15Feb 6, 2025Updated last year
- Continual Memorization of Factoids in Large Language Models☆12Nov 20, 2024Updated last year
- Pointax: PointMaze Environment for JAX☆26Oct 22, 2025Updated 3 months ago
- ☆42Jan 24, 2026Updated 3 weeks ago
- ☆21May 20, 2025Updated 8 months ago
- [ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models☆34Nov 4, 2025Updated 3 months ago
- ☆22May 5, 2025Updated 9 months ago
- Train a SmolLM-style LLM on fineweb-edu in JAX/Flax with an assortment of optimizers.☆18Jul 24, 2025Updated 6 months ago
- The official implementation of "Horizon Reduction Makes RL Scalable"☆181Aug 2, 2025Updated 6 months ago
- Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"☆593Oct 7, 2025Updated 4 months ago
- MR.Q is a general-purpose model-free reinforcement learning algorithm.☆140Jun 23, 2025Updated 7 months ago
- Code for Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals.☆18Apr 25, 2021Updated 4 years ago
- Crawl & visualize ICLR papers and reviews.☆18Nov 5, 2022Updated 3 years ago
- ☆16Oct 5, 2021Updated 4 years ago
- Custom triton kernels for training Karpathy's nanoGPT.☆19Oct 21, 2024Updated last year
- Code to accompany the paper "Mismatched No More: Joint Model-Policy Optimization for Model-Based RL"☆20Oct 6, 2021Updated 4 years ago
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"☆26Oct 14, 2025Updated 4 months ago
- Minimal yet performant LLM examples in pure JAX☆240Jan 14, 2026Updated last month
- This is the official implementation for the paper "Learning to Scaffold: Optimizing Model Explanations for Teaching"☆19May 19, 2022Updated 3 years ago
- Explaining neural decisions contrastively to alternative decisions.☆25Mar 18, 2021Updated 4 years ago
- ☆84May 31, 2025Updated 8 months ago
- AdaSplash: Adaptive Sparse Flash Attention (aka Flash Entmax Attention)☆32Sep 30, 2025Updated 4 months ago
- ☆124Jun 11, 2025Updated 8 months ago
- Drop-in environment replacements that make your RL algorithm train faster.☆21Jun 19, 2024Updated last year
- A dataloader, but for JAX☆20May 17, 2024Updated last year
- Scalable Opponent Shaping Experiments in JAX☆25Apr 13, 2024Updated last year
- COOM: Benchmarking Continual Reinforcement Learning on Doom☆20Jan 4, 2026Updated last month
- Code for paper "Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?"☆22Oct 13, 2020Updated 5 years ago
- Distributed pretraining of large language models (LLMs) on cloud TPU slices, with Jax and Equinox.☆24Sep 29, 2024Updated last year
- Storing long contexts in tiny caches with self-study☆237Dec 5, 2025Updated 2 months ago
- Foundation Policies with Hilbert Representations (ICML 2024)☆105Sep 29, 2025Updated 4 months ago
- ReCross: Unsupervised Cross-Task Generalization via Retrieval Augmentation☆24May 1, 2022Updated 3 years ago