kvfrans / lmpo
☆136 · Dec 9, 2025 · Updated 2 months ago
Alternatives and similar repositories for lmpo
Users who are interested in lmpo are comparing it to the libraries listed below.
- Minimal Transformer base in JAX. A single backbone for language modelling, diffusion, classification, etc... ☆14 · May 28, 2025 · Updated 8 months ago
- Minimal but scalable implementation of large language models in JAX ☆35 · Nov 28, 2025 · Updated 2 months ago
- Code to accompany the paper "The Information Geometry of Unsupervised Reinforcement Learning" ☆20 · Oct 6, 2021 · Updated 4 years ago
- Answers to the questions at the back of the chapters of Advances in Financial Machine Learning. ☆23 · Apr 11, 2020 · Updated 5 years ago
- ☆13 · Jul 2, 2025 · Updated 7 months ago
- Linking of legal documents to other legal documents. ☆14 · Jun 2, 2022 · Updated 3 years ago
- [ACL 2025] Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models ☆34 · Nov 4, 2025 · Updated 3 months ago
- ☆21 · May 20, 2025 · Updated 8 months ago
- Train a SmolLM-style llm on fineweb-edu in JAX/Flax with an assortment of optimizers. ☆18 · Jul 24, 2025 · Updated 6 months ago
- ☆22 · May 5, 2025 · Updated 9 months ago
- The official implementation of "Horizon Reduction Makes RL Scalable" ☆181 · Aug 2, 2025 · Updated 6 months ago
- Digital Trigger Unit for V2&V3 Gearbox ☆15 · Jan 11, 2021 · Updated 5 years ago
- Custom triton kernels for training Karpathy's nanoGPT. ☆19 · Oct 21, 2024 · Updated last year
- ☆16 · Oct 5, 2021 · Updated 4 years ago
- Crawl & visualize ICLR papers and reviews. ☆18 · Nov 5, 2022 · Updated 3 years ago
- Code to accompany the paper "Mismatched No More: Joint Model-Policy Optimization for Model-Based RL" ☆20 · Oct 6, 2021 · Updated 4 years ago
- [AAMAS'26] xTED: Cross-Domain Adaptation via Diffusion-Based Trajectory Editing ☆23 · Jan 8, 2026 · Updated last month
- Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining" ☆26 · Oct 14, 2025 · Updated 4 months ago
- Minimal yet performant LLM examples in pure JAX ☆240 · Jan 14, 2026 · Updated last month
- This is the official implementation for the paper "Learning to Scaffold: Optimizing Model Explanations for Teaching" ☆19 · May 19, 2022 · Updated 3 years ago
- Explaining neural decisions contrastively to alternative decisions. ☆25 · Mar 18, 2021 · Updated 4 years ago
- ☆84 · May 31, 2025 · Updated 8 months ago
- COOM: Benchmarking Continual Reinforcement Learning on Doom ☆20 · Jan 4, 2026 · Updated last month
- A dataloader, but for JAX ☆20 · May 17, 2024 · Updated last year
- Drop-in environment replacements that make your RL algorithm train faster. ☆21 · Jun 19, 2024 · Updated last year
- ☆124 · Jun 11, 2025 · Updated 8 months ago
- Code for paper "Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?" ☆22 · Oct 13, 2020 · Updated 5 years ago
- Distributed pretraining of large language models (LLMs) on cloud TPU slices, with Jax and Equinox. ☆24 · Sep 29, 2024 · Updated last year
- Storing long contexts in tiny caches with self-study ☆236 · Dec 5, 2025 · Updated 2 months ago
- The official implementation of flow Q-learning (FQL) ☆275 · Jul 21, 2025 · Updated 6 months ago
- Official Implementation of NeurIPS'23 Paper "Cross-Episodic Curriculum for Transformer Agents" ☆31 · Oct 12, 2023 · Updated 2 years ago
- Verifiers for LLM Reinforcement Learning ☆80 · Apr 15, 2025 · Updated 9 months ago
- [ICLR 2024] Closing the Gap between TD Learning and Supervised Learning - A Generalisation Point of View. ☆23 · Apr 19, 2024 · Updated last year
- Async pipelined version of Verl ☆124 · Apr 8, 2025 · Updated 10 months ago
- [ICLR 2023] PyTorch code of Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees ☆24 · Jun 19, 2023 · Updated 2 years ago
- Flow-matching algorithms in JAX ☆115 · Aug 12, 2024 · Updated last year
- A benchmark for offline goal-conditioned RL and offline RL ☆331 · Jan 14, 2026 · Updated 3 weeks ago
- "Deriving Machine Attention from Human Rationales" EMNLP 2018 ☆27 · Feb 15, 2019 · Updated 6 years ago
- The official Python SDK for the Perceptron API ☆60 · Jan 23, 2026 · Updated 3 weeks ago