TheDuckAI / prm
☆12 Updated 2 months ago
Alternatives and similar repositories for prm:
Users who are interested in prm are comparing it to the libraries listed below.
- ☆21 Updated 6 months ago
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆76 Updated 2 years ago
- TPU pod commander is a package for managing and launching jobs on Google Cloud TPU pods. ☆20 Updated 9 months ago
- Interpreting how transformers simulate agents performing RL tasks ☆78 Updated last year
- [ICLR 2025] "Training LMs on Synthetic Edit Sequences Improves Code Synthesis" (Piterbarg, Pinto, Fergus) ☆17 Updated last month
- Scaling scaling laws with board games. ☆48 Updated last year
- Machine Learning eXperiment Utilities ☆46 Updated 9 months ago
- Minimal but scalable implementation of large language models in JAX ☆34 Updated 4 months ago
- A toolkit for scaling law research ⚖ ☆49 Updated 2 months ago
- Redwood Research's transformer interpretability tools ☆14 Updated 2 years ago
- ☆84 Updated 8 months ago
- ☆16 Updated last year
- PyTorch Package For Quasimetric Learning ☆41 Updated 4 months ago
- ☆34 Updated 2 years ago
- Language models scale reliably with over-training and on downstream tasks ☆96 Updated 11 months ago
- ☆81 Updated 8 months ago
- Learn online intrinsic rewards from LLM feedback ☆35 Updated 3 months ago
- ☆31 Updated 3 months ago
- Transformer with Mu-Parameterization, implemented in JAX/Flax. Supports FSDP on TPU pods. ☆30 Updated 3 months ago
- Mechanistic Interpretability for Transformer Models ☆50 Updated 2 years ago
- A library to create and manage configuration files, especially for machine learning projects. ☆77 Updated 3 years ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆26 Updated 6 months ago
- General Modules for JAX ☆64 Updated last month
- ☆30 Updated 4 months ago
- Repository for the code of the "PPL-MCTS: Constrained Textual Generation Through Discriminator-Guided Decoding" paper, NAACL'22 ☆64 Updated 2 years ago
- Official code from the paper "Offline RL for Natural Language Generation with Implicit Language Q Learning" ☆205 Updated last year
- Sparse and discrete interpretability tool for neural networks ☆59 Updated last year
- ☆25 Updated 11 months ago
- Simple and efficient PyTorch-native transformer training and inference (batched) ☆71 Updated 11 months ago
- ☆14 Updated 4 months ago