TheDuckAI / prm
☆12 Updated this week
Related projects
Alternatives and complementary repositories for prm
- ☆18 Updated last month
- ☆25 Updated 3 weeks ago
- Scaling scaling laws with board games. ☆43 Updated last year
- Unofficial re-implementation of "Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets" ☆63 Updated 2 years ago
- Minimal but scalable implementation of large language models in JAX ☆26 Updated 2 weeks ago
- Unofficial but Efficient Implementation of "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" in JAX ☆79 Updated 9 months ago
- ☆15 Updated 9 months ago
- A library to create and manage configuration files, especially for machine learning projects. ☆77 Updated 2 years ago
- Language models scale reliably with over-training and on downstream tasks ☆94 Updated 7 months ago
- Advantage Leftover Lunch Reinforcement Learning (A-LoL RL): Improving Language Models with Advantage-based Offline Policy Gradients ☆26 Updated 2 months ago
- Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment" ☆80 Updated last week
- ☆73 Updated 4 months ago
- Simple and efficient PyTorch-native transformer training and inference (batched) ☆61 Updated 7 months ago
- ☆45 Updated 9 months ago
- If it quacks like a tensor... ☆52 Updated last week
- Sparse and discrete interpretability tool for neural networks ☆55 Updated 9 months ago
- ☆24 Updated 7 months ago
- ☆50 Updated 6 months ago
- ☆77 Updated 3 months ago
- ☆29 Updated 2 months ago
- Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision ☆97 Updated 2 months ago
- Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from e… ☆25 Updated 5 months ago
- ☆34 Updated last year
- [ICML 2024] Official code release accompanying the paper "diff History for Neural Language Agents" (Piterbarg, Pinto, Fergus) ☆18 Updated 3 months ago
- NEVIS'22: Benchmarking the next generation of never-ending learners ☆98 Updated last year
- Fast training of unitary deep network layers from low-rank updates ☆28 Updated last year
- ☆20 Updated 11 months ago
- ☆53 Updated 3 weeks ago
- PyTorch Package For Quasimetric Learning ☆42 Updated 3 weeks ago