Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
☆465 · Apr 18, 2024 · Updated last year
Alternatives and similar repositories for rho
Users that are interested in rho are comparing it to the libraries listed below.
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆274 · Apr 26, 2024 · Updated last year
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (https://huggingface.co/papers… ☆90 · Nov 23, 2025 · Updated 4 months ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆201 · Dec 8, 2025 · Updated 4 months ago
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆146 · Sep 20, 2024 · Updated last year
- ☆323 · Sep 18, 2024 · Updated last year
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024] ☆592 · Dec 9, 2024 · Updated last year
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024] ☆79 · Nov 14, 2024 · Updated last year
- ☆30 · Dec 27, 2024 · Updated last year
- ☆567 · Nov 20, 2024 · Updated last year
- The code and data for the paper JiuZhang3.0 ☆49 · May 26, 2024 · Updated last year
- ☆64 · Apr 9, 2024 · Updated 2 years ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆121 · Dec 10, 2024 · Updated last year
- ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting wit… ☆1,113 · Feb 22, 2024 · Updated 2 years ago
- Official Repo for Open-Reasoner-Zero ☆2,089 · Jun 2, 2025 · Updated 10 months ago
- Code for Quiet-STaR ☆741 · Aug 21, 2024 · Updated last year
- [ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning ☆521 · Oct 20, 2024 · Updated last year
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆48 · Jan 17, 2024 · Updated 2 years ago
- [ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models" ☆450 · Oct 16, 2024 · Updated last year
- ☆109 · Jul 15, 2025 · Updated 8 months ago
- Simple RL training for reasoning ☆3,846 · Dec 23, 2025 · Updated 3 months ago
- [ACL 2024] Progressive LLaMA with Block Expansion. ☆513 · May 20, 2024 · Updated last year
- Scaling Data-Constrained Language Models ☆343 · Jun 28, 2025 · Updated 9 months ago
- AllenAI's post-training codebase ☆3,677 · Updated this week
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context ☆497 · Mar 19, 2024 · Updated 2 years ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆149 · Oct 27, 2024 · Updated last year
- Data and tools for generating and inspecting OLMo pre-training data. ☆1,472 · Nov 5, 2025 · Updated 5 months ago
- A Survey on Data Selection for Language Models ☆255 · Apr 29, 2025 · Updated 11 months ago
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" ☆392 · Jan 19, 2025 · Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆188 · Jun 25, 2025 · Updated 9 months ago
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆643 · Mar 4, 2024 · Updated 2 years ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆319 · Dec 20, 2023 · Updated 2 years ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,978 · Apr 2, 2026 · Updated last week
- ☆198 · Jul 13, 2024 · Updated last year
- ☆24 · Oct 14, 2024 · Updated last year
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo… ☆413 · Jun 25, 2025 · Updated 9 months ago
- Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" [ICLR 2024] ☆384 · Aug 25, 2024 · Updated last year
- An implementation of online data mixing for the Pile dataset, based on the GPT-NeoX library. ☆14 · Jan 9, 2024 · Updated 2 years ago
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. ☆755 · Sep 27, 2024 · Updated last year
- An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆36 · Jun 7, 2024 · Updated last year