Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
☆466 · Apr 18, 2024 · Updated 2 years ago
Alternatives and similar repositories for rho
Users that are interested in rho are comparing it to the libraries listed below.
- A simple toolkit for benchmarking LLMs on mathematical reasoning tasks. 🧮✨ ☆276 · Apr 26, 2024 · Updated 2 years ago
- [ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (https://huggingface.co/papers… ☆91 · Nov 23, 2025 · Updated 5 months ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆202 · Dec 8, 2025 · Updated 5 months ago
- Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024] ☆147 · Sep 20, 2024 · Updated last year
- ☆323 · Sep 18, 2024 · Updated last year
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024] ☆596 · Dec 9, 2024 · Updated last year
- Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024] ☆79 · Nov 14, 2024 · Updated last year
- ☆30 · Dec 27, 2024 · Updated last year
- ☆569 · Nov 20, 2024 · Updated last year
- The code and data for the paper JiuZhang3.0 ☆49 · May 26, 2024 · Updated last year
- ☆64 · Apr 9, 2024 · Updated 2 years ago
- [NeurIPS'24] Official code for *🎯DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving* ☆122 · Dec 10, 2024 · Updated last year
- ToRA is a series of Tool-integrated Reasoning LLM Agents designed to solve challenging mathematical reasoning problems by interacting wit… ☆1,117 · Feb 22, 2024 · Updated 2 years ago
- Official Repo for Open-Reasoner-Zero ☆2,091 · Jun 2, 2025 · Updated 11 months ago
- Code for Quiet-STaR ☆741 · Aug 21, 2024 · Updated last year
- Code for the arXiv preprint "The Unreasonable Effectiveness of Easy Training Data" ☆48 · Jan 17, 2024 · Updated 2 years ago
- [ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning ☆528 · Oct 20, 2024 · Updated last year
- ☆110 · Jul 15, 2025 · Updated 10 months ago
- [ICML'24] Data and code for our paper "Training-Free Long-Context Scaling of Large Language Models" ☆450 · Oct 16, 2024 · Updated last year
- Simple RL training for reasoning ☆3,859 · Dec 23, 2025 · Updated 4 months ago
- [ACL 2024] Progressive LLaMA with Block Expansion. ☆515 · May 20, 2024 · Updated 2 years ago
- Scaling Data-Constrained Language Models ☆343 · Jun 28, 2025 · Updated 10 months ago
- AllenAI's post-training codebase ☆3,726 · Updated this week
- Implementation of paper Data Engineering for Scaling Language Models to 128K Context ☆496 · Mar 19, 2024 · Updated 2 years ago
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆149 · Oct 27, 2024 · Updated last year
- A Survey on Data Selection for Language Models ☆259 · Apr 29, 2025 · Updated last year
- Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs" ☆396 · Jan 19, 2025 · Updated last year
- Data and tools for generating and inspecting OLMo pre-training data. ☆1,497 · Nov 5, 2025 · Updated 6 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆189 · Jun 25, 2025 · Updated 10 months ago
- [ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning ☆643 · Mar 4, 2024 · Updated 2 years ago
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples" ☆321 · Dec 20, 2023 · Updated 2 years ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆3,058 · May 6, 2026 · Updated 2 weeks ago
- ☆23 · Oct 14, 2024 · Updated last year
- [NAACL'24] Self-data filtering of LLM instruction-tuning data using a novel perplexity-based difficulty score, without using any other mo… ☆416 · Jun 25, 2025 · Updated 10 months ago
- ☆204 · Jul 13, 2024 · Updated last year
- An implementation of online data mixing for the Pile dataset, based on the GPT-NeoX library. ☆14 · Jan 9, 2024 · Updated 2 years ago
- Code and data for "MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning" [ICLR 2024] ☆386 · Aug 25, 2024 · Updated last year
- Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware. ☆759 · Sep 27, 2024 · Updated last year
- An unofficial implementation of "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models" ☆36 · Jun 7, 2024 · Updated last year