yegcjs/mixinglaws

☆113

Alternatives and similar repositories for mixinglaws

Users who are interested in mixinglaws are comparing it to the repositories listed below.

alon-albalak / online-data-mixing
An implementation of online data mixing for the Pile dataset, based on the GPT-NeoX library.
☆14 · Jan 9, 2024 · Updated 2 years ago
feiyang-k / AutoScale
Official Code Repository for [AutoScale📈: Scale-Aware Data Mixing for Pre-Training LLMs] Published as a conference paper at **COLM 2025*…
☆14 · Aug 8, 2025 · Updated 11 months ago
locuslab / scaling_laws_data_filtering
☆64 · Apr 9, 2024 · Updated 2 years ago
sail-sg / regmix
[ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)
☆195 · Feb 17, 2025 · Updated last year
choosewhatulike / cluster-clip
Multi-GPU supported kmeans clustering for cluster-clip
☆15 · Jun 3, 2024 · Updated 2 years ago
cofe-ai / Mu-scaling
Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales
☆32 · Jul 17, 2023 · Updated 3 years ago
yikangshen / megablocks
☆20 · May 30, 2024 · Updated 2 years ago
Lyun0912-wu / LongAttn
LongAttn: Selecting Long-context Training Data via Token-level Attention
☆15 · Jul 16, 2025 · Updated last year
thu-wyz / inference_scaling
☆80 · Nov 19, 2024 · Updated last year
sangmichaelxie / doremi
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
☆357 · Dec 26, 2023 · Updated 2 years ago
yiqingxyq / RepoST
Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing"
☆24 · Mar 18, 2025 · Updated last year
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆36 · Sep 24, 2024 · Updated last year
sanyalsunny111 / LLM-Inheritune
[TMLR 2025] When Attention Collapses: How Degenerate Layers in LLMs Enable Smaller, Stronger Models
☆126 · Mar 6, 2026 · Updated 4 months ago
yifanzhang-pro / AutoMathText
[ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (https://huggingface.co/papers…
☆92 · Nov 23, 2025 · Updated 8 months ago
OpenLMLab / scaling-rope
Code for Scaling Laws of RoPE-based Extrapolation
☆73 · Oct 16, 2023 · Updated 2 years ago
princeton-nlp / LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
☆643 · Mar 4, 2024 · Updated 2 years ago
microsoft / rho
Repo for Rho-1: Token-level Data Selection & Selective Pretraining of LLMs.
☆471 · Apr 18, 2024 · Updated 2 years ago
princeton-nlp / QuRating
[ICML 2024] Selecting High-Quality Data for Training Language Models
☆204 · Dec 8, 2025 · Updated 7 months ago
MadryLab / DsDm
☆53 · Jan 24, 2024 · Updated 2 years ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆345 · Jun 28, 2025 · Updated last year
HazyResearch / skill-it
Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models
☆48 · Oct 31, 2023 · Updated 2 years ago
IBM / ColPret
Efficient Scaling laws and collaborative pretraining.
☆23 · Jul 19, 2026 · Updated last week
multimodal-art-projection / MAP-NEO
☆986 · Feb 7, 2025 · Updated last year
OpenMOSS / Thus-Spake-Long-Context-LLM
A survey of long-context LLMs from four perspectives: architecture, infrastructure, training, and evaluation
☆62 · Mar 31, 2025 · Updated last year
lm-sys / llm-decontaminator
Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"
☆325 · Dec 20, 2023 · Updated 2 years ago
hkust-nlp / llm-compression-intelligence
Official github repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
☆150 · Sep 20, 2024 · Updated last year
yhcc / utcie
This is the code repo for the paper <UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction>
☆15 · Aug 10, 2023 · Updated 2 years ago
Olivia-fsm / DoGE
Codebase for ICML submission "DOGE: Domain Reweighting with Generalization Estimation"
☆21 · Feb 29, 2024 · Updated 2 years ago
RESIN-KAIROS / RESIN-11
☆20 · Jul 18, 2022 · Updated 4 years ago
NineAbyss / S2R
This is the official implementation of the paper "S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning"
☆77 · Apr 22, 2025 · Updated last year
allenai / olmix
☆41 · May 26, 2026 · Updated 2 months ago
RUCAIBox / BAMBOO
☆36 · Mar 25, 2024 · Updated 2 years ago
Jiachen-T-Wang / GREATS
☆20 · Jun 27, 2026 · Updated last month
hkust-nlp / deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
☆600 · Dec 9, 2024 · Updated last year
RUCAIBox / Slow_Thinking_with_LLMs
A series of technical reports on Slow Thinking with LLMs
☆767 · Aug 13, 2025 · Updated 11 months ago
OpenLMLab / ParallelTokenizer
Use the tokenizer in parallel to achieve superior acceleration
☆20 · Mar 21, 2024 · Updated 2 years ago
TencentARC / LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
☆513 · May 20, 2024 · Updated 2 years ago
sail-sg / SkyLadder
The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling
☆43 · Dec 29, 2025 · Updated 7 months ago
limenlp / safer-instruct
This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"
☆17 · Feb 22, 2024 · Updated 2 years ago