princeton-pli / MeCo

Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"

☆44

Alternatives and similar repositories for MeCo

Users who are interested in MeCo compare it to the libraries listed below.

RUCAIBox / BAMBOO
☆35 Updated last year
wwxu21 / CUT
Source code of "Reasons to Reject? Aligning Language Models with Judgments"
☆58 Updated last year
chujiezheng / LLM-Extrapolation
Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"
☆75 Updated 5 months ago
yegcjs / mixinglaws
☆106 Updated 3 months ago
ernie-research / Tool-Augmented-Reward-Model
[ICLR'24 spotlight] Tool-Augmented Reward Modeling
☆51 Updated 4 months ago
john-hewitt / implicit-ins
Codebase for Instruction Following without Instruction Tuning
☆36 Updated last year
gmftbyGMFTBY / Rep-Dropout
[NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective
☆36 Updated 2 years ago
Leooyii / LCEG
Long Context Extension and Generalization in LLMs
☆62 Updated last year
SynthLabsAI / big-math
A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
☆65 Updated 7 months ago
GAIR-NLP / benbench
Benchmarking Benchmark Leakage in Large Language Models
☆55 Updated last year
DAMO-NLP-SG / CLEX
[ICLR 2024] CLEX: Continuous Length Extrapolation for Large Language Models
☆78 Updated last year
HKUNLP / STRING
[ICLR'25] Data and code for our paper "Why Does the Effective Context Length of LLMs Fall Short?"
☆78 Updated 10 months ago
locuslab / scaling_laws_data_filtering
☆65 Updated last year
NormXU / Consistent-DynamicNTKRoPE
An Experiment on Dynamic NTK Scaling RoPE
☆64 Updated last year
yifanzhang-pro / AutoMathText
[ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (As Huggingface Daily Papers: …
☆87 Updated last month
OpenLMLab / ParallelTokenizer
Use the tokenizer in parallel to achieve superior acceleration
☆20 Updated last year
CodeCreator / WebOrganizer
Organize the Web: Constructing Domains Enhances Pre-Training Data Curation
☆67 Updated 5 months ago
googleinterns / localizing-paragraph-memorization
☆15 Updated last year
kamanphoebe / Look-into-MoEs
[NAACL 2025] A Closer Look into Mixture-of-Experts in Large Language Models
☆55 Updated 8 months ago
yiqingxyq / RepoST
[COLM'25] Code for "RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing"
☆22 Updated 7 months ago
yuzhaouoe / pretraining-data-packing
[ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training
☆22 Updated last year
chtmp223 / suri
Suri: Multi-constraint instruction following for long-form text generation (EMNLP’24)
☆26 Updated 3 weeks ago
QwenLM / online_merging_optimizers
Implementations of the online merging optimizers proposed in "Online Merging Optimizers for Boosting Rewards and Mitigating Tax in Alignment"
☆77 Updated last year
AntNLP / nope_head_scale
☆26 Updated last year
wxjiao / InstructMT
A collection of instruction data and scripts for machine translation.
☆20 Updated 2 years ago
cxcscmu / MATES
Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024]
☆74 Updated 11 months ago
hkust-nlp / llm-compression-intelligence
Official GitHub repo for the paper "Compression Represents Intelligence Linearly" [COLM 2024]
☆142 Updated last year
ChengpengLi1003 / DotaMath
☆30 Updated 9 months ago
Shwai-He / MEO
The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts" (EMNLP 2023)
☆38 Updated last year
GAIR-NLP / MetaCritique
Evaluate the Quality of Critique
☆36 Updated last year