xiatingyu / SFT-DataSelection-at-scaleLinks

☆30

Alternatives and similar repositories for SFT-DataSelection-at-scale

Users that are interested in SFT-DataSelection-at-scale are comparing it to the libraries listed below

Sorting:

sail-sg / sdft
[ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".
☆132Updated 11 months ago
hahahawu / Long-to-Short-via-Model-Merging
Model merging is a highly efficient approach for long-to-short reasoning.
☆86Updated 4 months ago
tianyi-lab / Superfiltering
[ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
☆178Updated 3 months ago
RUCKBReasoning / CoT-based-Synthesizer
Official code implementation for the ACL 2025 paper: 'CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis'
☆30Updated 4 months ago
SparkJiao / dpo-trajectory-reasoning
[EMNLP 2024] Source code for the paper "Learning Planning-based Reasoning with Trajectory Collection and Process Rewards Synthesizing".
☆82Updated 9 months ago
yyDing1 / ScaleQuest
[ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs.
☆68Updated 11 months ago
ZNLP / Language-Imbalance-Driven-Rewarding
[ICLR 2025] Language Imbalance Driven Rewarding for Multilingual Self-improving
☆22Updated last month
princeton-nlp / CEPE
[ACL 2024] Long-Context Language Modeling with Parallel Encodings
☆163Updated last year
sail-sg / regmix
[ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)
☆171Updated 7 months ago
ZitongYang / Synthetic_Continued_Pretraining
Code implementation of synthetic continued pretraining
☆133Updated 9 months ago
songmzhang / DSKD
Repo for the EMNLP'24 Paper "Dual-Space Knowledge Distillation for Large Language Models". A general white-box KD framework for both same…
☆60Updated last month
ZHZisZZ / weak-to-strong-search
[NeurIPS'24] Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
☆62Updated 10 months ago
princeton-nlp / QuRating
[ICML 2024] Selecting High-Quality Data for Training Language Models
☆189Updated last year
Zanette-Labs / efficient-reasoning
☆67Updated 6 months ago
hemingkx / TokenSkip
[EMNLP 2025] TokenSkip: Controllable Chain-of-Thought Compression in LLMs
☆182Updated 3 months ago
GCYZSL / MoLA
☆159Updated last year
junkangwu / beta-DPO
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
☆49Updated 11 months ago
TianHongZXY / RLVR-Decomposed
[NeurIPS 2025] Implementation for the paper "The Surprising Effectiveness of Negative Reinforcement in LLM Reasoning"
☆112Updated last month
QwenLM / ProcessBench
Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"
☆172Updated 4 months ago
bethgelab / sober-reasoning
A Sober Look at Language Model Reasoning
☆84Updated this week
THU-KEG / AdaptThink
☆155Updated 4 months ago
OpenMOSS / Say-I-Dont-Know
[ICML'2024] Can AI Assistants Know What They Don't Know?
☆83Updated last year
KbsdJames / Omni-MATH
The official repository of the Omni-MATH benchmark.
☆88Updated 9 months ago
inclusionAI / PromptCoT
A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect…
☆114Updated 2 weeks ago
GeniusHTX / TALE
☆133Updated last month
cxcscmu / MATES
Official repository for MATES: Model-Aware Data Selection for Efficient Pretraining with Data Influence Models [NeurIPS 2024]
☆74Updated 10 months ago
yegcjs / mixinglaws
☆105Updated 2 months ago
THU-KEG / RM-Bench
[ICLR 25 Oral] RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style
☆63Updated 2 months ago
nick7nlp / Counting-Stars
Counting-Stars (★)
☆83Updated 4 months ago
October2001 / ProLong
[ACL 2024 (Oral)] A Prospector of Long-Dependency Data for Large Language Models
☆56Updated last year