pengr / DataMan
Our code for ICLR'25 paper "DataMan: Data Manager for Pre-training Large Language Models".
☆88 · Updated last week
Alternatives and similar repositories for DataMan
Users who are interested in DataMan are comparing it to the repositories listed below.
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight) ☆160 · Updated 5 months ago
- [ICML 2024] Selecting High-Quality Data for Training Language Models ☆185 · Updated last year
- ☆206 · Updated 5 months ago
- Model merging is a highly efficient approach for long-to-short reasoning. ☆77 · Updated 2 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆126 · Updated 9 months ago
- Pre-trained, Scalable, High-performance Reward Models via Policy Discriminative Learning. ☆146 · Updated last month
- xVerify: Efficient Answer Verifier for Reasoning Model Evaluations ☆127 · Updated 3 months ago
- Extrapolating RLVR to General Domains without Verifiers ☆136 · Updated 2 weeks ago
- ☆102 · Updated 2 months ago
- [ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues ☆105 · Updated last year
- A Comprehensive Survey on Long Context Language Modeling ☆170 · Updated last month
- ☆149 · Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆168 · Updated last month
- TokenSkip: Controllable Chain-of-Thought Compression in LLMs ☆171 · Updated last month
- ☆159 · Updated 3 months ago
- A unified suite for generating elite reasoning problems and training high-performance LLMs, including pioneering attention-free architect… ☆64 · Updated 2 months ago
- "What, How, Where, and How Well? A Survey on Test-Time Scaling in Large Language Models" repository ☆57 · Updated last week
- ☆268 · Updated 2 months ago
- [ACL 2024] Long-Context Language Modeling with Parallel Encodings ☆157 · Updated last year
- A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior. ☆245 · Updated 4 months ago
- CPPO: Accelerating the Training of Group Relative Policy Optimization-Based Reasoning Models ☆148 · Updated 2 months ago
- ☆105 · Updated 2 months ago
- [SIGIR'24] The official implementation code of MOELoRA. ☆175 · Updated last year
- Fantastic Data Engineering for Large Language Models ☆89 · Updated 7 months ago
- ☆183 · Updated last year
- ☆255 · Updated last month
- a-m-team's exploration in large language modeling ☆182 · Updated 2 months ago
- ☆104 · Updated last month
- Code implementation of synthetic continued pretraining ☆123 · Updated 7 months ago
- [ACL-25] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLMs. ☆63 · Updated 9 months ago