yuleiqin / fantastic-data-engineering
Fantastic Data Engineering for Large Language Models
☆93 Updated last year
Alternatives and similar repositories for fantastic-data-engineering
Users who are interested in fantastic-data-engineering are comparing it to the repositories listed below.
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆184 Updated 6 months ago
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight) ☆181 Updated 11 months ago
- Collection of papers for scalable automated alignment. ☆93 Updated last year
- ☆87 Updated 2 years ago
- Benchmarking Complex Instruction-Following with Multiple Constraints Composition (NeurIPS 2024 Datasets and Benchmarks Track) ☆99 Updated 10 months ago
- [ACL 2025] We introduce ScaleQuest, a scalable, novel and cost-effective data synthesis method to unleash the reasoning capability of LLM…