MLGroup-JLU / LLM-data-aug-survey
The official GitHub page for the survey paper "A Survey on Data Augmentation in Large Model Era"
☆128 Updated last year
Alternatives and similar repositories for LLM-data-aug-survey
Users who are interested in LLM-data-aug-survey are comparing it to the repositories listed below
- ☆125 Updated last year
- A Toolkit for Table-based Question Answering ☆113 Updated last year
- This repo aims to record resources on role-playing abilities in LLMs, including datasets, papers, applications, etc. ☆129 Updated 11 months ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"… ☆80 Updated last year
- ☆100 Updated last year
- Awesome-Large-Search-Models is a collection of papers and resources (Methods, Datasets and other resources) about awesome agentic search … ☆118 Updated last week
- A curated list of Awesome-LLM-Ensemble papers for the survey "Harnessing Multiple Large Language Models: A Survey on LLM Ensemble" ☆111 Updated last month
- Fantastic Data Engineering for Large Language Models ☆90 Updated 7 months ago
- [ACM Computing Surveys 2025] This repository collects awesome surveys, resources, and papers for Lifelong Learning with Large Language Model… ☆143 Updated 2 months ago
- ☆54 Updated 11 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆126 Updated 9 months ago
- Collect every awesome work about r1! ☆412 Updated 3 months ago
- ☆71 Updated 7 months ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training … ☆55 Updated 3 months ago
- ☆145 Updated last year
- [SIGIR'24] The official implementation code of MOELoRA. ☆174 Updated last year
- Self-Evolved Diverse Data Sampling for Efficient Instruction Tuning ☆83 Updated last year
- Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey` ☆182 Updated 2 weeks ago
- Our code for ICLR'25 paper "DataMan: Data Manager for Pre-training Large Language Models". ☆100 Updated last week
- A live reading list for LLM data synthesis (Updated to July, 2025). ☆360 Updated last week
- AI Alignment: A Comprehensive Survey ☆135 Updated last year
- ☆67 Updated 6 months ago
- [ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models. ☆246 Updated 9 months ago
- Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models" ☆82 Updated last year
- LongQLoRA: Extend Context Length of LLMs Efficiently ☆166 Updated last year
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆170 Updated 2 months ago
- Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models ☆114 Updated 2 months ago
- Awesome LLM pre-training resources, including data, frameworks, and methods. ☆230 Updated 3 months ago
- ☆50 Updated last year
- A curated list of awesome works in the Routing LLMs paradigm (👉 Welcome to submit your contributions to this code repository) ☆52 Updated last month