MLGroup-JLU / LLM-data-aug-survey
The official GitHub page for the survey paper "A Survey on Data Augmentation in Large Model Era"
☆131 Updated last year
Alternatives and similar repositories for LLM-data-aug-survey
Users that are interested in LLM-data-aug-survey are comparing it to the libraries listed below
- ☆125 Updated last year
- ☆54 Updated last year
- This repo aims to record resources on role-playing abilities in LLMs, including datasets, papers, applications, etc. ☆133 Updated last year
- ☆102 Updated last year
- A Toolkit for Table-based Question Answering ☆115 Updated 2 years ago
- Fantastic Data Engineering for Large Language Models ☆92 Updated 11 months ago
- [SIGIR'24] The official implementation code of MOELoRA. ☆185 Updated last year
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"… ☆81 Updated 2 years ago
- ☆77 Updated 10 months ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training … ☆59 Updated 7 months ago
- Collect every awesome work about r1! ☆425 Updated 7 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning". ☆138 Updated last year
- [ACM Computing Surveys 2025] This repository collects awesome surveys, resources, and papers for Lifelong Learning with Large Language Model… ☆158 Updated 6 months ago
- A live reading list for LLM data synthesis (Updated to July, 2025). ☆420 Updated 3 months ago
- Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models" ☆84 Updated last year
- Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey` ☆208 Updated 4 months ago
- ☆87 Updated last year
- ☆147 Updated last year
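- This project covers dataset synthesis, model training, and evaluation for the mathematical problem-solving abilities of large models, along with records of related articles. ☆97 Updated last year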
- LLM Zoo collects information on various open- and closed-source LLMs ☆271 Updated 2 years ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning ☆184 Updated 5 months ago
- Scaling Preference Data Curation via Human-AI Synergy ☆132 Updated 5 months ago
- ☆50 Updated last year
- [Neurips2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token ☆164 Updated last year
- A curated list of papers and applications on tool learning. ☆123 Updated last year
- Our code for ICLR'25 paper "DataMan: Data Manager for Pre-training Large Language Models". ☆108 Updated 3 months ago
- 1st Solution For Conversational Multi-Doc QA Workshop & International Challenge @ WSDM'24 - Xiaohongshu.Inc ☆162 Updated 4 months ago
- Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models ☆120 Updated last month
- Counting-Stars (★) ☆83 Updated 2 weeks ago
- ☆122 Updated last year