MLGroup-JLU / LLM-data-aug-survey
The official GitHub page for the survey paper "A Survey on Data Augmentation in Large Model Era"
☆123Updated 10 months ago
Alternatives and similar repositories for LLM-data-aug-survey
Users who are interested in LLM-data-aug-survey are comparing it to the repositories listed below
- ☆123Updated last year
- This repo aims to record resources on role-playing abilities in LLMs, including datasets, papers, applications, etc.☆121Updated 7 months ago
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…☆80Updated last year
- Fantastic Data Engineering for Large Language Models☆87Updated 4 months ago
- Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"☆80Updated 8 months ago
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆119Updated 6 months ago
- Real-time updated, fine-grained reading list on LLM-synthetic-data.🔥☆256Updated 3 months ago
- A Toolkit for Table-based Question Answering☆112Updated last year
- ☆52Updated 8 months ago
- [SIGIR'24] The official implementation code of MOELoRA.☆162Updated 9 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆151Updated 8 months ago
- A curated list of awesome works in the Routing LLMs paradigm (👉 Welcome to submit your contributions to this code repository)☆33Updated last month
- ☆63Updated 3 months ago
- Latest Advances on Long Chain-of-Thought Reasoning☆298Updated last month
- Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.☆42Updated 10 months ago
- [ACM Computing Surveys 2025] This repository collects awesome surveys, resources, and papers for Lifelong Learning with Large Language Model…☆127Updated 3 months ago
- [EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"☆116Updated 7 months ago
- [ICML 2025] Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale☆245Updated 3 weeks ago
- Collect every awesome work about r1!☆363Updated 2 weeks ago
- ☆99Updated 11 months ago
- ☆64Updated 3 months ago
- ☆140Updated last year
- Project for the paper entitled `Instruction Tuning for Large Language Models: A Survey`☆176Updated 5 months ago
- ☆81Updated last year
- [NeurIPS 2024] Source code for xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token☆137Updated 10 months ago
- ☆143Updated 10 months ago
- A Survey on Multimodal Retrieval-Augmented Generation☆172Updated 3 weeks ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …☆43Updated last week
- The demo, code and data of FollowRAG☆72Updated 3 weeks ago
- ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory☆92Updated 2 weeks ago