MLGroup-JLU / LLM-data-aug-survey
The official GitHub page for the survey paper "A Survey on Data Augmentation in Large Model Era"
☆121Updated 7 months ago
Alternatives and similar repositories for LLM-data-aug-survey:
Users who are interested in LLM-data-aug-survey are comparing it to the repositories listed below
- ☆122Updated last year
- Real-time updated, fine-grained reading list on LLM-synthetic-data.🔥☆228Updated last month
- [ACL 2024] The official codebase for the paper "Self-Distillation Bridges Distribution Gap in Language Model Fine-tuning".☆114Updated 4 months ago
- Official GitHub repo for AutoDetect, an automated weakness detection framework for LLMs.☆41Updated 8 months ago
- [ACM Computing Surveys 2025] This repository collects awesome surveys, resources, and papers for Lifelong Learning with Large Language Model…☆104Updated last month
- Fantastic Data Engineering for Large Language Models☆75Updated 2 months ago
- ☆44Updated 8 months ago
- ☆96Updated 9 months ago
- Official Repository for SIGIR2024 Demo Paper "An Integrated Data Processing Framework for Pretraining Foundation Models"☆70Updated 6 months ago
- A Toolkit for Table-based Question Answering☆109Updated last year
- A curated reading list for large language model (LLM) alignment. Take a look at our new survey "Large Language Model Alignment: A Survey"…☆75Updated last year
- [SIGIR'24] The official implementation code of MOELoRA.☆147Updated 7 months ago
- [ACL'24] Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning☆143Updated 5 months ago
- This repo aims to collect resources on role-playing abilities in LLMs, including datasets, papers, applications, etc.☆104Updated 5 months ago
- ☆80Updated last year
- ☆140Updated 8 months ago
- ☆54Updated last month
- [ACL 2024 Demo] Official GitHub repo for UltraEval: An open source framework for evaluating foundation models.☆233Updated 4 months ago
- The demo, code and data of FollowRAG☆70Updated 2 months ago
- Repo for Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent☆232Updated 3 weeks ago
- ☆218Updated 3 weeks ago
- ☆27Updated 2 months ago
- ☆99Updated last month
- [ICML 2024] Selecting High-Quality Data for Training Language Models☆157Updated 8 months ago
- Token level visualization tools for large language models☆75Updated last month
- Official Repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale"☆221Updated 2 weeks ago
- Abacus Code LLM (珠算, a large language model for code)☆54Updated 5 months ago
- As the name suggests: a hand-built RAG☆120Updated last year