gszfwsb / Awesome-Dataset-Reduction
A curated list of awesome papers on dataset reduction, including dataset distillation (dataset condensation) and dataset pruning (coreset selection).
☆52 Updated 2 months ago
Alternatives and similar repositories for Awesome-Dataset-Reduction:
Users who are interested in Awesome-Dataset-Reduction are comparing it to the repositories listed below.
- Code for our ICML'24 paper on multimodal dataset distillation ☆36 Updated 5 months ago
- ☆45 Updated 4 months ago
- A collection of recent token reduction (token pruning, merging, clustering, etc.) techniques for ML/AI ☆27 Updated this week
- ☆50 Updated last week
- ☆74 Updated 7 months ago
- [CVPR2024 highlight] Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching (G-VBSM) ☆27 Updated 5 months ago
- CoT-Valve: Length-Compressible Chain-of-Thought Tuning ☆55 Updated last month
- ☆71 Updated last week
- A tiny paper rating web ☆35 Updated last week
- Survey on Data-centric Large Language Models ☆81 Updated 8 months ago
- [ICML 2024 Oral] Official code repository for MLLM-as-a-Judge. ☆65 Updated last month
- A PyTorch implementation of the CVPR24 paper "D4M: Dataset Distillation via Disentangled Diffusion Model" ☆28 Updated 6 months ago
- Data distillation benchmark ☆58 Updated last week
- [ICLR 2025] Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality ☆13 Updated 2 weeks ago
- ☆107 Updated last month
- Preview code of ECCV'24 paper "Distill Gold from Massive Ores" (BiLP) ☆24 Updated 8 months ago
- [CVPR 2024] On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm ☆64 Updated last month
- Less is More: High-value Data Selection for Visual Instruction Tuning ☆11 Updated 2 months ago
- ☆28 Updated 2 years ago
- 📚 Collection of token reduction for model compression resources. ☆47 Updated last month
- Representation Surgery for Multi-Task Model Merging. ICML, 2024. ☆42 Updated 5 months ago
- ☆15 Updated 9 months ago
- Prioritize Alignment in Dataset Distillation ☆20 Updated 3 months ago
- A brief repo about paper research ☆14 Updated 6 months ago
- A paper list on LLMs and multimodal LLMs ☆29 Updated this week
- Code for the ICML 2023 paper "DDGR: Continual Learning with Deep Diffusion-based Generative Replay" ☆36 Updated last year
- ZhiJian: A Unifying and Rapidly Deployable Toolbox for Pre-trained Model Reuse ☆51 Updated last year
- [CVPR 2025] Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key ☆40 Updated 2 weeks ago
- Data and code for the paper "VLSBench: Unveiling Visual Leakage in Multimodal Safety" ☆35 Updated 2 weeks ago
- ICLR 2024, Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching ☆102 Updated 10 months ago