A curated list of awesome papers on dataset reduction, including dataset distillation (dataset condensation) and dataset pruning (coreset selection).
★61 · Jan 14, 2025 · Updated last year
Alternatives and similar repositories for Awesome-Dataset-Reduction
Users who are interested in Awesome-Dataset-Reduction are comparing it to the libraries listed below.
- (NeurIPS 2025 🔥) Official implementation for "Efficient Multi-modal Large Language Models via Progressive Consistency Distillation" ★46 · Feb 11, 2026 · Updated last month
- Data distillation benchmark ★72 · Jun 13, 2025 · Updated 9 months ago
- Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation (CVPR24) ★11 · Jun 16, 2024 · Updated last year
- An Easy and Unified Interface for Robots (and Grippers, etc.) ★13 · Nov 7, 2024 · Updated last year
- [ICLR 2025 Spotlight] Code release for "Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training" ★18 · Feb 20, 2025 · Updated last year
- PyTorch implementation of OCFGAN-GP (CVPR 2020, Oral). ★15 · Apr 3, 2020 · Updated 5 years ago
- You Only Condense Once: Two Rules for Pruning Condensed Datasets (NeurIPS 2023) ★15 · Nov 18, 2023 · Updated 2 years ago
- [ICML 2024] Code release for "On the Emergence of Cross-Task Linearity in Pretraining-Finetuning Paradigm" ★11 · Feb 20, 2025 · Updated last year
- [RA-L 2025 & ICRA 2026] Motion Before Action: Diffusing Object Motion as Manipulation Condition ★70 · Nov 4, 2025 · Updated 4 months ago
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning" ★17 · Jun 9, 2025 · Updated 9 months ago
- The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023) ★40 · Mar 25, 2023 · Updated 3 years ago
- ★17 · Jun 14, 2024 · Updated last year
- [NeurIPS 2023] Code release for "Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity" ★19 · Oct 19, 2023 · Updated 2 years ago
- ★44 · Oct 13, 2023 · Updated 2 years ago
- Prioritize Alignment in Dataset Distillation ★21 · Dec 3, 2024 · Updated last year
- [ICLR 2025] "Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond" ★17 · Feb 27, 2025 · Updated last year
- (CVPR 2025) Official implementation to DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA… ★27 · Aug 23, 2025 · Updated 7 months ago
- Official PyTorch implementation of "Loss-Curvature Matching for Dataset Selection and Condensation" (AISTATS 2023) ★22 · Mar 14, 2023 · Updated 3 years ago
- ★30 · Nov 5, 2024 · Updated last year
- GameVerse: Can Vision-Language Models Learn from Video-based Reflection? ★44 · Mar 10, 2026 · Updated 2 weeks ago
- [ICRA 2025] CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation ★37 · Jan 14, 2025 · Updated last year
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache… ★200 · Nov 17, 2025 · Updated 4 months ago
- [NeurIPS'2021] "MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge", Geng Yuan, Xiaolong Ma, Yanzhi Wang et al… ★17 · Mar 16, 2022 · Updated 4 years ago
- ★29 · Sep 30, 2025 · Updated 5 months ago
- (Pattern Recognition 2025) Towards Trustworthy Dataset Distillation ★14 · Dec 8, 2024 · Updated last year
- Preview code of ECCV'24 paper "Distill Gold from Massive Ores" (BiLP) ★25 · Jul 6, 2024 · Updated last year
- (ICLR 2026 🔥) Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs" ★76 · Feb 9, 2026 · Updated last month
- torch.optim.lr_scheduler ★10 · Mar 17, 2020 · Updated 6 years ago
- A curated list of awesome papers on dataset distillation and related applications. ★1,913 · Mar 20, 2026 · Updated last week
- Not All Patches Are Equal: Hierarchical Dataset Condensation for Single Image Super-Resolution ★10 · May 7, 2024 · Updated last year
- ★33 · Mar 6, 2026 · Updated 3 weeks ago
- The official implementation for paper: Vision-Language Models are Strong Noisy Label Detectors ★16 · Mar 31, 2025 · Updated 11 months ago
- [CVPR 2026] Variation-aware Vision Token Dropping for Faster Large Vision-Language Models ★28 · Mar 18, 2026 · Updated last week
- Source code for "Learning Deep Priors for Image Dehazing", ICCV 2019 ★10 · Sep 18, 2020 · Updated 5 years ago
- Official code of "ALIM: Adjusting Label Importance Mechanism for Noisy Partial Label Learning" ★23 · Sep 25, 2023 · Updated 2 years ago
- This is the repository for paper "Learning Task-Aware Effective Brain Connectivity for fMRI Analysis with Graph Neural Networks". ★14 · Nov 22, 2023 · Updated 2 years ago
- (TPAMI 2026) Complementary Text-Guided Attention for Zero-Shot Adversarial Robustness & (NeurIPS 2024) Text-Guided Attention is All Y… ★18 · Updated this week
- Welcome to the 'In Context Learning Theory' Reading Group ★30 · Nov 8, 2024 · Updated last year
- EraseDiff: Erasing Data Influence in Diffusion Models ★14 · Nov 20, 2024 · Updated last year