A curated list of awesome papers on dataset reduction, including dataset distillation (dataset condensation) and dataset pruning (coreset selection).
☆62 · Jan 14, 2025 · Updated last year
Alternatives and similar repositories for Awesome-Dataset-Reduction
Users that are interested in Awesome-Dataset-Reduction are comparing it to the libraries listed below.
- (NeurIPS 2025 🔥) Official implementation for "Efficient Multi-modal Large Language Models via Progressive Consistency Distillation" ☆50 · Feb 11, 2026 · Updated 4 months ago
- Data distillation benchmark ☆72 · Jun 13, 2025 · Updated last year
- Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation (CVPR24) ☆10 · Jun 16, 2024 · Updated 2 years ago
- Official PyTorch implementation of the paper "Dataset Distillation with Neural Characteristic Function: A Minmax Perspective" (NCFM) in C… ☆413 · Jun 3, 2026 · Updated 2 weeks ago
- [ICLR 2025 Spotlight] Code release for "Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training" ☆19 · Feb 20, 2025 · Updated last year
- An Easy and Unified Interface for Robots (and Grippers, etc.) ☆14 · Nov 7, 2024 · Updated last year
- Pytorch implementation of OCFGAN-GP (CVPR 2020, Oral). ☆15 · Apr 3, 2020 · Updated 6 years ago
- You Only Condense Once: Two Rules for Pruning Condensed Datasets (NeurIPS 2023) ☆16 · Nov 18, 2023 · Updated 2 years ago
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning" ☆17 · Jun 9, 2025 · Updated last year
- [RA-L 2025 & ICRA 2026] Motion Before Action: Diffusing Object Motion as Manipulation Condition ☆72 · Nov 4, 2025 · Updated 7 months ago
- The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023) ☆40 · Mar 25, 2023 · Updated 3 years ago
- [NeurIPS 2023] Code release for "Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity" ☆19 · Oct 19, 2023 · Updated 2 years ago
- [ICLR 2025] "Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond" ☆16 · Feb 27, 2025 · Updated last year
- [EMNLP 2025 main 🔥] Code for "Stop Looking for Important Tokens in Multimodal Language Models: Duplication Matters More" ☆120 · Oct 12, 2025 · Updated 8 months ago
- Official PyTorch implementation of "Loss-Curvature Matching for Dataset Selection and Condensation" (AISTATS 2023) ☆22 · Mar 14, 2023 · Updated 3 years ago
- ☆30 · Nov 5, 2024 · Updated last year
- [IROS 2025] SIME: Enhancing Policy Self-Improvement with Modal-level Exploration ☆17 · Mar 2, 2026 · Updated 3 months ago
- GameVerse: Can Vision-Language Models Learn from Video-based Reflection? ☆50 · Mar 26, 2026 · Updated 2 months ago
- Let you in a meta world of The Palace Museum ☆23 · Aug 30, 2025 · Updated 9 months ago
- [ICRA 2025] CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation ☆36 · Jan 14, 2025 · Updated last year
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache… ☆206 · May 1, 2026 · Updated last month
- [NeurIPS’2021] "MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge", Geng Yuan, Xiaolong Ma, Yanzhi Wang et al… ☆17 · Mar 16, 2022 · Updated 4 years ago
- Semi-Supervised Learning for Visual Bird’s Eye View Semantic Segmentation ☆15 · Feb 27, 2024 · Updated 2 years ago
- NeurIPS 2020 Spotlight Paper ☆13 · Dec 20, 2021 · Updated 4 years ago
- [RAL 2022] S2G2: Semi-Supervised Semantic Bird-Eye-View Grid-Map Generation Using a Monocular Camera for Autonomous Driving ☆11 · Nov 23, 2022 · Updated 3 years ago
- ☆30 · May 17, 2026 · Updated last month
- [CVPR2024 highlight] Generalized Large-Scale Data Condensation via Various Backbone and Statistical Matching (G-VBSM) ☆27 · Oct 9, 2024 · Updated last year
- This repository is the official implementation of Dataset Condensation with Contrastive Signals (DCC), accepted at ICML 2022. ☆22 · Jun 8, 2022 · Updated 4 years ago
- (Pattern Recognition 2025) Towards Trustworthy Dataset Distillation ☆14 · Dec 8, 2024 · Updated last year
- Official codebase for our NeurIPS paper, Symmetry-Informed Governing Equation Discovery. ☆11 · Nov 13, 2024 · Updated last year
- Preview code of ECCV'24 paper "Distill Gold from Massive Ores" (BiLP) ☆25 · Jul 6, 2024 · Updated last year
- (ICLR 2026 🔥) Code for "The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs" ☆79 · Feb 9, 2026 · Updated 4 months ago
- This repository is a pytorch implementation of interpretable compositional convolutional neural networks. ☆22 · May 24, 2023 · Updated 3 years ago
- torch.optim.lr_scheduler ☆10 · Mar 17, 2020 · Updated 6 years ago
- A curated list of awesome papers on dataset distillation and related applications. ☆1,944 · Jun 12, 2026 · Updated last week
- Not All Patches Are Equal: Hierarchical Dataset Condensation for Single Image Super-Resolution ☆10 · May 7, 2024 · Updated 2 years ago
- Implementation of Mutan+ArticleNet on OKVQA ☆10 · Jan 11, 2021 · Updated 5 years ago
- [IROS 2024] RISE: 3D Perception Makes Real-World Robot Imitation Simple and Effective ☆154 · Nov 29, 2025 · Updated 6 months ago
- [2026 CVPR] Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation ☆111 · Apr 15, 2026 · Updated 2 months ago