A curated list of awesome papers on dataset reduction, including dataset distillation (dataset condensation) and dataset pruning (coreset selection).
☆61Jan 14, 2025Updated last year
Alternatives and similar repositories for Awesome-Dataset-Reduction
Users that are interested in Awesome-Dataset-Reduction are comparing it to the libraries listed below.
- (NeurIPS 2025 🔥) Official implementation for "Efficient Multi-modal Large Language Models via Progressive Consistency Distillation"☆48Feb 11, 2026Updated 2 months ago
- Data distillation benchmark☆72Jun 13, 2025Updated 10 months ago
- Official PyTorch code for ICLR 2025 paper "Gnothi Seauton: Empowering Faithful Self-Interpretability in Black-Box Models"☆23Mar 4, 2025Updated last year
- An Easy and Unified Interface for Robots (and Grippers, etc.)☆13Nov 7, 2024Updated last year
- [ICLR 2025 Spotlight] Code release for "Sharpness-Aware Minimization Efficiently Selects Flatter Minima Late In Training"☆18Feb 20, 2025Updated last year
- Pytorch implementation of OCFGAN-GP (CVPR 2020, Oral).☆15Apr 3, 2020Updated 6 years ago
- You Only Condense Once: Two Rules for Pruning Condensed Datasets (NeurIPS 2023)☆15Nov 18, 2023Updated 2 years ago
- [CVPR 2025] An Implementation of the paper "Pre-Instruction Data Selection for Visual Instruction Tuning"☆17Jun 9, 2025Updated 10 months ago
- [RA-L 2025 & ICRA 2026 Oral] Motion Before Action: Diffusing Object Motion as Manipulation Condition☆71Nov 4, 2025Updated 5 months ago
- The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR2023)☆40Mar 25, 2023Updated 3 years ago
- Socratic-Zero is a fully autonomous framework that generates high-quality training data for mathematical reasoning☆36Oct 26, 2025Updated 5 months ago
- [NeurIPS 2023] Code release for "Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity"☆19Oct 19, 2023Updated 2 years ago
- ☆44Oct 13, 2023Updated 2 years ago
- [ICLR 2025] "Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond"☆17Feb 27, 2025Updated last year
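- This project designs an electronic keyboard that can produce 21 notes. Input comes from a PS2 keyboard and is recognized and processed on a Basys2 board, which generates a tone at the corresponding frequency that is finally played through the Pmod_AMP module.☆10Jul 21, 2019Updated 6 years ago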
- Image segmentation experiments based on the K-means and HAC clustering algorithms.☆10Sep 8, 2021Updated 4 years ago
- (CVPR 2025) Official implementation to DELT: A Simple Diversity-driven EarlyLate Training for Dataset Distillation which outperforms SOTA…☆27Aug 23, 2025Updated 7 months ago
- [IROS 2025] SIME: Enhancing Policy Self-Improvement with Modal-level Exploration☆17Mar 2, 2026Updated last month
- GameVerse: Can Vision-Language Models Learn from Video-based Reflection?☆45Mar 26, 2026Updated 3 weeks ago
- [ICRA 2025] CAGE: Causal Attention Enables Data-Efficient Generalizable Robotic Manipulation☆37Jan 14, 2025Updated last year
- Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching" (dLLM-Cache…☆201Nov 17, 2025Updated 4 months ago
- This project is mainly the assessment project for the 2025 Zhejiang University School of Software summer camp (AI camp)☆12Mar 3, 2025Updated last year
- [NeurIPS'2021] "MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge", Geng Yuan, Xiaolong Ma, Yanzhi Wang et al…☆17Mar 16, 2022Updated 4 years ago
- ☆29Apr 6, 2026Updated last week
- (Pattern Recognition 2025) Towards Trustworthy Dataset Distillation☆14Dec 8, 2024Updated last year
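- Object classification on the CIFAR10 dataset using KNN and SVM separately; HOG feature extraction was later added as an image preprocessing step to improve the SVM's classification performance.☆15Sep 8, 2021Updated 4 years ago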
- Preview code of ECCV'24 paper "Distill Gold from Massive Ores" (BiLP)☆25Jul 6, 2024Updated last year
- 🏂 World Guidance: World Modeling in Condition Space for Action Generation☆68Mar 24, 2026Updated 3 weeks ago
- Several examples for learning Java☆11Nov 22, 2023Updated 2 years ago
- ☆29Jun 12, 2023Updated 2 years ago
- torch.optim.lr_scheduler☆10Mar 17, 2020Updated 6 years ago
- This repository is a pytorch implementation of interpretable compositional convolutional neural networks.☆22May 24, 2023Updated 2 years ago
- ☆11Jul 14, 2023Updated 2 years ago
- A curated list of awesome papers on dataset distillation and related applications.☆1,921Apr 8, 2026Updated last week
- [ICCV 2023] DataDAM: Efficient Dataset Distillation with Attention Matching☆34Jun 20, 2024Updated last year
- The official implementation for paper: Vision-Language Models are Strong Noisy Label Detectors☆17Mar 31, 2025Updated last year
- Source code for "Learning Deep Priors for Image Dehazing", ICCV 2019☆10Sep 18, 2020Updated 5 years ago
- Official code of "ALIM: Adjusting Label Importance Mechanism for Noisy Partial Label Learning"☆23Sep 25, 2023Updated 2 years ago
- ☆11Jul 30, 2025Updated 8 months ago