Kennethborup / knowledgeDistillationLinks
PyTorch implementation of (Hinton) knowledge distillation, plus a base class that simplifies implementing other distillation methods.
☆29Updated 4 years ago
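The distillation objective the repo refers to (Hinton et al., 2015) combines a temperature-softened KL term between teacher and student with a standard cross-entropy on hard labels. A minimal pure-Python sketch of that loss follows; the function names and the defaults `T=4.0` and `alpha=0.9` are illustrative choices, not this repo's actual API.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: higher T yields a softer distribution.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label, T=4.0, alpha=0.9):
    """Hinton-style KD loss:
    alpha * T^2 * KL(teacher || student) at temperature T
    + (1 - alpha) * cross-entropy with the hard label."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL divergence between the softened distributions; the T^2 factor keeps
    # gradient magnitudes comparable across temperatures (Hinton et al., 2015).
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    hard_ce = -math.log(softmax(student_logits)[label])
    return alpha * (T ** 2) * kl + (1 - alpha) * hard_ce
```

When teacher and student logits match, the KL term vanishes and only the weighted hard-label cross-entropy remains, which is a quick sanity check for any implementation.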
Alternatives and similar repositories for knowledgeDistillation
Users interested in knowledgeDistillation are comparing it to the libraries listed below.
- PyTorch, PyTorch Lightning framework for trying knowledge distillation in image classification problems☆32Updated last year
- several types of attention modules written in PyTorch for learning purposes☆52Updated last year
- Stochastic Weight Averaging Tutorials using pytorch.☆33Updated 4 years ago
- This is the public github for our paper "Transformer with a Mixture of Gaussian Keys"☆28Updated 3 years ago
- PyTorch implementation of Soft MoE by Google Brain in "From Sparse to Soft Mixtures of Experts" (https://arxiv.org/pdf/2308.00951.pdf)☆78Updated 2 years ago
- A PyTorch implementation of the VGG paper☆35Updated 2 years ago
- A Python Package for Deep Imbalanced Learning☆57Updated 2 months ago
- This repository maintains a collection of important papers on knowledge distillation (awesome-knowledge-distillation).☆80Updated 7 months ago
- ZSKD with PyTorch☆31Updated 2 years ago
- Recycling diverse models☆45Updated 2 years ago
- Demonstration of transfer of knowledge and generalization with distillation☆55Updated 6 years ago
- ☆10Updated 3 years ago
- IJCAI 2021, "Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation"☆42Updated 2 years ago
- Implementation of TableFormer, Robust Transformer Modeling for Table-Text Encoding, in Pytorch☆39Updated 3 years ago
- ☆95Updated last year
- PyTorch implementation of MoE (mixture of experts)☆49Updated 4 years ago
- A regularized self-labeling approach to improve the generalization and robustness of fine-tuned models☆27Updated 3 years ago
- Skyformer: Remodel Self-Attention with Gaussian Kernel and Nyström Method (NeurIPS 2021)☆63Updated 3 years ago
- AAAI 2021: Robustness of Accuracy Metric and its Inspirations in Learning with Noisy Labels☆23Updated 4 years ago
- [ICLR 2023] “ Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Better Representations”, Ziyu Jian…☆24Updated 2 years ago
- ☆19Updated 4 years ago
- ☆31Updated 5 months ago
- Official code for Group-Transformer (Scale down Transformer by Grouping Features for a Lightweight Character-level Language Model, COLING…☆27Updated 4 years ago
- A project to add scalable state-of-the-art out-of-distribution detection (open set recognition) support by changing two lines of code! Pe…☆79Updated 3 years ago
- AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers☆47Updated 2 years ago
- [ACL 2023] Code for paper “Tailoring Instructions to Student’s Learning Levels Boosts Knowledge Distillation”(https://arxiv.org/abs/2305.…☆38Updated 2 years ago
- Code for paper: “What Data Benefits My Classifier?” Enhancing Model Performance and Interpretability through Influence-Based Data Selecti…☆24Updated last year
- Official Implementation of Unweighted Data Subsampling via Influence Function - AAAI 2020☆64Updated 4 years ago
- Adversarial examples to the new ConvNeXt architecture☆20Updated 3 years ago
- Towards Understanding the Mixture-of-Experts Layer in Deep Learning☆31Updated last year