zju-SWJ / RLD
Official implementation for "Knowledge Distillation with Refined Logits".
☆14 · Updated 10 months ago
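For context, RLD belongs to the family of logit-based distillation methods. Below is a minimal sketch of the vanilla temperature-scaled logit-distillation loss that such methods refine (Hinton et al.'s KL objective, not RLD's refined variant; the function name and temperature value are illustrative):

```python
import torch
import torch.nn.functional as F


def kd_loss(student_logits: torch.Tensor,
            teacher_logits: torch.Tensor,
            temperature: float = 4.0) -> torch.Tensor:
    """Vanilla temperature-scaled KL distillation loss (Hinton et al., 2015).

    This is the standard logit-distillation baseline that refinement
    methods such as RLD build on, not RLD's refined objective itself.
    """
    # Soften both distributions with the temperature.
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # KL(teacher || student); the T^2 factor keeps gradient magnitudes
    # comparable to the hard-label cross-entropy term.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2


# Example usage with random logits for a batch of 8 samples, 100 classes.
student_logits = torch.randn(8, 100)
teacher_logits = torch.randn(8, 100)
loss = kd_loss(student_logits, teacher_logits)
```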
Alternatives and similar repositories for RLD
Users interested in RLD are comparing it to the libraries listed below.
- Code for 'Multi-level Logit Distillation' (CVPR 2023) ☆66 · Updated 9 months ago
- CVPR 2023, Class Attention Transfer Based Knowledge Distillation ☆44 · Updated 2 years ago
- ☆26 · Updated last year
- The official project website of "NORM: Knowledge Distillation via N-to-One Representation Matching" (The paper of NORM is published in IC… ☆20 · Updated last year
- BESA is a differentiable weight pruning technique for large language models. ☆17 · Updated last year
- The official implementation for MTLoRA: A Low-Rank Adaptation Approach for Efficient Multi-Task Learning (CVPR '24) ☆55 · Updated last week
- This is the official code for the paper "Token Summarisation for Efficient Vision Transformers via Graph-based Token Propagation" ☆29 · Updated last year
- Learning Efficient Vision Transformers via Fine-Grained Manifold Distillation. NeurIPS 2022. ☆32 · Updated 2 years ago
- The official implementation of LumiNet: The Bright Side of Perceptual Knowledge Distillation https://arxiv.org/abs/2310.03669 ☆19 · Updated last year
- The codebase for the paper "PPT: Token Pruning and Pooling for Efficient Vision Transformer" ☆24 · Updated 7 months ago
- The official implementation of [NeurIPS 2024] Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation https://ar… ☆42 · Updated 7 months ago
- 🔥MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer [Official, ICLR 2023] ☆21 · Updated last year
- [NeurIPS 2024] Search for Efficient LLMs ☆14 · Updated 6 months ago
- A training-free approach to accelerate ViTs and VLMs by pruning redundant tokens based on similarity ☆29 · Updated last month
- [BMVC 2022] Information Theoretic Representation Distillation ☆18 · Updated last year
- This repo contains the source code for VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks (NeurIPS 2024). ☆39 · Updated 9 months ago
- [CVPR 2023] Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention During Vision Transformer Inference ☆30 · Updated last year
- Official code for Scale Decoupled Distillation ☆42 · Updated last year
- Training ImageNet / CIFAR models with state-of-the-art strategies and techniques such as ViT, KD, Rep, etc. ☆82 · Updated last year
- [NeurIPS'22] Projector Ensemble Feature Distillation ☆29 · Updated last year
- Official PyTorch implementation of the NeurIPS 2022 paper TokenMixup ☆48 · Updated 2 years ago
- [2025] Efficient Vision Language Models: A Survey ☆20 · Updated 2 weeks ago
- ☆46 · Updated last year
- ☆27 · Updated 2 years ago
- [AAAI 2023] Official PyTorch Code for "Curriculum Temperature for Knowledge Distillation" ☆176 · Updated 7 months ago
- [Preprint] Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Prunin… ☆40 · Updated 2 years ago
- ☆39 · Updated 11 months ago
- ☆34 · Updated last year
- Multi-head Recurrent Layer Attention for Vision Network ☆19 · Updated 2 years ago
- The code of the paper "Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation" (CVPR 2023) ☆39 · Updated 2 years ago