snudm-starlab / SensiMix
SensiMix: Sensitivity-Aware 8-bit Index & 1-bit Value Mixed Precision Quantization for BERT Compression (PLOS One)
☆34Aug 22, 2025Updated 5 months ago
Alternatives and similar repositories for SensiMix
Users who are interested in SensiMix are comparing it to the libraries listed below
- Pea-KD: Parameter-efficient and accurate knowledge distillation on BERT (PLOS One)☆35Aug 22, 2025Updated 5 months ago
- Structured pruning algorithm for pruning Transformers☆31Dec 6, 2023Updated 2 years ago
- PET: Parameter-efficient Knowledge Distillation on Transformer (PLOS One)☆15Aug 22, 2025Updated 5 months ago
- Model-Agnostic Augmentation for Accurate Graph Classification (WWW 2022)☆21Aug 22, 2025Updated 5 months ago
- A dataset repository of "Accurate Action Recommendation for Smart Home via Two-Level Encoders and Commonsense Knowledge" (CIKM 2022)☆16Aug 20, 2025Updated 5 months ago
- Fast and Accurate Partial Fourier Transform for Time Series Data (KDD 2021)☆16Aug 19, 2025Updated 5 months ago
- Official implementation of the ICLR paper "Streamlining Redundant Layers to Compress Large Language Models"☆40May 1, 2025Updated 9 months ago
- Implementation of LaViC (KDD 2025)☆13Jun 1, 2025Updated 8 months ago
- ☆12Oct 9, 2023Updated 2 years ago
- 커버리스트 - AI service for generating book covers☆13Sep 11, 2022Updated 3 years ago
- Code for paper: Unraveling the Shift of Visual Information Flow in MLLMs: From Phased Interaction to Efficient Inference☆13Jun 7, 2025Updated 8 months ago
- ☆14Oct 6, 2023Updated 2 years ago
- Technique for explaining atomic factual relations in generated Korean documents☆17Dec 16, 2024Updated last year
- Structured Pruning Adapters in PyTorch☆19Aug 30, 2023Updated 2 years ago
- An introductory graduate course on "Computer Vision for Embedded Systems"☆20Jun 12, 2022Updated 3 years ago
- Summary notes on "Deep Learning from Scratch" (밑바닥부터 시작하는 딥러닝)☆88Mar 15, 2019Updated 6 years ago
- torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for Diffusion Models.☆23Mar 29, 2024Updated last year
- ☆29Nov 10, 2024Updated last year
- A PyTorch implementation of a real XNOR-popcount (1-bit op) GEMM Linear PyTorch extension supporting both CPU and CUDA☆24Jun 6, 2023Updated 2 years ago
- [AAAI 2025] HiRED strategically drops visual tokens in the image encoding stage to improve inference efficiency for High-Resolution Visio…☆44Apr 18, 2025Updated 9 months ago
- [CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models☆65Dec 1, 2025Updated 2 months ago
- [ICCV 2025] Official code for "AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning"☆55Oct 9, 2025Updated 4 months ago
- [ICCV 2025] SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs☆82Jan 17, 2026Updated last month
- ☆64Jan 23, 2026Updated 3 weeks ago
- [ECCV 2024] Efficient Inference of Vision Instruction-Following Models with Elastic Cache☆43Jul 26, 2024Updated last year
- Prune transformer layers☆74May 30, 2024Updated last year
- Context Modeling with Speaker's Pre-trained Memory Tracking for Emotion Recognition in Conversation (NAACL 2022)☆65Mar 17, 2023Updated 2 years ago
- ☆59Nov 15, 2024Updated last year
- A paper list about Token Merge, Reduce, Resample, Drop for MLLMs.☆86Oct 26, 2025Updated 3 months ago
- Repository of code implemented from recommender system papers☆66Feb 26, 2023Updated 2 years ago
- Open-source Library PyGDebias: Graph Datasets and Fairness-Aware Graph Mining Algorithms☆65May 7, 2024Updated last year
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models☆90May 20, 2025Updated 8 months ago
- Leaderboard for Korean text embedding models☆93Oct 22, 2024Updated last year
- Official code for paper: [CLS] Attention is All You Need for Training-Free Visual Token Pruning: Make VLM Inference Faster.☆106Jun 29, 2025Updated 7 months ago
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models☆103Mar 12, 2024Updated last year
- FOSS Korean dictionary by the National Institute of Korean Language (국립국어원 사전)☆111Jan 1, 2026Updated last month
- SUMBT: Slot-Utterance Matching for Universal and Scalable Belief Tracking (ACL 2019)☆90May 3, 2024Updated last year
- ☆101Oct 12, 2023Updated 2 years ago
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference☆120Mar 6, 2024Updated last year