[NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
☆30 · Dec 6, 2023 · Updated 2 years ago
Alternatives and similar repositories for ShiftAddViT
Users interested in ShiftAddViT are comparing it to the repositories listed below.
- [ICML 2022] ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks ☆15 · May 18, 2022 · Updated 3 years ago
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning ☆20 · Jul 7, 2022 · Updated 3 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Jun 12, 2024 · Updated last year
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs ☆60 · Mar 25, 2025 · Updated 11 months ago
- AFPQ code implementation ☆23 · Nov 6, 2023 · Updated 2 years ago
- [NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network ☆74 · Nov 16, 2020 · Updated 5 years ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆112 · Oct 15, 2024 · Updated last year
- Model Quantization Benchmark ☆18 · Sep 30, 2025 · Updated 5 months ago
- [TMLR] Official PyTorch implementation of the paper "Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precisio…" ☆48 · Sep 27, 2024 · Updated last year
- High-performance FP8 GEMM kernels for SM89 and later GPUs ☆20 · Jan 24, 2025 · Updated last year
- A repository of Binary General Matrix Multiply (BGEMM) via customized CUDA kernels. Thanks to FP6-LLM for the groundwork! ☆18 · Aug 30, 2024 · Updated last year
- The official implementation of the DAC 2024 paper GQA-LUT ☆20 · Dec 20, 2024 · Updated last year
- The official implementation of the ICML 2023 paper OFQ-ViT ☆39 · Oct 3, 2023 · Updated 2 years ago
- BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models ☆37 · Feb 4, 2024 · Updated 2 years ago
- [CVPR 2024] Official implementation of "A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network" ☆24 · Dec 5, 2025 · Updated 2 months ago
- [ACL 2024] A novel QAT with self-distillation framework to enhance ultra-low-bit LLMs ☆134 · May 16, 2024 · Updated last year
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation ☆26 · Jun 16, 2025 · Updated 8 months ago
- A framework to compare low-bit integer and floating-point formats ☆66 · Feb 6, 2026 · Updated 3 weeks ago
- Training Quantized Neural Networks with a Full-precision Auxiliary Module ☆13 · Jun 19, 2020 · Updated 5 years ago
- The official code for [ECCV 2020] "HALO: Hardware-aware Learning to Optimize" ☆10 · Mar 22, 2023 · Updated 2 years ago
- Differentiable Weightless Neural Networks ☆33 · Feb 2, 2026 · Updated 3 weeks ago
- First Latency-Aware Competitive LLM Agent Benchmark ☆26 · Jun 3, 2025 · Updated 8 months ago
- Global Sparse Momentum SGD for pruning very deep neural networks ☆44 · Sep 8, 2022 · Updated 3 years ago
- [ICASSP'20] DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architecture… ☆25 · Oct 1, 2022 · Updated 3 years ago
- Neural Network Quantization With Fractional Bit-widths ☆11 · Feb 19, 2021 · Updated 5 years ago
- FastCache: Fast Caching for Diffusion Transformer Through Learnable Linear Approximation [Efficient ML Model] ☆46 · Feb 17, 2026 · Updated last week
- The official implementation of "NAS-BNN: Neural Architecture Search for Binary Neural Networks" ☆13 · Aug 30, 2024 · Updated last year
- Code for ViTAS: Vision Transformer Architecture Search ☆51 · Jul 22, 2021 · Updated 4 years ago
- The official implementation of the EMNLP 2023 paper LLM-FP4 ☆220 · Dec 15, 2023 · Updated 2 years ago
- [HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design ☆128 · Jun 27, 2023 · Updated 2 years ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆56 · Feb 28, 2023 · Updated 3 years ago
- Official implementation of the ECCV 2022 paper LIMPQ, "Mixed-Precision Neural Network Quantization via Learned Layer-wise Importance" ☆61 · Mar 19, 2023 · Updated 2 years ago
- FFNet: MetaMixer-based Efficient Convolutional Mixer Design ☆31 · Mar 11, 2025 · Updated 11 months ago