[NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
☆30 · Updated Dec 6, 2023
Alternatives and similar repositories for ShiftAddViT
Users interested in ShiftAddViT are comparing it to the repositories listed below.
- [ICML 2022] ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks ☆15 · Updated May 18, 2022
- An open-sourced PyTorch library for developing energy-efficient, multiplication-less models and applications. ☆14 · Updated Feb 3, 2025
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning ☆20 · Updated Jul 7, 2022
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆35 · Updated Jun 12, 2024
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs ☆62 · Updated Mar 25, 2025
- AFPQ code implementation ☆23 · Updated Nov 6, 2023
- [NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network ☆74 · Updated Nov 16, 2020
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization ☆112 · Updated Oct 15, 2024
- [TMLR] Official PyTorch implementation of the paper "Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precisio…" ☆49 · Updated Sep 27, 2024
- ☆16 · Updated Dec 9, 2023
- A framework to compare low-bit integer and floating-point formats ☆71 · Updated Feb 6, 2026
- A repository of Binary General Matrix Multiply (BGEMM) via customized CUDA kernels. Thanks to FP6-LLM for the groundwork! ☆18 · Updated Aug 30, 2024
- Code for SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models ☆22 · Updated Apr 16, 2025
- ☆10 · Updated Jun 4, 2024
- High-performance FP8 GEMM kernels for SM89 and later GPUs ☆20 · Updated Jan 24, 2025
- The official code for [ECCV 2020] "HALO: Hardware-aware Learning to Optimize" ☆10 · Updated Mar 22, 2023
- The official implementation of the DAC 2024 paper GQA-LUT ☆21 · Updated Dec 20, 2024
- BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models ☆37 · Updated Feb 4, 2024
- [ACL 2024] A novel QAT-with-self-distillation framework to enhance ultra-low-bit LLMs. ☆134 · Updated May 16, 2024
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal… ☆56 · Updated Feb 28, 2023
- ☆29 · Updated May 24, 2024
- First latency-aware competitive LLM agent benchmark ☆26 · Updated Jun 3, 2025
- ☆26 · Updated Dec 1, 2016
- ☆18 · Updated Sep 25, 2025
- The official implementation of the ICML 2023 paper OFQ-ViT ☆39 · Updated Oct 3, 2023
- Code repo for the paper "BiT: Robustly Binarized Multi-distilled Transformer" ☆114 · Updated Jun 26, 2023
- [ICASSP '20] DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architecture… ☆25 · Updated Oct 1, 2022
- [HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design ☆130 · Updated Jun 27, 2023
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation ☆25 · Updated Jun 16, 2025
- Code for ViTAS: Vision Transformer Architecture Search ☆51 · Updated Jul 22, 2021
- [CVPR 2024] Official implementation of "A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network" ☆24 · Updated Dec 5, 2025
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric ☆60 · Updated Mar 23, 2023
- The official implementation of the EMNLP 2023 paper LLM-FP4 ☆222 · Updated Dec 15, 2023
- ☆36 · Updated Oct 10, 2024
- [ICML 2021] "Double-Win Quant: Aggressively Winning Robustness of Quantized Deep Neural Networks via Random Precision Training and Inferen…" ☆16 · Updated Feb 13, 2022
- ☆16 · Updated Nov 14, 2022
- Training Quantized Neural Networks with a Full-precision Auxiliary Module ☆13 · Updated Jun 19, 2020
- ☆16 · Updated Nov 12, 2024
- ☆10 · Updated Apr 24, 2024