[NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer
☆30Dec 6, 2023Updated 2 years ago
Alternatives and similar repositories for ShiftAddViT
Users that are interested in ShiftAddViT are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- [ICML 2022] ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks☆15May 18, 2022Updated 3 years ago
- An open-sourced PyTorch library for developing energy efficient multiplication-less models and applications.☆14Feb 3, 2025Updated last year
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning☆20Jul 7, 2022Updated 3 years ago
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models☆35Jun 12, 2024Updated last year
- SpInfer: Leveraging Low-Level Sparsity for Efficient Large Language Model Inference on GPUs☆63Mar 25, 2025Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- AFPQ code implementation☆23Nov 6, 2023Updated 2 years ago
- [NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network☆74Nov 16, 2020Updated 5 years ago
- ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization☆112Oct 15, 2024Updated last year
- [TMLR] Official PyTorch implementation of paper "Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precisio…☆49Sep 27, 2024Updated last year
- ☆16Dec 9, 2023Updated 2 years ago
- A framework to compare low-bit integer and float-point formats☆71Feb 6, 2026Updated 2 months ago
- This is a repository of Binary General Matrix Multiply (BGEMM) by customized CUDA kernel. Thank FP6-LLM for the wheels!☆19Aug 30, 2024Updated last year
- ☆11Jun 4, 2024Updated last year
- High Performance FP8 GEMM Kernels for SM89 and later GPUs.☆21Jan 24, 2025Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- The official code for [ECCV2020] "HALO: Hardware-aware Learning to Optimize"☆10Mar 22, 2023Updated 3 years ago
- The official implementation of the DAC 2024 paper GQA-LUT☆22Dec 20, 2024Updated last year
- BinaryViT: Pushing Binary Vision Transformers Towards Convolutional Models☆37Feb 4, 2024Updated 2 years ago
- [ICLR 2023] "Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers" by Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal…☆56Feb 28, 2023Updated 3 years ago
- [ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs.☆136May 16, 2024Updated last year
- ☆29May 24, 2024Updated last year
- Model Quantization Benchmark☆18Mar 23, 2026Updated 2 weeks ago
- First Latency-Aware Competitive LLM Agent Benchmark☆26Jun 3, 2025Updated 10 months ago
- ☆18Sep 25, 2025Updated 6 months ago
- NordVPN Threat Protection Pro™ • AdTake your cybersecurity to the next level. Block phishing, malware, trackers, and ads. Lightweight app that works with all browsers.
- The official implementation of the ICML 2023 paper OFQ-ViT☆39Oct 3, 2023Updated 2 years ago
- Code repo for the paper BiT Robustly Binarized Multi-distilled Transformer☆114Jun 26, 2023Updated 2 years ago
- [ICASSP'20] DNN-Chip Predictor: An Analytical Performance Predictor for DNN Accelerators with Various Dataflows and Hardware Architecture…☆25Oct 1, 2022Updated 3 years ago
- [HPCA 2023] ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design☆129Jun 27, 2023Updated 2 years ago
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation☆26Jun 16, 2025Updated 9 months ago
- [CVPR 2024] Official implementation for "A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network"☆24Dec 5, 2025Updated 4 months ago
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric☆61Mar 23, 2023Updated 3 years ago
- The official implementation of the EMNLP 2023 paper LLM-FP4☆222Dec 15, 2023Updated 2 years ago
- ☆37Oct 10, 2024Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- [ICML 2021] "Double-Win Quant: Aggressively Winning Robustness of Quantized DeepNeural Networks via Random Precision Training and Inferen…☆16Feb 13, 2022Updated 4 years ago
- ☆16Nov 14, 2022Updated 3 years ago
- Training Quantized Neural Networks with a Full-precision Auxiliary Module☆13Jun 19, 2020Updated 5 years ago
- ☆10Apr 24, 2024Updated last year
- ☆16Nov 12, 2024Updated last year
- Tutorials of Extending and importing TVM with CMAKE Include dependency.☆16Oct 11, 2024Updated last year
- [EMNLP 2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization☆39Sep 24, 2024Updated last year