Efficient GPU kernels for mixed-precision Vision Transformers in Triton
☆17Sep 18, 2025Updated 6 months ago
Alternatives and similar repositories for qattn
Users that are interested in qattn are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"☆29Oct 31, 2020Updated 5 years ago
- DAC System Design Contest 2020☆29Jun 11, 2020Updated 5 years ago
- BitSplit Post-trining Quantization☆50Dec 20, 2021Updated 4 years ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…☆31Mar 12, 2024Updated 2 years ago
- [CVPR'20] ZeroQ Mixed-Precision implementation (unofficial): A Novel Zero Shot Quantization Framework☆14Dec 16, 2020Updated 5 years ago
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click and start building anything your business needs.
- Optimizing Deep Convolutional Neural Network with Ternarized Weights and High Accuracy☆16Jan 27, 2019Updated 7 years ago
- This is the pytorch implementation for the paper: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation, which is…☆24Aug 17, 2021Updated 4 years ago
- ☆28Nov 5, 2021Updated 4 years ago
- ☆35Mar 4, 2020Updated 6 years ago
- An open-sourced PyTorch library for developing energy efficient multiplication-less models and applications.☆14Feb 3, 2025Updated last year
- [ICML 2022] ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks☆15May 18, 2022Updated 3 years ago
- The code for Joint Neural Architecture Search and Quantization☆14Apr 10, 2019Updated 7 years ago
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric☆61Mar 23, 2023Updated 3 years ago
- The official implementation of "NAS-BNN: Neural Architecture Search for Binary Neural Networks"☆13Aug 30, 2024Updated last year
- Virtual machines for every use case on DigitalOcean • AdGet dependable uptime with 99.99% SLA, simple security tools, and predictable monthly pricing with DigitalOcean's virtual machines, called Droplets.
- PyTorch implementation of "Deep Transferring Quantization" (ECCV2020)☆18Jun 22, 2022Updated 3 years ago
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning☆20Jul 7, 2022Updated 3 years ago
- Pytorch implementation for FAT: learning low-bitwidth parametric representation via frequency-aware transformation☆67May 2, 2021Updated 4 years ago
- ☆17Jun 13, 2022Updated 3 years ago
- DNN quantization with outlier channel splitting (ICML'19)☆114Mar 21, 2020Updated 6 years ago
- Revisiting Parameter Sharing for Automatic Neural Channel Number Search, NeurIPS 2020☆21Nov 15, 2020Updated 5 years ago
- Collections of model quantization algorithms. Any issues, please contact Peng Chen (blueardour@gmail.com)☆73Oct 7, 2021Updated 4 years ago
- An Tensorflow.keras implementation of Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorizatio…☆10Dec 18, 2019Updated 6 years ago
- [ECCV 2024] CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs☆18Jul 2, 2024Updated last year
- Wordpress hosting with auto-scaling on Cloudways • AdFully Managed hosting built for WordPress-powered businesses that need reliable, auto-scalable hosting. Cloudways SafeUpdates now available.
- Learning Accurate Decision Trees with Bandit Feedback via Quantized Gradient Descent☆16Sep 8, 2022Updated 3 years ago
- Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks☆68Nov 4, 2021Updated 4 years ago
- Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024☆185Apr 16, 2024Updated last year
- Code needed to reproduce results from my ICLR 2019 paper on fixed-point quantization of the backprop algorithm.☆10Jan 24, 2019Updated 7 years ago
- PyTorch code for full quantization of DNN using BCGD☆14Jul 24, 2019Updated 6 years ago
- ☆19Mar 16, 2022Updated 4 years ago
- source code of the paper: Robust Quantization: One Model to Rule Them All☆41Mar 24, 2023Updated 3 years ago
- ☆16Oct 29, 2021Updated 4 years ago
- Keras implementation of YOLOv2 refer to Andrew Ng☆11Feb 14, 2018Updated 8 years ago
- Open source password manager - Proton Pass • AdSecurely store, share, and autofill your credentials with Proton Pass, the end-to-end encrypted password manager trusted by millions.
- The official implementation of the ICML 2023 paper OFQ-ViT☆39Oct 3, 2023Updated 2 years ago
- ☆15Oct 12, 2020Updated 5 years ago
- KimiaPath24: Dataset for retrieval and classification in digital pathology☆13Jun 4, 2017Updated 8 years ago
- A selective knowledge distillation algorithm for efficient speculative decoders☆37Nov 27, 2025Updated 4 months ago
- ☆23Oct 7, 2021Updated 4 years ago
- [NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network☆74Nov 16, 2020Updated 5 years ago
- Code for High-Capacity Expert Binary Networks (ICLR 2021).☆27Dec 3, 2021Updated 4 years ago