Efficient GPU kernels for mixed-precision Vision Transformers in Triton
☆17Sep 18, 2025Updated 8 months ago
Alternatives and similar repositories for qattn
Users that are interested in qattn are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"☆29Oct 31, 2020Updated 5 years ago
- DAC System Design Contest 2020☆29Jun 11, 2020Updated 6 years ago
- BitSplit Post-trining Quantization☆49Dec 20, 2021Updated 4 years ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…☆31Mar 12, 2024Updated 2 years ago
- [CVPR'20] ZeroQ Mixed-Precision implementation (unofficial): A Novel Zero Shot Quantization Framework☆14Dec 16, 2020Updated 5 years ago
- Wordpress hosting with auto-scaling - Free Trial Offer • AdFully Managed hosting for WordPress and WooCommerce businesses that need reliable, auto-scalable performance. Cloudways SafeUpdates now available.
- Optimizing Deep Convolutional Neural Network with Ternarized Weights and High Accuracy☆16Jan 27, 2019Updated 7 years ago
- This is the pytorch implementation for the paper: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation, which is…☆24Aug 17, 2021Updated 4 years ago
- ☆28Nov 5, 2021Updated 4 years ago
- ☆35Mar 4, 2020Updated 6 years ago
- An open-sourced PyTorch library for developing energy efficient multiplication-less models and applications.☆14Feb 3, 2025Updated last year
- [ICML 2022] ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks☆15May 18, 2022Updated 4 years ago
- Speedup the attention computation of Swin Transformer☆32Jun 14, 2025Updated 11 months ago
- The code for Joint Neural Architecture Search and Quantization☆14Apr 10, 2019Updated 7 years ago
- [CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric☆61Mar 23, 2023Updated 3 years ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- A Maven plugin for protecting against backwards incompatible changes to your gRPC .proto files.☆13Jun 2, 2026Updated last week
- The official implementation of "NAS-BNN: Neural Architecture Search for Binary Neural Networks"☆14Aug 30, 2024Updated last year
- PyTorch implementation of "Deep Transferring Quantization" (ECCV2020)☆18Jun 22, 2022Updated 3 years ago
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning☆20Jul 7, 2022Updated 3 years ago
- [PR 2024] HTQ: Exploring the High-Dimensional Trade-Off of Mixed-Precision Quantization☆12Jul 16, 2024Updated last year
- This repository teaches how to train, evaluate and deploy ML models using MLFlow☆13Oct 23, 2024Updated last year
- ☆17Jun 13, 2022Updated 3 years ago
- [TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection"☆39Aug 20, 2024Updated last year
- DNN quantization with outlier channel splitting (ICML'19)☆114Mar 21, 2020Updated 6 years ago
- Managed hosting for WordPress and PHP on Cloudways • AdManaged hosting for WordPress, Magento, Laravel, or PHP apps, on multiple cloud providers. Deploy in minutes on Cloudways by DigitalOcean.
- Revisiting Parameter Sharing for Automatic Neural Channel Number Search, NeurIPS 2020☆21Nov 15, 2020Updated 5 years ago
- Collections of model quantization algorithms. Any issues, please contact Peng Chen (blueardour@gmail.com)☆73Oct 7, 2021Updated 4 years ago
- Pytorch implementation for FAT: learning low-bitwidth parametric representation via frequency-aware transformation☆63May 2, 2021Updated 5 years ago
- An Tensorflow.keras implementation of Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorizatio…☆10Dec 18, 2019Updated 6 years ago
- Unsupervised 3D Object Detection [NeurIPS 2024]☆45Feb 12, 2026Updated 3 months ago
- CAVEDU出版之Jetson Nano書籍範例☆10Mar 2, 2026Updated 3 months ago
- Code written for OpenCV during GSoC 2019 related to Facial Landmark Detection☆10Aug 26, 2019Updated 6 years ago
- [ECCV 2024] CLAMP-ViT: Contrastive Data-Free Learning for Adaptive Post-Training Quantization of ViTs☆18Jul 2, 2024Updated last year
- Learning Accurate Decision Trees with Bandit Feedback via Quantized Gradient Descent☆16Sep 8, 2022Updated 3 years ago
- Deploy on Railway without the complexity - Free Credits Offer • AdConnect your repo and Railway handles the rest with instant previews. Quickly provision container image services, databases, and storage volumes.
- Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks☆68Nov 4, 2021Updated 4 years ago
- Repository for the QUIK project, enabling the use of 4bit kernels for generative inference - EMNLP 2024☆185Apr 16, 2024Updated 2 years ago
- Extract your SlidesLive presentation.☆15Apr 19, 2024Updated 2 years ago
- Code needed to reproduce results from my ICLR 2019 paper on fixed-point quantization of the backprop algorithm.☆10Jan 24, 2019Updated 7 years ago
- Supercharge Your PyTorch Image Models: Bag of Tricks to 8x Faster Inference with ONNX Runtime & Optimizations☆24Oct 4, 2024Updated last year
- PyTorch code for full quantization of DNN using BCGD☆14Jul 24, 2019Updated 6 years ago
- ☆19Mar 16, 2022Updated 4 years ago