IBM/qattn

IBM / qattn

Efficient GPU kernels for mixed-precision Vision Transformers in Triton

☆17

Alternatives and similar repositories for qattn

Users who are interested in qattn are comparing it to the libraries listed below.

cjf00000 / StatQuant
Code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"
☆29 · Oct 31, 2020 · Updated 5 years ago
jgoeders / dac_sdc_2020
DAC System Design Contest 2020
☆29 · Jun 11, 2020 · Updated 6 years ago
peiswang / BitSplit
BitSplit Post-training Quantization
☆49 · Dec 20, 2021 · Updated 4 years ago
ziplab / QLLM
[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…
☆31 · Mar 12, 2024 · Updated 2 years ago
hustzxd / ZeroQ-MP
[CVPR'20] ZeroQ Mixed-Precision implementation (unofficial): A Novel Zero Shot Quantization Framework
☆14 · Dec 16, 2020 · Updated 5 years ago
elliothe / Ternarized_Neural_Network
Optimizing Deep Convolutional Neural Network with Ternarized Weights and High Accuracy
☆16 · Jan 27, 2019 · Updated 7 years ago
jingjing0419 / SAQ-SAM
[AAAI 2026] Implementation of SAQ-SAM: Semantically-Aligned Quantization for Segment Anything Model
☆17 · Nov 27, 2025 · Updated 7 months ago
ZiweiWangTHU / GMPQ
This is the PyTorch implementation for the paper: Generalizable Mixed-Precision Quantization via Attribution Rank Preservation, which is…
☆24 · Aug 17, 2021 · Updated 4 years ago
ModelTC / LPCV2021_Winner_Solution
☆28 · Nov 5, 2021 · Updated 4 years ago
GATECH-EIC / torchshiftadd
An open-source PyTorch library for developing energy-efficient multiplication-less models and applications.
☆14 · Feb 3, 2025 · Updated last year
1adrianb / binary-nas
☆35 · Mar 4, 2020 · Updated 6 years ago
VDIGPKU / NAS-BNN
The official implementation of "NAS-BNN: Neural Architecture Search for Binary Neural Networks"
☆14 · Aug 30, 2024 · Updated last year
GATECH-EIC / ShiftAddNAS
[ICML 2022] ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks
☆15 · May 18, 2022 · Updated 4 years ago
yukang2017 / NAS-quantization
The code for Joint Neural Architecture Search and Quantization
☆14 · Apr 10, 2019 · Updated 7 years ago
hustvl / PD-Quant
[CVPR 2023] PD-Quant: Post-Training Quantization Based on Prediction Difference Metric
☆61 · Mar 23, 2023 · Updated 3 years ago
salesforce / proto-backwards-compat-maven-plugin
A Maven plugin for protecting against backwards incompatible changes to your gRPC .proto files.
☆13 · Jun 2, 2026 · Updated last month
Matesanz / mlops-cookbook
This repository teaches how to train, evaluate and deploy ML models using MLFlow
☆13 · Oct 23, 2024 · Updated last year
xiezheng-cs / DTQ
PyTorch implementation of "Deep Transferring Quantization" (ECCV2020)
☆18 · Jun 22, 2022 · Updated 4 years ago
GATECH-EIC / SuperTickets
[ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning
☆20 · Jul 7, 2022 · Updated 4 years ago
zkkli / HTQ
[PR 2024] HTQ: Exploring the High-Dimensional Trade-Off of Mixed-Precision Quantization
☆12 · Jul 16, 2024 · Updated 2 years ago
jgoeders / dac_sdc_2022
☆17 · Jun 13, 2022 · Updated 4 years ago
cornell-zhang / dnn-quant-ocs
DNN quantization with outlier channel splitting (ICML'19)
☆114 · Mar 21, 2020 · Updated 6 years ago
penhunt / full-quantization-DNN
PyTorch code for full quantization of DNN using BCGD
☆14 · Jul 24, 2019 · Updated 6 years ago
haolibai / APS-channel-search
Revisiting Parameter Sharing for Automatic Neural Channel Number Search, NeurIPS 2020
☆21 · Nov 15, 2020 · Updated 5 years ago
zzd1992 / FlashWindowAttention
Speed up the attention computation of Swin Transformer
☆32 · Jun 14, 2025 · Updated last year
ziplab / QTool
Collection of model quantization algorithms. For any issues, please contact Peng Chen (blueardour@gmail.com)
☆73 · Oct 7, 2021 · Updated 4 years ago
TedLentsch / UNION
Unsupervised 3D Object Detection [NeurIPS 2024]
☆45 · Feb 12, 2026 · Updated 5 months ago
carat-project / carat-android
Carat Android application repository.
☆13 · Mar 6, 2020 · Updated 6 years ago
Adamdad / Samesame
A Tensorflow.keras implementation of Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorizatio…
☆10 · Dec 18, 2019 · Updated 6 years ago
So-Cool / myslideslive
Extract your SlidesLive presentation.
☆15 · Apr 19, 2024 · Updated 2 years ago
jun-fang / PWLQ
Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks
☆68 · Nov 4, 2021 · Updated 4 years ago
IST-DASLab / QUIK
Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference - EMNLP 2024
☆185 · Apr 16, 2024 · Updated 2 years ago
jgoeders / dac_sdc_2021_designs
☆19 · Mar 16, 2022 · Updated 4 years ago
cavedunissin / deeplearning_robot_jetson_nano
Sample code for the Jetson Nano book published by CAVEDU
☆10 · Mar 2, 2026 · Updated 4 months ago
miguelag99 / Efficient-Instance-Prediction
Poster at ITSC 2024
☆20 · Nov 12, 2024 · Updated last year
saiteja-talluri / GSoC-OpenCV
Code written for OpenCV during GSoC 2019 related to Facial Landmark Detection
☆10 · Aug 26, 2019 · Updated 6 years ago
dnth / supercharge-your-pytorch-image-models-blogpost
Supercharge Your PyTorch Image Models: Bag of Tricks to 8x Faster Inference with ONNX Runtime & Optimizations
☆24 · Oct 4, 2024 · Updated last year
microsoft / DGT
Learning Accurate Decision Trees with Bandit Feedback via Quantized Gradient Descent
☆16 · Sep 8, 2022 · Updated 3 years ago
cryer / YOLOv2
Keras implementation of YOLOv2, with reference to Andrew Ng
☆11 · Feb 14, 2018 · Updated 8 years ago