tairenpiao / XNOR-popcount-GEMM-PyTorch-CPU-CUDA
A PyTorch implementation of a real XNOR-popcount (1-bit op) GEMM Linear extension, supporting both CPU and CUDA
☆22Updated 2 years ago
Alternatives and similar repositories for XNOR-popcount-GEMM-PyTorch-CPU-CUDA
Users who are interested in XNOR-popcount-GEMM-PyTorch-CPU-CUDA are comparing it to the libraries listed below
- Implementation of "NITI: Training Integer Neural Networks Using Integer-only Arithmetic" on arXiv☆84Updated 2 years ago
- Post-training sparsity-aware quantization☆34Updated 2 years ago
- ☆76Updated 2 years ago
- Official implementation of "Searching for Winograd-aware Quantized Networks" (MLSys'20)☆27Updated last year
- XNOR-Net, with binary GEMM and binary conv2d kernels, supporting both CPU and GPU.☆85Updated 6 years ago
- ☆19Updated 3 years ago
- ☆17Updated 2 years ago
- Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming☆36Updated 2 years ago
- A collection of research papers on efficient training of DNNs☆70Updated 2 years ago
- ☆20Updated 3 years ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar…☆56Updated last year
- ☆43Updated last year
- DeiT implementation for Q-ViT☆25Updated 2 months ago
- ☆149Updated 2 years ago
- The official, proof-of-concept C++ implementation of PocketNN.☆34Updated last year
- BNNs (XNOR, BNN and DoReFa) implementation for PyTorch 1.0+☆41Updated 2 years ago
- code for the paper "A Statistical Framework for Low-bitwidth Training of Deep Neural Networks"☆28Updated 4 years ago
- ☆10Updated 3 years ago
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware.☆110Updated 6 months ago
- Official implementation of EMNLP'23 paper "Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?"☆22Updated last year
- Reproducing Quantization paper PACT☆64Updated 2 years ago
- Improving Post Training Neural Quantization: Layer-wise Calibration and Integer Programming☆98Updated 4 years ago
- This repository contains the PyTorch scripts to train mixed-precision networks for microcontroller deployment, based on the memory contr…☆50Updated last year
- ☆35Updated 4 years ago
- [FPGA'21] CoDeNet is an efficient object detection model on PyTorch, with SOTA performance on VOC and COCO based on CenterNet and Co-Desi…☆26Updated 2 years ago
- Any-Precision Deep Neural Networks (AAAI 2021)☆60Updated 5 years ago
- Torch2Chip (MLSys, 2024)☆53Updated 2 months ago
- CMix-NN: Mixed Low-Precision CNN Library for Memory-Constrained Edge Devices☆43Updated 5 years ago
- [CVPR 2024] Official implementation for A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network☆25Updated 6 months ago
- ☆47Updated 3 years ago