A framework to compare low-bit integer and floating-point formats
☆66Feb 6, 2026Updated 3 weeks ago
Alternatives and similar repositories for INT_vs_FP
Users who are interested in INT_vs_FP are comparing it to the libraries listed below
- ☆11Jan 10, 2025Updated last year
- [NeurIPS 2023] ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer☆30Dec 6, 2023Updated 2 years ago
- Model Quantization Benchmark☆18Sep 30, 2025Updated 4 months ago
- ☆17Jan 22, 2025Updated last year
- The official implementation of the ICML 2023 paper OFQ-ViT☆39Oct 3, 2023Updated 2 years ago
- High Performance FP8 GEMM Kernels for SM89 and later GPUs.☆20Jan 24, 2025Updated last year
- ☆20Mar 25, 2025Updated 11 months ago
- [TMLR] Official PyTorch implementation of paper "Quantization Variation: A New Perspective on Training Transformers with Low-Bit Precisio…☆48Sep 27, 2024Updated last year
- PyTorch implementation of Near-Lossless Post-Training Quantization of Deep Neural Networks via a Piecewise Linear Approximation☆23Feb 17, 2020Updated 6 years ago
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Jul 21, 2023Updated 2 years ago
- FireQ: Fast INT4-FP8 Kernel and RoPE-aware Quantization for LLM Inference Acceleration☆20Jun 27, 2025Updated 8 months ago
- ☆19Nov 6, 2023Updated 2 years ago
- ☆25Oct 31, 2024Updated last year
- Code Repository of Evaluating Quantized Large Language Models☆136Sep 8, 2024Updated last year
- Code for the paper “Four Over Six: More Accurate NVFP4 Quantization with Adaptive Block Scaling”☆128Updated this week
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…☆50Oct 21, 2023Updated 2 years ago
- AFPQ code implementation☆23Nov 6, 2023Updated 2 years ago
- ☆63Jul 21, 2024Updated last year
- The code repository of "MBQ: Modality-Balanced Quantization for Large Vision-Language Models"☆79Mar 17, 2025Updated 11 months ago
- PyTorch implementation of "Oscillation-Reduced MXFP4 Training for Vision Transformers" on DeiT Model Pre-training☆36Jun 20, 2025Updated 8 months ago
- ☆25Dec 11, 2021Updated 4 years ago
- LLM Inference with Microscaling Format☆34Nov 12, 2024Updated last year
- [TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection"☆37Aug 20, 2024Updated last year
- ☆32Mar 31, 2025Updated 10 months ago
- [ICML 2024] Sparse Model Inversion: Efficient Inversion of Vision Transformers with Less Hallucination☆13Apr 29, 2025Updated 9 months ago
- FFNet: MetaMixer-based Efficient Convolutional Mixer Design☆31Mar 11, 2025Updated 11 months ago
- ☆79Jul 21, 2022Updated 3 years ago
- Post-training sparsity-aware quantization☆34Feb 26, 2023Updated 3 years ago
- A collection of research papers on low-precision training methods☆64May 10, 2025Updated 9 months ago
- [ICLR 2026] ParallelBench: Understanding the Tradeoffs of Parallel Decoding in Diffusion LLMs☆30Updated this week
- Implementing activation functions from scratch in TensorFlow.☆36Feb 13, 2022Updated 4 years ago
- NART = NART is not A RunTime, a deep learning inference framework.☆37Mar 2, 2023Updated 2 years ago
- ☆37Jun 1, 2022Updated 3 years ago
- A Simple, Explainable Vision Language Model for detecting manufacturing defects in products☆14Sep 23, 2025Updated 5 months ago
- ☆10Apr 24, 2024Updated last year
- A simple implementation of the LEGv8 instruction set using Verilog HDL.☆11May 8, 2022Updated 3 years ago
- Sample repository for my awesome Youtube viewers.☆10Jun 3, 2020Updated 5 years ago
- PyTorch Quantization Framework For OCP MX Datatypes.☆16May 30, 2025Updated 8 months ago
- QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning☆166Nov 11, 2025Updated 3 months ago