aredden / torch-bnb-fp4
Faster PyTorch bitsandbytes 4-bit fp4 nn.Linear ops
☆30Mar 16, 2024Updated last year
Alternatives and similar repositories for torch-bnb-fp4
Users who are interested in torch-bnb-fp4 are comparing it to the libraries listed below.
- IntLLaMA: A fast and light quantization solution for LLaMA☆18Jul 21, 2023Updated 2 years ago
- Code Repository for the NeurIPS 2022 paper: "Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights".☆17Jul 10, 2024Updated last year
- ACL 2023☆39Jun 6, 2023Updated 2 years ago
- This is a repository of Binary General Matrix Multiply (BGEMM) by customized CUDA kernel. Thank FP6-LLM for the wheels!☆18Aug 30, 2024Updated last year
- TACOTRON: TOWARDS END-TO-END SPEECH SYNTHESIS☆16Sep 26, 2017Updated 8 years ago
- PyTorch implementation of "Deep Transferring Quantization" (ECCV2020)☆18Jun 22, 2022Updated 3 years ago
- [EMNLP 2024] Official implementation of "Hierarchical Deconstruction of LLM Reasoning: A Graph-Based Framework for Analyzing Knowledge Ut…☆23Dec 4, 2024Updated last year
- The official implementation of the paper "Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation"☆21Dec 10, 2024Updated last year
- The official PyTorch implementation of the NeurIPS2022 (spotlight) paper, Outlier Suppression: Pushing the Limit of Low-bit Transformer L…☆49Oct 5, 2022Updated 3 years ago
- ☆19Nov 6, 2023Updated 2 years ago
- ☆21Feb 11, 2022Updated 4 years ago
- torch_quantizer is an out-of-the-box quantization tool for PyTorch models on the CUDA backend, specially optimized for diffusion models.☆23Mar 29, 2024Updated last year
- The official implementation of PTQD: Accurate Post-Training Quantization for Diffusion Models☆103Mar 12, 2024Updated last year
- Post-Training Quantization for Vision transformers.☆238Jul 19, 2022Updated 3 years ago
- [TMLR] Official PyTorch implementation of paper "Efficient Quantization-aware Training with Adaptive Coreset Selection"☆37Aug 20, 2024Updated last year
- Codebase for fine-tuning Llama2 70B to generate math test questions and answers.☆11Aug 30, 2024Updated last year
- PyTorch half precision gemm lib w/ fused optional bias + optional relu/gelu☆78Dec 3, 2024Updated last year
- High-speed GEMV kernels, at most 2.7x speedup compared to pytorch baseline.☆128Jul 13, 2024Updated last year
- Implementation of ICLR 2018 paper "Loss-aware Weight Quantization of Deep Networks"☆27Oct 24, 2019Updated 6 years ago
- Applied AI experiments and examples for PyTorch☆315Aug 22, 2025Updated 5 months ago
- [ICLR'25] ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation☆149Mar 21, 2025Updated 10 months ago
- Concurrency library☆16Oct 13, 2024Updated last year
- Repo for paper "CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models".☆12Oct 14, 2024Updated last year
- ☆11Dec 23, 2024Updated last year
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023)☆141Apr 1, 2023Updated 2 years ago
- Code for Neurips24 paper: QuaRot, an end-to-end 4-bit inference of large language models.☆482Nov 26, 2024Updated last year
- This repository contains the experimental PyTorch native float8 training UX☆226Aug 1, 2024Updated last year
- [ICML 2025] SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity☆71Jul 5, 2025Updated 7 months ago
- [ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Mod…☆39Mar 11, 2024Updated last year
- extensible collectives library in triton☆95Mar 31, 2025Updated 10 months ago
- ☆235Jun 11, 2024Updated last year
- This is the official pytorch implementation for the paper: Towards Accurate Post-training Quantization for Diffusion Models.(CVPR24 Poste…☆38Jun 4, 2024Updated last year
- Material parsers and other tools, scripts Initially developed for Grobid Superconductor☆13Feb 21, 2025Updated 11 months ago
- Develop C++/CUDA extensions with PyTorch like Python scripts☆10Jan 7, 2026Updated last month
- My configuration and setup when installing a new machine.☆11Jul 30, 2023Updated 2 years ago
- Python Inference Script(PyIS)☆19Aug 30, 2022Updated 3 years ago
- [CVPR 2025 Highlight] FIMA-Q: Post-Training Quantization for Vision Transformers by Fisher Information Matrix Approximation☆25Jun 16, 2025Updated 7 months ago
- [AAAI2024] An official PyTorch implementation of the paper: Vision-Language Pre-training with Object Contrastive Learning for 3D Scene Underst…☆13Dec 8, 2024Updated last year
- ☆95Nov 16, 2025Updated 2 months ago