ultralytics / thop
Profile PyTorch models for FLOPs and parameters, helping to evaluate computational efficiency and memory usage.
☆57 · Updated last month
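The bookkeeping thop automates, counting parameters and multiply-accumulate operations (MACs), can be sketched in plain Python for a single fully connected layer. This is an illustrative reimplementation, not thop's API (`linear_counts` is a hypothetical name); thop itself is typically invoked as `profile(model, inputs=(x,))` on a real `torch.nn.Module`.

```python
def linear_counts(in_features: int, out_features: int, batch: int = 1):
    """Count parameters and MACs for a fully connected (Linear) layer.

    Illustrative sketch of what a FLOPs profiler tallies per layer;
    `linear_counts` is a hypothetical helper, not part of thop.
    """
    params = in_features * out_features + out_features  # weight matrix + bias vector
    macs = batch * in_features * out_features           # one MAC per weight per sample
    return params, macs

# A 64 -> 10 linear layer on a single sample:
params, macs = linear_counts(64, 10, batch=1)  # 650 params, 640 MACs
```

Note that FLOPs conventions differ between tools: some report MACs directly, others count each MAC as two FLOPs (one multiply, one add), so numbers from different profilers may differ by a factor of two.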
Alternatives and similar repositories for thop
Users interested in thop are comparing it to the libraries listed below.
- Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory and energy consumption ☆107 · Updated 2 years ago
- This repository contains the training code of ParetoQ, introduced in our work "ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization" ☆106 · Updated 4 months ago
- Timm model explorer ☆42 · Updated last year
- [CVPR 2024] PTQ4SAM: Post-Training Quantization for Segment Anything ☆80 · Updated last year
- QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX ☆161 · Updated this week
- Recent Advances on Efficient Vision Transformers ☆53 · Updated 2 years ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning" ☆128 · Updated 2 years ago
- High-performance Int8 GEMM kernels for SM80 and later GPUs ☆16 · Updated 6 months ago
- A repository dedicated to evaluating the performance of quantized LLaMA3 using various quantization methods ☆195 · Updated 8 months ago
- List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals ☆94 · Updated last year
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆329 · Updated 9 months ago
- [NeurIPS 2023] MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory ☆72 · Updated last year
- When it comes to optimizers, it's always better to be safe than sorry ☆373 · Updated last week
- Fast Hadamard transform in CUDA, with a PyTorch interface ☆245 · Updated last week
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware ☆111 · Updated 10 months ago
- Model compression for ONNX ☆97 · Updated 10 months ago
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti… ☆47 · Updated last year
- OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM ☆311 · Updated last year
- Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. Th… ☆418 · Updated this week
- Interactively inspect module inputs, outputs, parameters, and gradients ☆352 · Updated 4 months ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar… ☆56 · Updated last year
- Implementation of Post-training Quantization on Diffusion Models (CVPR 2023) ☆139 · Updated 2 years ago
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… ☆66 · Updated last year
- [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models ☆356 · Updated last year
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor… ☆80 · Updated last week
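Many of the repositories above (PTQ4SAM, OWQ, Q-Diffusion, Outlier Suppression+, the LLaMA3 quantization benchmark) revolve around post-training quantization. As a minimal, hedged sketch of the round-to-nearest baseline those methods refine, symmetric per-tensor int8 quantization looks roughly like this (function names are illustrative, not taken from any of the listed libraries):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats to [-127, 127].

    Illustrative sketch only; assumes a nonzero weight tensor. Real PTQ
    methods refine this baseline with outlier handling, per-channel
    scales, and calibration.
    """
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [qi * scale for qi in q]

q, scale = quantize_int8([0.5, -1.0, 0.25])
w_hat = dequantize(q, scale)  # approximate reconstruction of the inputs
```

The reconstruction error of this scheme is bounded by half the scale per element, which is exactly the slack that outlier-aware methods like OWQ and Outlier Suppression+ attack: a single large-magnitude outlier inflates the scale and degrades every other weight.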