ultralytics / thop
Profile PyTorch models for FLOPs and parameters, helping to evaluate computational efficiency and memory usage.
☆59Updated this week
Alternatives and similar repositories for thop
Users who are interested in thop are comparing it to the libraries listed below.
- Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption☆107Updated 2 years ago
- Model Compression Toolkit (MCT) is an open source project for neural network model optimization under efficient, constrained hardware. Th…☆419Updated last week
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning".☆129Updated 2 years ago
- OTOv1-v3, NeurIPS, ICLR, TMLR, DNN Training, Compression, Structured Pruning, Erasing Operators, CNN, Diffusion, LLM☆309Updated last year
- Timm model explorer☆42Updated last year
- Recent Advances on Efficient Vision Transformers☆54Updated 2 years ago
- ☆163Updated 2 years ago
- High Performance Int8 GEMM Kernels for SM80 and later GPUs.☆17Updated 7 months ago
- [CVPR 2024] PTQ4SAM: Post-Training Quantization for Segment Anything☆81Updated last year
- A library for calculating the FLOPs in the forward() process based on torch.fx☆129Updated 6 months ago
- ☆205Updated 3 years ago
- A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API.☆131Updated 11 months ago
- When it comes to optimizers, it's always better to be safe than sorry☆375Updated last month
- [NeurIPS 2023] MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory☆73Updated last year
- ☆156Updated 2 years ago
- Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and opti…☆47Updated 2 years ago
- DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos☆59Updated 2 years ago
- ☆74Updated 11 months ago
- List of papers related to Vision Transformers quantization and hardware acceleration in recent AI conferences and journals.☆94Updated last year
- ☆186Updated last year
- This repository contains the training code of ParetoQ, introduced in our work "ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization"☆108Updated 2 weeks ago
- ☆69Updated 3 months ago
- ViT inference in Triton because, why not?☆31Updated last year
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor…☆82Updated this week
- This repository contains the experimental PyTorch native float8 training UX☆223Updated last year
- Get down and dirty with FlashAttention 2.0 in PyTorch: plug and play, no complex CUDA kernels☆108Updated 2 years ago
- [ICML 2023] This project is the official implementation of our accepted ICML 2023 paper BiBench: Benchmarking and Analyzing Network Binar…☆55Updated last year
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores☆329Updated 10 months ago
- A repository dedicated to evaluating the performance of quantized LLaMA3 using various quantization methods.☆195Updated 9 months ago
- ☆34Updated 4 months ago