ultralytics / thop
Profile PyTorch models for FLOPs and parameters, helping to evaluate computational efficiency and memory usage.
☆37 · Updated 3 weeks ago
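To make "FLOPs and parameters" concrete, here is a minimal hand-calculation sketch for two common layer types. It does not use thop and is not how thop is implemented. The helper names `linear_cost` and `conv2d_cost` and the toy layer sizes are hypothetical, chosen only for illustration. It counts multiply-accumulate operations (MACs). Some tools report FLOPs as roughly 2 × MACs, so the convention should be checked before comparing numbers across profilers.

```python
# Hand-computed parameter and MAC counts for two common layer types.
# Illustrative only: these helpers are hypothetical and are NOT thop's API.

def linear_cost(in_features, out_features, bias=True):
    """Return (params, macs) for a fully connected layer, per input sample."""
    weights = in_features * out_features
    params = weights + (out_features if bias else 0)
    macs = weights  # one multiply-accumulate per weight
    return params, macs

def conv2d_cost(in_ch, out_ch, kernel, out_h, out_w, bias=True):
    """Return (params, macs) for a square-kernel 2D convolution, per input sample."""
    weights = in_ch * out_ch * kernel * kernel
    params = weights + (out_ch if bias else 0)
    macs = weights * out_h * out_w  # every weight is applied at each output position
    return params, macs

# Toy model (assumed sizes): a 3x3 conv producing a 16x32x32 map, then a classifier.
layers = [
    conv2d_cost(in_ch=3, out_ch=16, kernel=3, out_h=32, out_w=32),
    linear_cost(in_features=16 * 32 * 32, out_features=10),
]
total_params = sum(p for p, _ in layers)
total_macs = sum(m for _, m in layers)
print(f"params: {total_params}, MACs: {total_macs}")
```

A profiler automates this bookkeeping by walking the model's layers during a forward pass, so the totals reflect the actual input shape rather than hand-entered sizes.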
Alternatives and similar repositories for thop:
Users who are interested in thop also compare it to the libraries listed below:
- Implementation of IceFormer: Accelerated Inference with Long-Sequence Transformers on CPUs (ICLR 2024). ☆24 · Updated 10 months ago
- A library for calculating the FLOPs in the forward() process based on torch.fx. ☆104 · Updated 3 weeks ago
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores. ☆314 · Updated 3 months ago
- Recent Advances on Efficient Vision Transformers. ☆50 · Updated 2 years ago
- ☆143 · Updated 2 years ago
- ☆62 · Updated 5 months ago
- The official repository of Quamba. ☆39 · Updated 2 weeks ago
- Timm model explorer. ☆39 · Updated last year
- This library empowers users to seamlessly port pretrained models and checkpoints on the HuggingFace (HF) hub (developed using HF transfor… ☆62 · Updated this week
- A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API. ☆128 · Updated 5 months ago
- ☆48 · Updated last year
- ☆146 · Updated last year
- ☆31 · Updated 10 months ago
- Fast Hadamard transform in CUDA, with a PyTorch interface. ☆172 · Updated 10 months ago
- ☆203 · Updated 3 years ago
- Dynamic Neural Architecture Search Toolkit. ☆30 · Updated 4 months ago
- Just some miscellaneous utility functions / decorators / modules related to PyTorch and Accelerate to help speed up implementation of new… ☆120 · Updated 8 months ago
- Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption. ☆101 · Updated last year
- Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Model… ☆60 · Updated last year
- Patch convolution to avoid large GPU memory usage of Conv2D. ☆86 · Updated 3 months ago
- ☆179 · Updated 6 months ago
- This repository contains the experimental PyTorch native float8 training UX. ☆223 · Updated 8 months ago
- Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance. ☆111 · Updated this week
- [ICCV 2023] Q-Diffusion: Quantizing Diffusion Models. ☆348 · Updated last year
- ☆164 · Updated last year
- FlashRNN: Fast RNN Kernels with I/O Awareness. ☆82 · Updated 3 weeks ago
- Collection of kernels written in the Triton language. ☆119 · Updated 2 weeks ago
- [ECCV 2024] Isomorphic Pruning for Vision Models. ☆66 · Updated 9 months ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆118 · Updated last year
- ☆43 · Updated 5 months ago