ultralytics / thop
Profile PyTorch models for FLOPs and parameters, helping to evaluate computational efficiency and memory usage.
☆20 · Updated this week
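For a quick sense of what thop does, here is a minimal usage sketch based on its documented `profile` API; the ResNet-18 model and the input shape are only illustrative choices, not part of the project itself:

```python
import torch
from torchvision.models import resnet18
from thop import profile, clever_format

# Any torch.nn.Module plus a representative input tensor.
model = resnet18()
dummy_input = torch.randn(1, 3, 224, 224)

# profile() traces one forward pass and returns MAC and parameter counts.
macs, params = profile(model, inputs=(dummy_input,))

# clever_format() renders the raw counts in human-readable units (e.g. GMac, M params).
macs, params = clever_format([macs, params], "%.3f")
print(f"MACs: {macs}, Params: {params}")
```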
Related projects
Alternatives and complementary repositories for thop
- NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications. ☆179 · Updated 5 months ago
- Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption. ☆92 · Updated last year
- QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX. ☆127 · Updated 3 weeks ago
- Count the number of parameters / MACs / FLOPs for ONNX models (a minimal parameter-counting sketch appears after this list). ☆89 · Updated 3 weeks ago
- QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference. ☆112 · Updated 8 months ago
- Inference of quantization-aware trained networks using TensorRT. ☆79 · Updated last year
- Implementation of YOLOv9 QAT optimized for deployment on TensorRT platforms. ☆84 · Updated 2 weeks ago
- The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented in DALI's Python API. ☆125 · Updated 2 weeks ago
- Sample app code for deploying TAO Toolkit trained models to Triton. ☆84 · Updated 2 months ago
- Model Compression Toolkit (MCT) is an open-source project for neural network model optimization under efficient, constrained hardware… ☆330 · Updated this week
- A tutorial for getting started with the Deep Learning Accelerator (DLA) on NVIDIA Jetson. ☆288 · Updated 2 years ago
- A toolkit to help optimize large ONNX models. ☆149 · Updated 6 months ago
- A repository dedicated to evaluating the performance of quantized LLaMA3 using various quantization methods. ☆166 · Updated 3 months ago
- A code generator from ONNX to PyTorch code. ☆133 · Updated 2 years ago
- A toolkit to help optimize ONNX models. ☆81 · Updated this week
- Model compression for ONNX. ☆75 · Updated this week
- This repository contains integer operators on GPUs for PyTorch. ☆184 · Updated last year
- Cataloging released Triton kernels. ☆138 · Updated 2 months ago
- This repository provides a YOLOv5 GPU optimization sample. ☆100 · Updated last year
- Applied AI experiments and examples for PyTorch. ☆168 · Updated 3 weeks ago
- Standalone Flash Attention v2 kernel without a libtorch dependency. ☆98 · Updated 2 months ago
- Simple and fast low-bit matmul kernels in CUDA / Triton. ☆147 · Updated this week
- LLaMA INT4 CUDA inference with AWQ. ☆48 · Updated 4 months ago
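As referenced in the ONNX parameter/MACs counter entry above: counting parameters for an ONNX model can be done with nothing more than the official `onnx` package, by summing the sizes of the stored weight initializers. This is a minimal sketch of the idea under that assumption, not that project's API, and the model path is hypothetical:

```python
import numpy as np
import onnx

# Load a serialized ONNX model (the path is a placeholder).
model = onnx.load("model.onnx")

# Every trained weight lives in the graph as an initializer tensor;
# the parameter count is the product of each tensor's dims, summed.
total_params = sum(int(np.prod(list(init.dims))) for init in model.graph.initializer)
print(f"Total parameters: {total_params:,}")
```

Counting MACs/FLOPs is more involved, since it requires per-operator shape inference, which is what the dedicated tools in this list handle.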