LukasHedegaard / pytorch-benchmark
Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption
☆98 · Updated last year
Alternatives and similar repositories for pytorch-benchmark:
Users who are interested in pytorch-benchmark are comparing it to the libraries listed below:
- ☆197 · Updated 3 years ago
- This repository contains the experimental PyTorch native float8 training UX ☆219 · Updated 5 months ago
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆152 · Updated last month
- Code repo for the paper "BiT: Robustly Binarized Multi-distilled Transformer" ☆103 · Updated last year
- A library for researching neural networks compression and acceleration methods. ☆138 · Updated 4 months ago
- A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API. ☆124 · Updated last month
- A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac… ☆224 · Updated this week
- Torch Distributed Experimental ☆115 · Updated 5 months ago
- Demystify RAM Usage in Multi-Process Data Loaders ☆187 · Updated last year
- Example code for profiler workshop ☆33 · Updated 2 years ago
- DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos ☆60 · Updated last year
- A research library for pytorch-based neural network pruning, compression, and more. ☆160 · Updated 2 years ago
- ☆131 · Updated last year
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆182 · Updated this week
- Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models" ☆265 · Updated 4 months ago
- ☆157 · Updated last year
- Simplification of pruned models for accelerated inference | SoftwareX https://doi.org/10.1016/j.softx.2021.100907 ☆35 · Updated last year
- Context Manager to profile the forward and backward times of PyTorch's nn.Module ☆84 · Updated last year
- Implementation of a Transformer, but completely in Triton ☆251 · Updated 2 years ago
- Neural Architecture Search for Neural Network Libraries ☆58 · Updated 11 months ago
- Collection of SOTA efficient computer vision models for embedded applications, with pre-trained weights and training recipes ☆88 · Updated 3 weeks ago
- Seamless analysis of your PyTorch models (RAM usage, FLOPs, MACs, receptive field, etc.) ☆213 · Updated this week
- [NeurIPS 2022] A Fast Post-Training Pruning Framework for Transformers ☆181 · Updated last year
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware. ☆103 · Updated last month
- pytorch-profiler ☆50 · Updated last year
- ViT inference in Triton because, why not? ☆22 · Updated 7 months ago
- Code for the NeurIPS 2022 paper "Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning". ☆109 · Updated last year
- Cataloging released Triton kernels. ☆156 · Updated last week
- A minimal implementation of vLLM. ☆32 · Updated 5 months ago