LukasHedegaard / pytorch-benchmark
Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption
☆103 Updated last year
Alternatives and similar repositories for pytorch-benchmark
Users who are interested in pytorch-benchmark are comparing it to the libraries listed below:
- A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind… ☆158 Updated 3 weeks ago
- Torch Distributed Experimental ☆116 Updated 11 months ago
- Context Manager to profile the forward and backward times of PyTorch's nn.Module ☆83 Updated last year
- A library that contains a rich collection of performant PyTorch model metrics, a simple interface to create new metrics, a toolkit to fac… ☆235 Updated 6 months ago
- This repository contains the experimental PyTorch native float8 training UX ☆224 Updated 11 months ago
- The Triton backend for the PyTorch TorchScript models. ☆156 Updated last week
- Code repo for the paper BiT Robustly Binarized Multi-distilled Transformer ☆109 Updated 2 years ago
- Memory Optimizations for Deep Learning (ICML 2023) ☆64 Updated last year
- ☆157 Updated last year
- Dynamic Neural Architecture Search Toolkit ☆30 Updated 7 months ago
- 🚀 Collection of components for development, training, tuning, and inference of foundation models leveraging PyTorch native components. ☆206 Updated this week
- A simple program to calculate and visualize the FLOPs and Parameters of Pytorch models, with handy CLI and easy-to-use Python API. ☆129 Updated 7 months ago
- Memory-Efficient CUDA kernels for training ConvNets with PyTorch. ☆42 Updated 4 months ago
- FlashFFTConv: Efficient Convolutions for Long Sequences with Tensor Cores ☆323 Updated 6 months ago
- ☆133 Updated last year
- Fast Hadamard transform in CUDA, with a PyTorch interface ☆206 Updated last year
- torch::deploy (multipy for non-torch uses) is a system that lets you get around the GIL problem by running multiple Python interpreters i… ☆180 Updated last week
- ☆33 Updated last month
- VIT inference in triton because, why not? ☆30 Updated last year
- Benchmark Suite for Deep Learning ☆271 Updated 4 months ago
- TorchFix - a linter for PyTorch-using code with autofix support ☆143 Updated 5 months ago
- A Python library transfers PyTorch tensors between CPU and NVMe ☆116 Updated 7 months ago
- Seamless analysis of your PyTorch models (RAM usage, FLOPs, MACs, receptive field, etc.) ☆218 Updated 3 months ago
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆214 Updated 2 years ago
- Get down and dirty with FlashAttention2.0 in pytorch, plug in and play no complex CUDA kernels ☆105 Updated last year
- DeltaCNN End-to-End CNN Inference of Sparse Frame Differences in Videos ☆59 Updated 2 years ago
- ML model training for edge devices ☆165 Updated last year
- ☆206 Updated 3 years ago
- PyTorch extension for emulating FP8 data formats on standard FP32 Xeon/GPU hardware. ☆110 Updated 7 months ago
- MLPerf™ logging library ☆37 Updated 2 months ago