FrancescoSaverioZuppichini / pytorch-2.0-benchmark

Benchmarking PyTorch 2.0 on different models

☆20

Alternatives and similar repositories for pytorch-2.0-benchmark

Users who are interested in pytorch-2.0-benchmark are comparing it to the libraries listed below.

kshitij12345 / torchnnprofiler
Context Manager to profile the forward and backward times of PyTorch's nn.Module
☆83 Updated last year
graphcore-research / out-of-the-box-fp8-training
Demo of the unit_scaling library, showing how a model can be easily adapted to train in FP8.
☆46 Updated last year
lianakoleva / no-libtorch-compile
☆21 Updated 5 months ago
lernapparat / torchhacks
Hacks for PyTorch
☆19 Updated 2 years ago
pytorch / torchdistx
Torch Distributed Experimental
☆117 Updated last year
HabanaAI / Megatron-DeepSpeed
Intel Gaudi's Megatron DeepSpeed Large Language Models for training
☆13 Updated 7 months ago
srush / triton-autodiff
Experiment of using Tangent to autodiff triton
☆79 Updated last year
srush / tangent
Source-to-Source Debuggable Derivatives in Pure Python
☆15 Updated last year
insoochung / transformer_bcq
BCQ tutorial for transformers
☆17 Updated 2 years ago
groq / mlagility
Machine Learning Agility (MLAgility) benchmark and benchmarking tools
☆39 Updated 2 months ago
lucidrains / autoregressive-linear-attention-cuda
CUDA implementation of autoregressive linear attention, with all the latest research findings
☆44 Updated 2 years ago
davisyoshida / abnormal-floats
Code for the note "NF4 Isn't Information Theoretically Optimal (and that's Good)"
☆19 Updated 2 years ago
bigcode-project / bigcode-inference-benchmark
☆19 Updated 11 months ago
drisspg / transformer_nuggets
A place to store reusable transformer components of my own creation or found on the interwebs
☆59 Updated last week
tanyuqian / redco
NAACL '24 (Best Demo Paper RunnerUp) / MlSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
☆66 Updated 7 months ago
rlin27 / DeBut
Codes of the paper Deformable Butterfly: A Highly Structured and Sparse Linear Transform.
☆12 Updated 3 years ago
stas00 / ml-ways
ML/DL Math and Method notes
☆62 Updated last year
facebookexperimental / protoquant
Prototype routines for GPU quantization written using PyTorch.
☆21 Updated this week
NVIDIA / free-threaded-python
No-GIL Python environment featuring NVIDIA Deep Learning libraries.
☆63 Updated 3 months ago
facebookresearch / MODel_opt
Memory Optimizations for Deep Learning (ICML 2023)
☆102 Updated last year
pytorch-labs / superblock
A block oriented training approach for inference time optimization.
☆33 Updated 11 months ago
pytorch / torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind…
☆158 Updated last month
softmax1 / Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with SoftmaxN.
☆71 Updated last year
GindaChen / FlexFlashAttention3
FlexAttention w/ FlashAttention3 Support
☆27 Updated 10 months ago
graphcore / tutorials
Training material for IPU users: tutorials, feature examples, simple applications
☆86 Updated 2 years ago
facebookresearch / FAMBench
Benchmarks to capture important workloads.
☆31 Updated 6 months ago
Quansight / pytest-pytorch
pytest plugin for a better developer experience when working with the PyTorch test suite
☆44 Updated 3 years ago
jiaweizzhao / ZerO-initialization
☆74 Updated 2 years ago
nod-ai / transformer-benchmarks
benchmarking some transformer deployments
☆26 Updated 2 years ago
lucidrains / rela-transformer
Implementation of a Transformer using ReLA (Rectified Linear Attention) from https://arxiv.org/abs/2104.07012
☆49 Updated 3 years ago