vra / flopth
A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API.
☆131 · Updated last year
Alternatives and similar repositories for flopth
Users who are interested in flopth are comparing it to the repositories listed below.
- Implementation of fused cosine similarity attention in the same style as Flash Attention ☆219 · Updated 2 years ago
- TF/Keras code for DiffStride, a pooling layer with learnable strides. ☆124 · Updated 3 years ago
- ☆52 · Updated 2 years ago
- Implementation of "compositional attention" from MILA, a multi-head attention variant that is reframed as a two-step attention process wi… ☆51 · Updated 3 years ago
- Seamless analysis of your PyTorch models (RAM usage, FLOPs, MACs, receptive field, etc.) ☆223 · Updated 9 months ago
- Code release for "Dropout Reduces Underfitting" ☆317 · Updated 2 years ago
- Implementation of the Adan (ADAptive Nesterov momentum algorithm) Optimizer in Pytorch ☆252 · Updated 3 years ago
- Code repository of the paper "Modelling Long Range Dependencies in ND: From Task-Specific to a General Purpose CNN" https://arxiv.org/abs… ☆183 · Updated 7 months ago
- Implementation of Linformer for Pytorch ☆303 · Updated 2 years ago
- Implementation of Mega, the Single-head Attention with Multi-headed EMA architecture that currently holds SOTA on Long Range Arena ☆206 · Updated 2 years ago
- Transformers w/o Attention, based fully on MLPs ☆97 · Updated last year
- Deep Learning project template for PyTorch (multi-gpu training is supported) ☆138 · Updated 2 years ago
- ☆75 · Updated 3 years ago
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch ☆124 · Updated 4 months ago
- Examples for the WebDataset PyTorch Dataset Library ☆51 · Updated 4 years ago
- [ICLR 2022] "Deep AutoAugment" by Yu Zheng, Zhi Zhang, Shen Yan, Mi Zhang ☆65 · Updated last year
- Context Manager to profile the forward and backward times of PyTorch's nn.Module ☆83 · Updated 2 years ago
- Recent Advances in MLP-based Models (MLP is all you need!) ☆117 · Updated 3 years ago
- Implementation of Fast Transformer in Pytorch ☆177 · Updated 4 years ago
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ☆126 · Updated last year
- Pytorch cyclic cosine decay learning rate scheduler ☆49 · Updated 4 years ago
- Memory Efficient Attention (O(sqrt(n)) for Jax and PyTorch ☆184 · Updated 3 years ago
- Implementation of Long-Short Transformer, combining local and global inductive biases for attention over long sequences, in Pytorch ☆120 · Updated 4 years ago
- Official PyTorch Implementation of Long-Short Transformer (NeurIPS 2021). ☆228 · Updated 3 years ago
- A simple minimal implementation of Reversible Vision Transformers ☆126 · Updated last year
- Adaptive Gradient Clipping ☆153 · Updated 3 years ago
- Minimal implementation of adaptive gradient clipping (https://arxiv.org/abs/2102.06171) in TensorFlow 2. ☆85 · Updated 4 years ago
- Estimate/count FLOPS for a given neural network using pytorch ☆305 · Updated 3 years ago
- Implementation of Online Label Smoothing in PyTorch ☆95 · Updated 3 years ago
- ☆133 · Updated 2 years ago