pashu123 / Transformers
PyTorch Implementation of Transformers Explained with Comments
☆15 Updated 4 years ago
Related projects:
- RevBiFPN: The Fully Reversible Bidirectional Feature Pyramid Network ☆12 Updated last year
- Artifact for IPDPS'21: DSXplore: Optimizing Convolutional Neural Networks via Sliding-Channel Convolutions. ☆13 Updated 3 years ago
- An implementation of run-length encoding for PyTorch tensors using CUDA ☆10 Updated 3 years ago
- A highly modular PyTorch framework with a focus on Neural Architecture Search (NAS). ☆22 Updated 2 years ago
- Official PyTorch implementation of LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification ☆44 Updated 2 years ago
- A "gym" style toolkit for building lightweight NAS systems. ☆13 Updated 2 years ago
- Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory and energy consumption ☆87 Updated last year
- Udacity CS344 Introduction to Parallel Programming (https://classroom.udacity.com/courses/cs344), with assignments/materials updated to … ☆45 Updated 3 years ago
- Fairring (FAIR + Herring) is a plug-in for PyTorch that provides a process group for distributed training that outperforms NCCL at large … ☆61 Updated 2 years ago
- [ECCV 2022] SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning ☆19 Updated 2 years ago
- Artifacts for SOSP'19 paper Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions ☆21 Updated 2 years ago
- My tests and experiments with some popular DL frameworks. ☆11 Updated 2 weeks ago
- Context Manager to profile the forward and backward times of PyTorch's nn.Module ☆83 Updated 11 months ago
- ☆48 Updated last year
- Official implementation of "UNAS: Differentiable Architecture Search Meets Reinforcement Learning", CVPR 2020 Oral ☆59 Updated 11 months ago
- Train neural networks with joint quantization and pruning on both weights and activations using any PyTorch modules ☆40 Updated 2 years ago
- A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training ☆20 Updated 2 years ago
- "Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices", official implementation ☆28 Updated last year
- ☆30 Updated 3 months ago
- You Only Search Once: On Lightweight Differentiable Architecture Search for Resource-Constrained Embedded Platforms ☆10 Updated last year
- A collection of metrics to profile a single deep learning model or compare two different deep learning models ☆24 Updated 10 months ago
- Memory Optimizations for Deep Learning (ICML 2023) ☆58 Updated 6 months ago
- DeltaCNN: End-to-End CNN Inference of Sparse Frame Differences in Videos ☆60 Updated last year
- A high-performance system for customized-precision distributed deep learning ☆12 Updated 3 years ago
- Arch-Net: Model Distillation for Architecture Agnostic Model Deployment ☆22 Updated 2 years ago
- All about acceleration and compression of Deep Neural Networks ☆33 Updated 4 years ago
- MONeT framework for reducing memory consumption of DNN training ☆172 Updated 3 years ago
- An open source implementation of CLIP. ☆32 Updated last year
- An external memory allocator example for PyTorch. ☆13 Updated 2 years ago
- Implementation of Kronecker Attention in PyTorch ☆17 Updated 4 years ago