Research and development for optimizing transformers
☆131 · Feb 16, 2021 · Updated 5 years ago
Alternatives and similar repositories for substation
Users that are interested in substation are comparing it to the libraries listed below.
- DaCe - Data Centric Parallel Programming ☆582 · Mar 30, 2026 · Updated last week
- Dynamic Tensor Rematerialization prototype (modified PyTorch) and simulator. Paper: https://arxiv.org/abs/2006.09616 ☆133 · Jul 6, 2023 · Updated 2 years ago
- PSTensor provides a way to hack the memory management of tensors in TensorFlow and PyTorch by defining your own C++ Tensor Class. ☆10 · Feb 10, 2022 · Updated 4 years ago
- An external memory allocator example for PyTorch. ☆16 · Aug 10, 2025 · Updated 8 months ago
- ☆78 · May 4, 2021 · Updated 4 years ago
- This repository has moved, please visit https://github.com/ai2cm/pace for the latest development of fv3core. ☆13 · Dec 21, 2022 · Updated 3 years ago
- Distributed Communication-Optimal LU-factorization Algorithm ☆12 · Aug 1, 2021 · Updated 4 years ago
- Source code repo for paper "TLDR: Token Loss Dynamic Reweighting for Reducing Repetitive Utterance Generation" ☆10 · Aug 11, 2023 · Updated 2 years ago
- PipeTransformer: Automated Elastic Pipelining for Distributed Training of Large-scale Models. ICML 2021 ☆56 · Jul 21, 2021 · Updated 4 years ago
- FTPipe and related pipeline model parallelism research. ☆44 · May 16, 2023 · Updated 2 years ago
- ☆13 · Mar 27, 2020 · Updated 6 years ago
- ☆19 · Jun 3, 2023 · Updated 2 years ago
- A library to analyze PyTorch traces. ☆495 · Apr 1, 2026 · Updated last week
- Analyze network performance in distributed training ☆20 · Oct 20, 2020 · Updated 5 years ago
- ☆251 · Jul 25, 2024 · Updated last year
- The code for our paper "Neural Architecture Search as Program Transformation Exploration" ☆16 · Apr 28, 2021 · Updated 4 years ago
- ☆13 · Jan 23, 2021 · Updated 5 years ago
- This is the repository that holds the artifacts of ASPLOS'25 -- M5: Mastering Page Migration and Memory Management for CXL-based Tiered … ☆17 · Apr 1, 2025 · Updated last year
- Rich editor for SDFGs with included profiling and debugging, static analysis, and interactive optimization. ☆22 · Dec 9, 2025 · Updated 4 months ago
- Official repository for DistFlashAttn: Distributed Memory-efficient Attention for Long-context LLMs Training ☆222 · Aug 19, 2024 · Updated last year
- A Chainer extension for K-FAC ☆20 · Jun 16, 2019 · Updated 6 years ago
- BytePS examples (Vision, NLP, GAN, etc) ☆19 · Nov 24, 2022 · Updated 3 years ago
- PipeSwitch: Fast Pipelined Context Switching for Deep Learning Applications ☆127 · May 9, 2022 · Updated 3 years ago
- Standalone mini-app of the ECMWF cloud microphysics parameterization ☆11 · Mar 30, 2026 · Updated last week
- Sequence-level 1F1B schedule for LLMs. ☆38 · Aug 26, 2025 · Updated 7 months ago
- A GPU performance profiling tool for PyTorch models ☆511 · Jul 13, 2021 · Updated 4 years ago
- MONeT framework for reducing memory consumption of DNN training ☆174 · May 4, 2021 · Updated 4 years ago
- A GPipe implementation in PyTorch ☆862 · Jul 25, 2024 · Updated last year
- Matrix multiplication on GPUs for matrices stored on a CPU. Similar to cublasXt, but ported to both NVIDIA and AMD GPUs. ☆32 · Apr 2, 2025 · Updated last year
- A tensor-aware point-to-point communication primitive for machine learning ☆286 · Dec 17, 2025 · Updated 3 months ago
- Large scale graph learning on a single machine. ☆167 · Feb 25, 2025 · Updated last year
- a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU. ☆1,545 · Jul 18, 2025 · Updated 8 months ago
- PyTorch extensions for high performance and large scale training. ☆3,404 · Apr 26, 2025 · Updated 11 months ago
- ☆13 · Nov 25, 2022 · Updated 3 years ago
- paper and code for New Directions in Cloud Programming, CIDR 2021 ☆11 · Feb 17, 2021 · Updated 5 years ago
- A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description. ☆999 · Sep 19, 2024 · Updated last year
- Benchmark scripts for TVM ☆74 · Mar 15, 2022 · Updated 4 years ago
- optimized BERT transformer inference on NVIDIA GPU. https://arxiv.org/abs/2210.03052 ☆479 · Mar 15, 2024 · Updated 2 years ago
- ☆10 · Apr 29, 2023 · Updated 2 years ago