tensorflow / mesh
Mesh TensorFlow: Model Parallelism Made Easier
☆1,591 Updated 11 months ago
Related projects
Alternatives and complementary repositories for mesh
- PyTorch extensions for high performance and large scale training. ☆3,187 Updated 2 months ago
- Enabling PyTorch on XLA Devices (e.g. Google TPU) ☆2,482 Updated this week
- PyTorch elastic training ☆730 Updated 2 years ago
- Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers" ☆1,524 Updated 4 years ago
- Lingvo ☆2,816 Updated this week
- Efficient GPU kernels for block-sparse matrix multiplication and convolution ☆1,025 Updated last year
- Make huge neural nets fit in memory ☆2,726 Updated 4 years ago
- JAX-based neural network library ☆2,894 Updated last week
- Fast Block Sparse Matrices for Pytorch ☆545 Updated 3 years ago
- FastFormers - highly efficient transformer models for NLU ☆701 Updated 9 months ago
- Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems" ☆1,076 Updated 3 years ago
- A GPipe implementation in PyTorch ☆814 Updated 3 months ago
- jiant is an nlp toolkit ☆1,644 Updated last year
- PyTorch original implementation of Cross-lingual Language Model Pretraining. ☆2,889 Updated last year
- Long Range Arena for Benchmarking Efficient Transformers ☆727 Updated 10 months ago
- Useful extra functionality for TensorFlow 2.x maintained by SIG-addons ☆1,692 Updated 2 months ago
- Task-based datasets, preprocessing, and evaluation for sequence models. ☆558 Updated this week
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ☆1,532 Updated 8 months ago
- a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU. ☆1,483 Updated last year
- Fast, general, and tested differentiable structured prediction in PyTorch ☆1,108 Updated 2 years ago
- Reformer, the efficient Transformer, in Pytorch ☆2,116 Updated last year
- Repository for the paper "Optimal Subarchitecture Extraction for BERT" ☆470 Updated 2 years ago
- A Python-level JIT compiler designed to make unmodified PyTorch programs faster. ☆1,010 Updated 6 months ago
- TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance. ☆874 Updated this week
- Library for faster pinned CPU <-> GPU transfer in Pytorch ☆683 Updated 4 years ago
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators ☆2,336 Updated 7 months ago
- Pytorch library for fast transformer implementations ☆1,642 Updated last year
- Reference implementations of MLPerf™ training benchmarks ☆1,614 Updated 3 weeks ago
- functorch is JAX-like composable function transforms for PyTorch. ☆1,394 Updated this week
- FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/ ☆1,202 Updated this week