tensorflow/mesh
Mesh TensorFlow: Model Parallelism Made Easier (☆1,591, updated last year)
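The repository's focus is model parallelism: splitting a single model's tensors across devices rather than replicating the whole model. As a rough illustration only (this is not Mesh TensorFlow's actual API; the "devices" here are simulated with plain array slices), the core idea of sharding one weight matrix column-wise can be sketched in NumPy:

```python
# Hypothetical NumPy sketch of column-wise model parallelism:
# the weight matrix w is split across n_devices "devices"; each device
# computes its shard of the matmul, and the shards are concatenated.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))    # activations: batch x d_in
w = rng.standard_normal((16, 32))   # weights:     d_in x d_out

n_devices = 4
shards = np.split(w, n_devices, axis=1)   # each device holds one slice of w
partials = [x @ s for s in shards]        # per-device partial matmuls
y = np.concatenate(partials, axis=1)      # gather the output shards

# The sharded computation matches the unsharded one.
assert np.allclose(y, x @ w)
```

Mesh TensorFlow builds on this kind of decomposition by letting users name tensor dimensions and map them onto a multi-dimensional mesh of processors, so the splitting is declared rather than hand-coded.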
Related projects
Alternatives and complementary repositories for mesh:
- PyTorch extensions for high performance and large scale training (☆3,195, updated last week)
- Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers" (☆1,524, updated 4 years ago)
- Efficient GPU kernels for block-sparse matrix multiplication and convolution (☆1,027, updated last year)
- PyTorch elastic training (☆730, updated 2 years ago)
- Make huge neural nets fit in memory (☆2,730, updated 4 years ago)
- JAX-based neural network library (☆2,909, updated last week)
- Enabling PyTorch on XLA devices, e.g. Google TPU (☆2,489, updated this week)
- Useful extra functionality for TensorFlow 2.x maintained by SIG-addons (☆1,694, updated 2 months ago)
- Longformer: The Long-Document Transformer (☆2,047, updated last year)
- Making text a first-class citizen in TensorFlow (☆1,233, updated this week)
- FastFormers: highly efficient transformer models for NLU (☆701, updated 10 months ago)
- jiant, an NLP toolkit (☆1,647, updated last year)
- ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators (☆2,339, updated 7 months ago)
- Model analysis tools for TensorFlow (☆1,256, updated 2 weeks ago)
- Original PyTorch implementation of Cross-lingual Language Model Pretraining (☆2,892, updated last year)
- A GPipe implementation in PyTorch (☆818, updated 3 months ago)
- The implementation of DeBERTa (☆1,991, updated last year)
- Ongoing research training transformer language models at scale, including BERT & GPT-2 (☆1,338, updated 8 months ago)
- Code for the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer" (☆6,178, updated 2 months ago)
- Collective communications library with various primitives for multi-machine training (☆1,227, updated this week)
- A performant and modular runtime for TensorFlow (☆756, updated last month)
- Task-based datasets, preprocessing, and evaluation for sequence models (☆561, updated this week)
- Long Range Arena for benchmarking efficient Transformers (☆729, updated 11 months ago)
- A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU (☆1,484, updated last year)
- Transformers for Longer Sequences (☆572, updated 2 years ago)
- Open clone of OpenAI's unreleased WebText dataset scraper; this version uses pushshift.io files instead of the API for speed (☆714, updated last year)
- Reformer, the efficient Transformer, in PyTorch (☆2,121, updated last year)
- Fast block-sparse matrices for PyTorch (☆545, updated 3 years ago)
- Lingvo (☆2,816, updated this week)