tensorflow / mesh
Mesh TensorFlow: Model Parallelism Made Easier
☆1,625 Updated 2 years ago
Alternatives and similar repositories for mesh
Users who are interested in mesh are comparing it to the libraries listed below.
- PyTorch extensions for high performance and large scale training. ☆3,397 Updated 9 months ago
- Efficient GPU kernels for block-sparse matrix multiplication and convolution ☆1,063 Updated 2 years ago
- PyTorch elastic training ☆728 Updated 3 years ago
- Make huge neural nets fit in memory ☆2,830 Updated 5 years ago
- Enabling PyTorch on XLA Devices (e.g. Google TPU) ☆2,748 Updated last month
- FastFormers - highly efficient transformer models for NLU ☆709 Updated 10 months ago
- A GPipe implementation in PyTorch ☆863 Updated last year
- Collective communications library with various primitives for multi-machine training. ☆1,391 Updated 3 weeks ago
- A performant and modular runtime for TensorFlow ☆753 Updated 5 months ago
- Reference implementations of MLPerf® training benchmarks ☆1,739 Updated last month
- Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackab… ☆1,587 Updated last week
- FB (Facebook) + GEMM (General Matrix-Matrix Multiplication) - https://code.fb.com/ml-applications/fbgemm/ ☆1,525 Updated this week
- Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers" ☆1,608 Updated 5 years ago
- Training and serving large-scale neural networks with auto parallelization. ☆3,183 Updated 2 years ago
- Lingvo ☆2,855 Updated this week
- Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data. ☆1,008 Updated last year
- Dataset, streaming, and file system extensions maintained by TensorFlow SIG-IO ☆735 Updated 2 months ago
- a fast and user-friendly runtime for transformer inference (Bert, Albert, GPT2, Decoders, etc) on CPU and GPU. ☆1,542 Updated 6 months ago
- A Python-level JIT compiler designed to make unmodified PyTorch programs faster. ☆1,072 Updated last year
- ☆1,634 Updated 2 years ago
- JAX-based neural network library ☆3,182 Updated last week
- Library for 8-bit optimizers and quantization routines. ☆780 Updated 3 years ago
- TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance. ☆1,009 Updated last week
- A benchmark framework for Tensorflow ☆1,146 Updated 2 years ago
- Codebase for "SLIDE : In Defense of Smart Algorithms over Hardware Acceleration for Large-Scale Deep Learning Systems" ☆1,106 Updated 4 years ago
- functorch is JAX-like composable function transforms for PyTorch. ☆1,436 Updated 5 months ago
- Task-based datasets, preprocessing, and evaluation for sequence models. ☆594 Updated this week
- Fast Block Sparse Matrices for Pytorch ☆550 Updated 5 years ago
- Bagua Speeds up PyTorch ☆884 Updated last year
- Parallelformers: An Efficient Model Parallelization Toolkit for Deployment ☆791 Updated 2 years ago