mesh-umn / TF.AKO
TensorFlow implementation of AKO, a decentralized approach to distributed deep learning.
☆10 Updated 7 years ago
Alternatives and similar repositories for TF.AKO
Users who are interested in TF.AKO are comparing it to the libraries listed below.
- Enhanced networking support for TensorFlow. Maintained by SIG-networking. ☆98 Updated 4 years ago
- Serverless ML Framework ☆106 Updated 3 years ago
- Crossbow: A Multi-GPU Deep Learning System for Training with Small Batch Sizes ☆56 Updated 3 years ago
- A fully adaptive, zero-tuning parameter manager that enables efficient distributed machine learning training ☆21 Updated 2 years ago
- A Distributed Multi-GPU System for Fast Graph Processing ☆65 Updated 7 years ago
- distributed-embeddings is a library for building large embedding based models in Tensorflow 2. ☆46 Updated 2 years ago
- A NUMA-aware Graph-structured Analytics Framework ☆44 Updated 7 years ago
- ☆24 Updated 2 years ago
- BytePS examples (Vision, NLP, GAN, etc) ☆19 Updated 3 years ago
- Edge-centric Graph Processing System using Streaming Partitions ☆82 Updated 7 years ago
- gossip: Efficient Communication Primitives for Multi-GPU Systems ☆59 Updated 3 years ago
- Differentiated Computation and Partitioning on Skewed (Natural or Bipartite) Graphs ☆67 Updated 3 years ago
- Machine Learning System ☆14 Updated 5 years ago
- A C++ Pregel Clone with dynamic load balancing, based on a paper "Mizan: A System for Dynamic Load Balancing in Large-scale Graph Process… ☆27 Updated 11 years ago
- Exoshuffle-CloudSort ☆27 Updated 2 years ago
- ☆15 Updated 2 years ago
- A lightweight parameter server interface ☆86 Updated 2 years ago
- ☆21 Updated 3 years ago
- ComScribe is a tool to identify communication among all GPU-GPU and CPU-GPU pairs in a single-node multi-GPU system. ☆27 Updated 2 years ago
- Tiresias is a GPU cluster manager for distributed deep learning training. ☆164 Updated 5 years ago
- ☆19 Updated 5 years ago
- Chaos: Scale-out Graph Processing from Secondary Storage ☆50 Updated 9 years ago
- Code for Ernest ☆33 Updated 2 years ago
- A high-performance, scalable and efficient ShuffleManager plugin for Apache Spark, utilizing UCX communication layer ☆51 Updated 2 years ago
- Multi-threaded Large-Scale RMAT Graph Generator. ☆132 Updated 2 years ago
- Tensorflow is a computational library using data flow graphs for scalable machine learning, and Tensorflow-RDMA is the implementation ov… ☆58 Updated 3 years ago
- OpenEmbedding is an open source framework for Tensorflow distributed training acceleration. ☆33 Updated 2 years ago
- Elastic ephemeral storage ☆122 Updated 3 years ago
- Fast and Adaptive Distributed Machine Learning for TensorFlow, PyTorch and MindSpore. ☆296 Updated last year
- A general-purpose, distributed graph random walk engine. ☆109 Updated 2 years ago