eric-haibin-lin / mxnet
Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Go, Javascript and more
☆10 Updated 4 years ago
Alternatives and similar repositories for mxnet:
Users who are interested in mxnet are comparing it to the libraries listed below:
- PMLS-Caffe: Distributed Deep Learning Framework for Parallel ML System ☆194 Updated 6 years ago
- Deep learning system course ☆218 Updated 6 years ago
- Just-in-time Dynamic Batching with MXNet Gluon. ☆52 Updated 4 years ago
- Implements an efficient softmax approximation as described in the paper "Efficient softmax approximation for GPUs" (http://arxiv.org/abs/… ☆393 Updated 5 years ago
- MPI for Torch ☆60 Updated 7 years ago
- ☆372 Updated 7 years ago
- GPU-specialized parameter server for GPU machine learning. ☆100 Updated 6 years ago
- An example of data parallelism and async updates of parameters in TensorFlow. ☆121 Updated 6 years ago
- CUDA Matrix Factorization Library with Stochastic Gradient Descent (SGD) ☆71 Updated 7 years ago
- Papers and blogs related to distributed deep learning ☆96 Updated 7 years ago
- ☆127 Updated 8 years ago
- An implementation of CF-NADE. Yin Zheng et al., "A Neural Autoregressive Approach to Collaborative Filtering", accepted by ICML 2016. ☆79 Updated 6 years ago
- auto-tuning momentum SGD optimizer ☆286 Updated 5 years ago
- mpi-caffe ☆49 Updated 5 years ago
- Language Modeling ☆156 Updated 5 years ago
- Benchmarks for several RNN variations with different deep-learning frameworks ☆169 Updated 5 years ago
- (Spring 2017) Assignment 2: GPU Executor ☆62 Updated 7 years ago
- LR and FM models solved by FTRL and SGD in parallel on MPI ☆111 Updated 7 years ago
- MXNet based Neural Machine Translation ☆118 Updated 6 years ago
- Benchmarking State-of-the-Art Deep Learning Software Tools ☆170 Updated 7 years ago
- ☆87 Updated 8 years ago
- Multi-GPU mini-framework for Theano ☆195 Updated 7 years ago
- Efficient layer normalization GPU kernel for TensorFlow ☆110 Updated 7 years ago
- Distributed Factorization Machines ☆297 Updated 8 years ago
- Reliable Allreduce and Broadcast Interface for distributed machine learning ☆509 Updated 4 years ago
- ☆108 Updated 7 years ago
- sparse word2vec ☆108 Updated 2 years ago
- Code and models from the paper "Layer Normalization" ☆244 Updated 8 years ago
- CUDA Matrix Factorization Library with Alternating Least Squares (ALS) ☆176 Updated 6 years ago
- Documentation for StreamExecutor open source proposal ☆83 Updated 8 years ago