EricDarve / cme213_material_2013
☆62Updated this week
Related projects:
- (Spring 2017) Assignment 2: GPU Executor☆63Updated 7 years ago
- MPI Parallel framework for training deep learning models built in Theano☆53Updated 7 years ago
- TensorFlow util for building memory usage timeline from LOG_MEMORY messages☆65Updated 6 years ago
- Simple example of implementing a new Tensorflow operation and its gradient in C++.☆56Updated 5 years ago
- Distributed Learning by Pair-Wise Averaging☆53Updated 6 years ago
- MPI for Torch☆61Updated 7 years ago
- Cyclades☆28Updated 6 years ago
- Multi-core CPU implementation of deep learning for 2D and 3D sliding window convolutional networks (ConvNets).☆94Updated 7 years ago
- Python Binding to NVRTC☆79Updated 6 years ago
- Efficient layer normalization GPU kernel for Tensorflow☆111Updated 7 years ago
- Convolutional Neural Networks for Visual Recognition (kNN, softmax, etc)☆34Updated 7 years ago
- The Operator Vectorization Library, or OVL, is a python productivity library for defining high performance custom operators for the Tenso…☆68Updated 7 years ago
- DrMAD☆108Updated 6 years ago
- mpi-caffe☆49Updated 5 years ago
- FRED simulator and associated paper☆26Updated 8 years ago
- An implementation of the Hessian-free optimization algorithm in Theano☆61Updated 12 years ago
- Papers and blogs related to distributed deep learning☆97Updated 6 years ago
- Proof of concept prototype to perform distributed training using BVLC/caffe, based on a parameter server implementation using MPI. Data p…☆13Updated 9 years ago
- This is my original repository of the decaf code. Decaf is a precursor of Caffe written in Python for deep image classification. It is de…☆43Updated 10 years ago
- TensorFlow kernels for probing memory☆15Updated 7 years ago
- Tensorflow implementation of SGD with Coupled Adaptive Batch Size (CABS)☆43Updated 7 years ago
- Sublinear memory optimization for deep learning, reduce GPU memory cost to train deeper nets☆28Updated 8 years ago
- Asynchronous One Step Q Learning implemented with MXNET☆20Updated 7 years ago
- A platform for distributed optimization experiments using OpenMPI☆20Updated 6 years ago
- Convolution op for Theano based on CuFFT using scikits.cuda☆51Updated 10 years ago
- Source code for "Neural Networks with Few Multiplications" published at ICLR 2016☆81Updated 8 years ago