MethodsOfMachineLearning / cabs
Tensorflow implementation of SGD with Coupled Adaptive Batch Size (CABS)
☆44Updated 8 years ago
Alternatives and similar repositories for cabs
Users who are interested in cabs are comparing it to the libraries listed below:
- DeepArchitect: Automatically Designing and Training Deep Architectures☆147Updated 5 years ago
- Efficient layer normalization GPU kernel for Tensorflow☆111Updated 8 years ago
- Code for Attentive Recurrent Comparators☆57Updated 8 years ago
- DrMAD☆107Updated 7 years ago
- Reproduction of some of the results from 'Identity Mappings in Deep Residual Networks'☆72Updated 8 years ago
- Reference caffe implementation of LSUV initialization☆114Updated 7 years ago
- DNI (Decoupled Neural Interfaces using Synthetic Gradients) implementation with Torch☆29Updated 8 years ago
- Lasagne code for weight normalization☆88Updated 9 years ago
- Architecture learning for CNNs☆37Updated 8 years ago
- Second-order optimiser for deep networks☆76Updated 6 years ago
- ☆29Updated 8 years ago
- Flattened convolutional neural networks (1D convolution modules for Torch nn)☆61Updated 9 years ago
- A rudimentary wrapper around the fast Maxwell kernels for GEMM and convolution operations provided by nervanagpu☆34Updated 10 years ago
- Cluttered MNIST Dataset☆52Updated 10 years ago
- Torch implementation of the Deep Network for Global Optimization (DNGO)☆51Updated 8 years ago
- A pytorch implementation of "Self-Normalizing Neural Networks" by Klambauer et al. (still beta)☆59Updated 8 years ago
- A new kind of pooling layer for faster and sharper convergence☆76Updated 7 years ago
- numpy implementation of net 2 net from the paper Net2Net: Accelerating Learning via Knowledge Transfer http://arxiv.org/abs/1511.05641☆53Updated 9 years ago
- Reference implementation for Structured Prediction with Deep Value Networks☆55Updated 8 years ago
- Source code for "Neural Networks with Few Multiplications" published at ICLR 2016☆81Updated 9 years ago
- ☆38Updated 7 years ago
- RNNprop☆36Updated 8 years ago
- ☆69Updated 8 years ago
- ☆69Updated 6 years ago
- Torch implementation reproducing MNIST experiments from DeepMind's DNI paper.☆43Updated 8 years ago
- ☆35Updated 8 years ago
- ACDC: A Structured Efficient Linear Layer☆44Updated 9 years ago
- TensorFlow implementation of the paper "Learning to learn by gradient descent by gradient descent ( https://arxiv.org/abs/1606.04474 )"☆84Updated 8 years ago
- Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training o…☆149Updated 8 years ago
- Fractional Max Pooling implementation in Theano☆21Updated 9 years ago