Code to reproduce some of the figures in the paper "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima"
☆146 · Apr 24, 2017 · Updated 8 years ago
Alternatives and similar repositories for large-batch-training
Users that are interested in large-batch-training are comparing it to the libraries listed below.
- SmoothOut: Smoothing Out Sharp Minima to Improve Generalization in Deep Learning ☆23 · Nov 21, 2018 · Updated 7 years ago
- Code used to generate the results appearing in "Train longer, generalize better: closing the generalization gap in large batch training o… ☆149 · May 25, 2017 · Updated 8 years ago
- Torch implementation reproducing MNIST experiments from DeepMind's DNI paper. ☆44 · Mar 4, 2017 · Updated 9 years ago
- Analyze the dynamic stability of SGD ☆13 · Nov 25, 2018 · Updated 7 years ago
- PyTorch bindings for openai-gemm ☆20 · Feb 6, 2017 · Updated 9 years ago
- DNI (Decoupled Neural Interfaces using Synthetic Gradients) implementation with Torch ☆30 · Aug 30, 2016 · Updated 9 years ago
- ☆255 · Nov 23, 2016 · Updated 9 years ago
- Finalist entry for the M2CAI Workflow Challenge 2016 ☆10 · Nov 25, 2016 · Updated 9 years ago
- Low-rank Highway Networks ☆13 · Mar 11, 2016 · Updated 10 years ago
- An empirical investigation of deep learning theory ☆16 · Oct 3, 2019 · Updated 6 years ago
- Recurrent Convolutional Memory Network (in progress) ☆29 · Apr 16, 2016 · Updated 10 years ago
- Implementation of Shake-Shake by chainer (Shake-Shake regularization of 3-branch residual networks: https://openreview.net/forum?id=HkO-P… ☆10 · Aug 24, 2017 · Updated 8 years ago
- Optimization using Stochastic quasi-Newton methods ☆42 · Feb 3, 2017 · Updated 9 years ago
- Neural network training using iterated projections. ☆90 · Jan 17, 2017 · Updated 9 years ago
- ☆12 · Oct 8, 2016 · Updated 9 years ago
- Torch implementation of the paper "Deep Pyramidal Residual Networks" (https://arxiv.org/abs/1610.02915). ☆129 · Oct 31, 2017 · Updated 8 years ago
- ☆69 · Dec 19, 2018 · Updated 7 years ago
- Movielens collaborative filtering with Solr streaming expression ☆11 · Oct 13, 2016 · Updated 9 years ago
- Deep Learning Dashboard ☆38 · Sep 4, 2016 · Updated 9 years ago
- Multi-Residual Networks ☆23 · Nov 25, 2016 · Updated 9 years ago
- Code and models from the paper "Layer Normalization" ☆243 · Nov 8, 2016 · Updated 9 years ago
- Progressive Attention Networks ☆12 · Oct 25, 2016 · Updated 9 years ago
- Tweet Classification using RNN and CNN ☆43 · Sep 18, 2016 · Updated 9 years ago
- Datasets for Hyperparameter Optimization of Neural Machine Translation ☆10 · Aug 19, 2024 · Updated last year
- Structured Prediction Energy Networks in Torch ☆132 · Feb 8, 2017 · Updated 9 years ago
- [ICLR'22] Self-supervised learning optimally robust representations for domain shift. ☆25 · Feb 2, 2022 · Updated 4 years ago
- Code for visualizing the loss landscape of neural nets ☆3,167 · Apr 5, 2022 · Updated 4 years ago
- Implementation of the paper [Using Fast Weights to Attend to the Recent Past](https://arxiv.org/abs/1610.06258) ☆174 · Nov 3, 2016 · Updated 9 years ago
- an updated version of fb.resnet.torch with many changes. ☆38 · Dec 16, 2016 · Updated 9 years ago
- A rudimentary wrapper around the fast Maxwell kernels for GEMM and convolution operations provided by nervanagpu ☆34 · May 7, 2015 · Updated 10 years ago
- Unsupervised learning of visual concepts from video ☆56 · May 5, 2016 · Updated 9 years ago
- pytorch implementation of Structured Bayesian Pruning ☆19 · Jul 13, 2018 · Updated 7 years ago
- 3.8% and 18.3% on CIFAR-10 and CIFAR-100 ☆1,312 · Aug 20, 2019 · Updated 6 years ago
- Doubly Stochastic Neighbor Embedding on Spheres ☆60 · Sep 13, 2019 · Updated 6 years ago
- OptNet - Reducing memory usage in torch neural nets ☆282 · Apr 19, 2017 · Updated 9 years ago
- ☆17 · Aug 22, 2017 · Updated 8 years ago
- Curated list of Machine Learning/Data Science resources ☆13 · May 22, 2016 · Updated 9 years ago
- FractalNet implementation in Keras: Ultra-Deep Neural Networks without Residuals ☆157 · Sep 17, 2017 · Updated 8 years ago
- A more memory efficient Torch implementation of "Densely Connected Convolutional Networks". ☆29 · May 11, 2017 · Updated 8 years ago