johnkorn / distillation
Keras + TensorFlow experiments with knowledge distillation on the EMNIST dataset
☆35 · Updated 7 years ago
Alternatives and similar repositories for distillation:
Users interested in distillation are comparing it to the libraries listed below.
- A machine learning experiment ☆182 · Updated 7 years ago
- Knowledge Distillation using Tensorflow ☆141 · Updated 5 years ago
- ☆18 · Updated 5 years ago
- Keras implementation of temporal ensembling (semi-supervised learning) ☆22 · Updated 6 years ago
- An implementation for mnist center loss training and visualization ☆75 · Updated 6 years ago
- Focal Loss of multi-classification in tensorflow ☆79 · Updated 5 years ago
- Keras implementation of Octave Convolutions ☆53 · Updated 5 years ago
- Transfer knowledge from a large DNN or an ensemble of DNNs into a small DNN ☆29 · Updated 7 years ago
- Multi heads attention for image classification ☆81 · Updated 6 years ago
- YOLO-v2, ResNet-32, GoogLeNet-lite ☆35 · Updated 5 years ago
- RAdam optimizer for keras ☆71 · Updated 5 years ago
- Focal Loss implementation by Keras with TensorFlow backend ☆42 · Updated 6 years ago
- Cyclic learning rate TensorFlow implementation. ☆66 · Updated 5 years ago
- wrapping a keras optimizer to implement gradient accumulation ☆119 · Updated 4 years ago
- Ensembling ConvNets using Keras ☆75 · Updated 5 years ago
- Multi-class classification with focal loss for imbalanced datasets ☆82 · Updated 5 years ago
- Octave convolution ☆34 · Updated 2 years ago
- Keras implementation of AutoAugment. ☆30 · Updated 5 years ago
- resnet_cifar10_cifar100_imagenet ☆13 · Updated 6 years ago
- lookahead optimizer for keras ☆170 · Updated 5 years ago
- Center loss implementation in Keras ☆41 · Updated 7 years ago
- Implementation of Learning Rate Finder, SGDR and Cyclical Learning Rate in Keras ☆29 · Updated 6 years ago
- Text classification models: cnn, self-attention, cnn-rnf, rnn-att, capsule-net. TensorFlow. Single GPU or multi GPU ☆19 · Updated 4 years ago
- Try to use tf.estimator and tf.data together to train a cnn model. ☆79 · Updated 6 years ago
- Teaches a student network from the knowledge obtained via training of a larger teacher network ☆157 · Updated 6 years ago
- It contains the Attention-56 and Attention-96 models built from scratch in keras. Residual Attention Networks are described in the paper … ☆38 · Updated 6 years ago
- Model Compression Based on Geoffrey Hinton's Logit Regression Method in Keras applied to MNIST 16x compression over 0.95 percent accuracy… ☆63 · Updated 5 years ago
- keras implementation of AdamW from Fixing Weight Decay Regularization in Adam (https://arxiv.org/abs/1711.05101) ☆70 · Updated 6 years ago
- Naive implementation of SENet in Keras ☆135 · Updated 5 years ago
- DropBlock implemented in Keras ☆26 · Updated 2 years ago