gunchagarg / differential-learning-rate-keras
Implementation of Differential Learning Rate in Keras
☆11 · Updated 5 years ago
Alternatives and similar repositories for differential-learning-rate-keras:
Users who are interested in differential-learning-rate-keras compare it to the libraries listed below.
- Exploring learning rates to improve model performance ☆19 · Updated 5 years ago
- This repository contains notebooks showing how to perform mixed precision training in tf.keras 2.0 ☆12 · Updated 5 years ago
- Large Scale BERT Distillation ☆32 · Updated last year
- Minimalistic TensorFlow2+ deep metric/similarity learning library with loss functions, miners, and utils as embedding projector. ☆37 · Updated 2 years ago
- Radam+lookahead implemented by tensorflow ☆11 · Updated 5 years ago
- Source code for "Training Generative Adversarial Networks Via Turing Test". ☆13 · Updated 4 years ago
- Prabhupadavani: A Code-mixed Speech Translation Data for 25 languages ☆13 · Updated 2 years ago
- Collection of models and extensions for deployment in PyTorch ☆24 · Updated 2 years ago
- Pytorch Code for S2IGAN ☆41 · Updated 4 years ago
- Adaptive embedding and softmax ☆17 · Updated 3 years ago
- bumble bee transformer ☆14 · Updated 3 years ago
- Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch ☆45 · Updated 3 years ago
- Enable RNNLM lattice rescoring with Pytorch [kaldi] ☆12 · Updated 4 years ago
- Unsupervised Anomaly Detection via Deep Metric Learning with End-to-End Optimization ☆13 · Updated last year
- Contrastive Language-Audio Pretraining ☆15 · Updated 3 years ago
- Comprehensive Python library for speech and voice. ☆33 · Updated 2 years ago
- Codes for Category-aware Generative Adversarial Networks (AAAI 2020) ☆18 · Updated 4 years ago
- A Multi-Format Transfer Learning Model for Event Argument Extraction via Variational Information Bottleneck ☆10 · Updated 2 years ago
- A very naive and simple benchmark between dlib and pytorch in terms of space and time ☆19 · Updated 4 years ago
- A Keras implementation of Adaptive Softmax ☆7 · Updated 6 years ago
- Why Attention? Analyze BiLSTM Deficiency and Its Remedies in the Case of NER ☆9 · Updated 5 years ago
- Local Attention - Flax module for Jax ☆20 · Updated 3 years ago
- Code for our paper: "Regularity Normalization: Neuroscience-Inspired Unsupervised Attention across Neural Network Layers". ☆21 · Updated 3 years ago
- Anonymous ICLR Submission ☆14 · Updated 5 years ago
- Implementation for NATv2. ☆23 · Updated 4 years ago
- Implementation of the retriever distillation procedure as outlined in the paper "Distilling Knowledge from Reader to Retriever" ☆32 · Updated 4 years ago
- Implementing activation functions from scratch in Tensorflow. ☆36 · Updated 3 years ago