YanaiEliyahu / AdasOptimizer
ADAS is short for Adaptive Step Size. Unlike other optimizers, which just normalize the derivative, it fine-tunes the step size itself. According to its author, this makes step size scheduling obsolete and achieves state-of-the-art training performance.
☆85 Updated 4 years ago
Alternatives and similar repositories for AdasOptimizer
Users who are interested in AdasOptimizer are comparing it to the libraries listed below.
- ☆77 Updated 11 months ago
- Electra pre-trained model using Vietnamese corpus ☆67 Updated last year
- Deep Learning project template best practices with PyTorch Lightning, Hydra, TensorBoard. ☆159 Updated 4 years ago
- Knowledge Distillation Toolkit ☆88 Updated 4 years ago
- ☆14 Updated 4 years ago
- Auto-Magical Deploy AI model at large scale, high performance, and easy to use ☆66 Updated last year
- graftr: an interactive shell to view and edit PyTorch checkpoints. ☆113 Updated 4 years ago
- Multilingual BERT retrained on news + SQuAD2 for Vietnamese ☆24 Updated 5 years ago
- PhoMT: A High-Quality and Large-Scale Benchmark Dataset for Vietnamese-English Machine Translation (EMNLP 2021) ☆43 Updated this week
- Creating a chatbot from your Facebook data with GPT ☆23 Updated 3 years ago
- Collection of the latest, greatest, deep learning optimizers (for PyTorch) - CNN, NLP suitable ☆215 Updated 4 years ago
- Implementation of Feedback Transformer in PyTorch ☆107 Updated 4 years ago
- Zalo AI Challenge 2020 - Top 2 @ Voice Verification ☆15 Updated 2 years ago
- Light Face Detection using PyTorch Lightning ☆84 Updated last year
- ☆91 Updated 4 years ago
- Pre-trained NFNets with 99% of the accuracy of the official paper "High-Performance Large-Scale Image Recognition Without Normalization". ☆159 Updated 4 years ago
- TF2 implementation of knowledge distillation using the "function matching" hypothesis from https://arxiv.org/abs/2106.05237. ☆87 Updated 3 years ago
- ☆18 Updated 2 years ago
- Pre-trained NFNets with 99% of the accuracy of the official paper "High-Performance Large-Scale Image Recognition Without Normalization".… ☆30 Updated 4 years ago
- Create SSH tunnel to a running Colab notebook ☆67 Updated 3 years ago
- Implementation for paper MLP-Mixer: An all-MLP Architecture for Vision ☆91 Updated 3 years ago
- General template for my PyTorch projects. ☆18 Updated 2 years ago
- Submission for AIviVN Vietnamese diacritics restoration contest https://www.aivivn.com/contests/3 ☆39 Updated 10 months ago
- PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021) ☆142 Updated 5 months ago
- Lite Inference Toolkit (LIT) for PyTorch ☆161 Updated 3 years ago
- BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese (INTERSPEECH 2022) ☆103 Updated 10 months ago
- EfficientNet, MobileNetV3, MobileNetV2, MixNet, etc in JAX w/ Flax Linen and Objax ☆128 Updated last year
- Cyclemoid implementation for PyTorch ☆89 Updated 3 years ago
- PyTorch dataset extended with map, cache etc. (tensorflow.data like) ☆329 Updated 2 years ago
- ☆54 Updated 4 years ago