greentfrapp / attention-primer
A demonstration of the attention mechanism with some toy experiments and explanations.
☆108 Updated 6 years ago
Alternatives and similar repositories for attention-primer
Users who are interested in attention-primer are comparing it to the libraries listed below.
- Configure Python functions explicitly and safely ☆126 Updated 8 months ago
- Training Transformer-XL on 128 GPUs ☆140 Updated 5 years ago
- ☆28 Updated 6 years ago
- Pip-installable differentiable stacks in PyTorch! ☆65 Updated 4 years ago
- learning to search in pytorch ☆110 Updated 5 years ago
- Probabilistic classification in PyTorch/TensorFlow/scikit-learn with Fenchel-Young losses ☆186 Updated last year
- Experiment orchestration ☆103 Updated 5 years ago
- ☆153 Updated 5 years ago
- Torchélie is a set of utility functions, layers, losses, models, trainers and other things for PyTorch. ☆110 Updated 7 months ago
- Python implementation of GLN in different frameworks ☆97 Updated 4 years ago
- Code for Neural Arithmetic Units (ICLR) and Measuring Arithmetic Extrapolation Performance (SEDL|NeurIPS) ☆146 Updated 3 years ago
- ☆103 Updated 4 years ago
- The Annotated Encoder Decoder with Attention ☆166 Updated 4 years ago
- ☆64 Updated 5 years ago
- Tricks for Colab power users ☆168 Updated 5 years ago
- PyTorch implementation of the NIPS'17 paper Training Deep Networks without Learning Rates Through Coin Betting. ☆37 Updated 7 years ago
- A tool to monitor everything you want. Clean, simple, extensible and in one place. ☆82 Updated 3 years ago
- Creates a learning-curve plot for Jupyter/Colab notebooks that is updated in real-time. ☆177 Updated 3 years ago
- TBA ☆76 Updated 6 years ago
- ☆45 Updated 5 years ago
- Synthetic book pages created with a PGGAN ☆73 Updated 6 years ago
- PyTorch functions and utilities to make your life easier ☆195 Updated 4 years ago
- Pytorch Cheatsheet ☆90 Updated 6 years ago
- Mixture Density Networks (Bishop, 1994) tutorial in JAX ☆60 Updated 5 years ago
- Asynchronous Distributed Hyperparameter Optimization. ☆299 Updated 6 months ago
- Official Tensorflow implementation of the paper "Y-Autoencoders: disentangling latent representations via sequential-encoding", Pattern R… ☆52 Updated 4 years ago
- Loss Patterns of Neural Networks ☆85 Updated 3 years ago
- A generative modelling toolkit for PyTorch. ☆70 Updated 3 years ago
- Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis. ☆147 Updated 4 years ago
- Original PyTorch implementation of the Leap meta-learner (https://arxiv.org/abs/1812.01054) along with code for running the Omniglot expe… ☆148 Updated 2 years ago