s-nlp / annotated-transformer
http://nlp.seas.harvard.edu/2018/04/03/attention.html
☆62Updated 4 years ago
Alternatives and similar repositories for annotated-transformer
Users who are interested in annotated-transformer compare it to the libraries listed below.
- Visualising the Transformer encoder☆111Updated 4 years ago
- Theoretical Deep Learning: generalization ability☆46Updated 5 years ago
- A small library with distillation, quantization and pruning pipelines☆26Updated 4 years ago
- Distillation of BERT model with catalyst framework☆78Updated 2 years ago
- A tiny Catalyst-like experiment runner framework on top of micrograd.☆51Updated 4 years ago
- XAI Tutorial for the Explainable AI track in the ALPS winter school 2021☆58Updated 4 years ago
- Russian RoBERTa☆29Updated 5 years ago
- A 🤗-style implementation of BERT using lambda layers instead of self-attention☆69Updated 4 years ago
- (re)Implementation of Learning Multi-level Dependencies for Robust Word Recognition☆17Updated 10 months ago
- LM Pretraining with PyTorch/TPU☆134Updated 5 years ago
- PyTorch library for end-to-end transformer models training, inference and serving☆70Updated 2 months ago
- Create interactive textual heat maps for Jupyter notebooks☆196Updated last year
- ☆103Updated 4 years ago
- ☆21Updated 6 years ago
- Code for the Shortformer model, from the ACL 2021 paper by Ofir Press, Noah A. Smith and Mike Lewis.☆147Updated 3 years ago
- diagNNose is a Python library that facilitates a broad set of tools for analysing hidden activations of neural models.☆82Updated last year
- DEREK (Domain Entities and Relations Extraction Kit)☆10Updated 2 years ago
- ☆64Updated 5 years ago
- RuREBus shared task repo☆30Updated 4 years ago
- Supplementary code for Beyond Vector Spaces: Compact Data Representation as Differentiable Weighted Graphs.☆47Updated 5 years ago
- Code for scaling Transformers☆26Updated 4 years ago
- Implementation of https://arxiv.org/abs/1904.00962☆15Updated 5 years ago
- What are the best Systems? New Perspectives on NLP Benchmarking☆13Updated 2 years ago
- Silly twitter torch implementations.☆46Updated 2 years ago
- ☆75Updated 5 years ago
- Lightweight knowledge distillation pipeline☆28Updated 3 years ago
- On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines☆136Updated last year
- My implementation of DeepMind's Perceiver☆63Updated 4 years ago
- nlp workshop at datafest siberia 2019☆22Updated 2 years ago
- Code for MSID, a Multi-Scale Intrinsic Distance for comparing generative models, studying neural networks, and more!☆51Updated 6 years ago