timoschick / one-token-approximation
This repository contains the code for applying One-Token Approximation to a pretrained language model using subword-level tokenization.
☆11 · Updated 4 years ago
Alternatives and similar repositories for one-token-approximation:
Users who are interested in one-token-approximation are comparing it to the libraries listed below.
- Research code for the paper "How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models" ☆26 · Updated 3 years ago
- ☆46 · Updated 5 years ago
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ☆38 · Updated 3 years ago
- Analyzing mBERT's multilinguality in a small laboratory setting ☆13 · Updated last year
- We are creating a challenging new benchmark MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models. Retrieval quest… ☆31 · Updated 4 years ago
- ☆24 · Updated last year
- Statistics on multilingual datasets ☆17 · Updated 2 years ago
- ☆11 · Updated 4 years ago
- Cross-lingual TRansfer Evaluation of Multilingual Encoders (XTREME) ☆22 · Updated 5 years ago
- Code and data for: Low Resource Grammatical Error Correction Using Wikipedia Edits (WNUT 2018) ☆16 · Updated 9 months ago
- This repository contains the code for "BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Representations". ☆63 · Updated 4 years ago
- A coreference evaluation package for the CoNLL and ARRAU datasets ☆40 · Updated 4 years ago
- ☆25 · Updated last year
- Fine-tuned Transformers compatible BERT models for Sequence Tagging ☆40 · Updated 4 years ago
- ☆16 · Updated 3 years ago
- Codebase for probing and visualizing multilingual models. ☆48 · Updated 4 years ago
- ☆68 · Updated 3 years ago
- Automatically detect errors in annotated corpora. ☆47 · Updated last year
- Code and Data for Evaluation WG ☆41 · Updated 2 years ago
- PyTorch implementation of NAACL 2021 paper "Multi-view Subword Regularization" ☆25 · Updated 3 years ago
- GMEG ☆29 · Updated 5 months ago
- This repository contains the code for the Form-Context Model and its Attentive Mimicking variant. ☆31 · Updated 4 years ago
- Official code for the paper "PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models". ☆16 · Updated 2 years ago
- Python code for training models in the ACL paper, "Simple and Effective Paraphrastic Similarity from Parallel Translations". ☆22 · Updated 5 years ago
- Pretraining scripts for BART transformer model ☆11 · Updated last year
- Code for our TSD paper "TOKEN is a MASK: Few-shot Named Entity Recognition with Pre-trained Language Models" ☆14 · Updated 2 years ago
- ☆31 · Updated 4 years ago
- Syntactic evaluation sets, attribute-varying grammars, and code for replicating the CLAMS paper. ACL 2020. ☆16 · Updated 4 months ago
- RelEx - A simple framework for Relation Extraction built on AllenNLP ☆16 · Updated 4 years ago
- Code base for the EMNLP 2021 Findings paper: Cartography Active Learning ☆14 · Updated last year