mingruimingrui / ICU-tokenizer
ICU-based universal language tokenizer
☆30 Updated 3 years ago
Alternatives and similar repositories for ICU-tokenizer:
Users who are interested in ICU-tokenizer are comparing it to the libraries listed below:
- ☆86 Updated 3 years ago
- ☆42 Updated 4 years ago
- This is the code for the EMNLP 2020 Findings paper "BERT for Monolingual and Cross-Lingual Reverse Dictionary" ☆19 Updated 4 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 Updated 2 years ago
- 🦮 Code and pretrained models for Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie… ☆49 Updated 2 years ago
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ☆38 Updated 3 years ago
- ☆36 Updated 2 years ago
- Code, data, and pretrained models for the paper "Generating Wikipedia Article Sections from Diverse Data Sources" ☆20 Updated 4 years ago
- ☆16 Updated 4 years ago
- The official code of the "Frustratingly Easy System Combination for Grammatical Error Correction" paper ☆56 Updated last year
- Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding (AAAI 2020) - PyTorch Implementation ☆31 Updated last year
- The code for the EMNLP 2022 paper "Improved grammatical error correction by ranking elementary edits" ☆19 Updated 2 years ago
- We are creating a challenging new benchmark MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models. Retrieval quest… ☆31 Updated 4 years ago
- Tower Parse: Low-Resource Dependency Parsing via Hierarchical Source Selection ☆15 Updated 3 years ago
- ☆21 Updated 3 years ago
- ☆22 Updated 3 years ago
- Gradient accumulation on tf.estimator ☆12 Updated 4 years ago
- A Fairseq fork for sequence tagging/labeling tasks ☆31 Updated 4 years ago
- Evaluation suite for testing automatic grammatical error corrections ☆38 Updated 7 years ago
- ReConsider is a re-ranking model that re-ranks the top-K (passage, answer-span) predictions of an Open-Domain QA Model like DPR (Karpukhi… ☆49 Updated 3 years ago
- Code for the paper "A Theoretical Analysis of the Repetition Problem in Text Generation" in AAAI 2021. ☆52 Updated 2 years ago
- NoiseMix - data generation for natural language ☆40 Updated 6 years ago
- BERT models for many languages created from Wikipedia texts ☆33 Updated 4 years ago
- ☆92 Updated 3 years ago
- AAAI-20 paper: Cross-Lingual Natural Language Generation via Pre-Training ☆129 Updated 3 years ago
- EMNLP 2021 - CTC: A Unified Framework for Evaluating Natural Language Generation ☆96 Updated 2 years ago
- Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model. ☆37 Updated last year
- Fine-tuned Transformers compatible BERT models for Sequence Tagging ☆40 Updated 4 years ago
- ☆68 Updated 3 years ago
- CharBERT: Character-aware Pre-trained Language Model (COLING2020) ☆120 Updated 4 years ago