mingruimingrui / ICU-tokenizer
ICU-based universal language tokenizer
☆30 Updated 3 years ago
Alternatives and similar repositories for ICU-tokenizer:
Users who are interested in ICU-tokenizer are comparing it to the libraries listed below.
- ☆42 Updated 4 years ago
- ☆36 Updated 2 years ago
- ☆66 Updated 3 years ago
- We are creating a challenging new benchmark MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models. Retrieval quest… ☆31 Updated 4 years ago
- EMNLP 2021 Tutorial: Multi-Domain Multilingual Question Answering ☆38 Updated 3 years ago
- X-BERT: eXtreme Multi-label Text Classification with BERT ☆52 Updated 5 years ago
- codes and pre-trained models of paper "Segatron: Segment-aware Transformer for Language Modeling and Understanding" ☆18 Updated 2 years ago
- Source code for our AAAI 2020 paper P-SIF: Document Embeddings using Partition Averaging ☆34 Updated 4 years ago
- ☆46 Updated 3 years ago
- SNCSE: Contrastive Learning for Unsupervised Sentence Embedding with Soft Negative Samples ☆75 Updated 2 years ago
- Multilingual abstractive summarization dataset extracted from WikiHow. ☆85 Updated 3 years ago
- Pytorch-version BERT-flow: One can apply BERT-flow to any PLM within Pytorch framework. ☆72 Updated 3 years ago
- ☆21 Updated 3 years ago
- ☆85 Updated 3 years ago
- 🦮 Code and pretrained models for Findings of ACL 2022 paper "LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrie… ☆49 Updated 2 years ago
- Thank you BART! Rewarding Pre-Trained Models Improves Formality Style Transfer (ACL 2021) ☆30 Updated 2 years ago
- ☆55 Updated last year
- The code for EMNLP2022 paper "Improved grammatical error correction by ranking elementary edits" ☆19 Updated 2 years ago
- a large scientific paraphrase dataset for longer paraphrase generation ☆38 Updated 2 years ago
- Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding (AAAI 2020) - PyTorch Implementation ☆31 Updated last year
- This repo supports various cross-lingual transfer learning & multilingual NLP models. ☆92 Updated last year
- ☆68 Updated 3 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation" ☆30 Updated 2 years ago
- A text augmentation tool for named entity recognition. ☆52 Updated 3 years ago
- ☆92 Updated 3 years ago
- Code for ACL2021 paper: "GLGE: A New General Language Generation Evaluation Benchmark" ☆58 Updated 2 years ago
- Implementation of pQRNN in PyTorch ☆46 Updated 3 years ago
- This is the code for the EMNLP2020 Finding paper "BERT for Monolingual and Cross-Lingual Reverse Dictionary" ☆19 Updated 4 years ago
- Graph parsing approach to structured sentiment analysis. ☆41 Updated last year
- AAAI-20 paper: Cross-Lingual Natural Language Generation via Pre-Training ☆129 Updated 3 years ago