KoichiYasuoka / esupar
Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa/GPT models for Japanese and other languages
☆54Updated 4 months ago
Alternatives and similar repositories for esupar
Users that are interested in esupar are comparing it to the libraries listed below
- GrammarTagger — A Neural Multilingual Grammar Profiler for Language Learning☆31Updated 4 years ago
- Tokenizer POS-tagger and Dependency-parser for Classical Chinese☆14Updated 3 weeks ago
- OpusFilter - Parallel corpus processing toolkit☆115Updated this week
- TUFS Asian Language Parallel Corpus☆52Updated 2 years ago
- An accurate multilingual word aligner based on LaBSE☆24Updated 2 years ago
- X-SCITLDR: Cross-Lingual Extreme Summarization of Scholarly Documents (JCDL 2022)☆14Updated 3 years ago
- Utility scripts for preprocessing Wikipedia texts for NLP☆78Updated last year
- ☆34Updated 2 years ago
- The Business Scene Dialogue corpus☆72Updated 4 years ago
- Tokenizer POS-tagger and Dependency-parser for Classical Chinese☆69Updated 3 months ago
- Multilingual sentence alignment using sentence embeddings☆138Updated last year
- cLang-8 is a dataset for grammatical error correction.☆112Updated 3 years ago
- Code for paper "Kanbun-LM: Reading and Translating Classical Chinese in Japanese Method by Language Models"☆21Updated 2 years ago
- Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation☆15Updated last year
- An example usage of JParaCrawl pre-trained Neural Machine Translation (NMT) models.☆105Updated 4 years ago
- A library for evaluation of Grammatical Error Correction (GEC). Accepted to ACL'25 Demo: "gec-metrics: A Unified Library for Grammatical …☆14Updated 5 months ago
- Repository for the paper "MultiNERD: A Multilingual, Multi-Genre and Fine-Grained Dataset for Named Entity Recognition (and Disambiguatio…☆45Updated last year
- Data and evaluation code for the paper WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER (EMNLP 2…☆70Updated 2 years ago
- Code accompanying the submission "Structural Text Segmentation of Legal Documents" by Aumiller et al.☆98Updated 2 years ago
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases".☆99Updated 2 years ago
- allennlp-light is a port of AllenNLP's core modules and nn portions into a standalone package with minimum dependencies☆56Updated 3 years ago
- Repository to collect and categorize Grammatical Error Correction papers.☆123Updated 5 months ago
- SciWING is a modern toolkit for scientific document processing from WING-NUS☆63Updated 2 years ago
- MAGPIE: A sense-annotated corpus of potentially idiomatic expressions☆30Updated 5 years ago
- ☆24Updated last year
- Implementation of "SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages" paper, accepted to E…☆25Updated 3 years ago
- Japanese data from the Google UDT 2.0.☆38Updated 2 months ago
- mSimCSE: Multilingual SimCSE☆34Updated 3 years ago
- Tokenizer POS-tagger and Dependency-parser for Classical Chinese☆19Updated last week
- ICU based universal language tokenizer☆33Updated 4 years ago