OpenPecha / Botok
🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
☆71 Updated last month
Alternatives and similar repositories for Botok
Users who are interested in Botok are comparing it to the libraries listed below.
- NLP for Tibetan, in Python. ☆37 Updated 2 years ago
- Linguistically analyzed Classical Tibetan texts ☆26 Updated 4 years ago
- Machine-Translation-based sentence alignment tool for parallel text ☆313 Updated 4 years ago
- Curated list of Tibetan NLP projects ☆41 Updated 5 years ago
- repo for Tibetan corpora ☆21 Updated 2 years ago
- ☆18 Updated 8 years ago
- TIP-LAS: An open source toolkit for Tibetan word segmentation and part-of-speech tagging ☆82 Updated 3 years ago
- Improved Sentence Alignment in Linear Time and Space ☆185 Updated 2 years ago
- Sentence aligner ☆120 Updated 4 years ago
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 Updated last year
- ERRor ANnotation Toolkit: Automatically extract and classify grammatical errors in parallel original and corrected sentences. ☆454 Updated last year
- A neural word aligner based on multilingual BERT ☆359 Updated 3 years ago
- MaxMatch (M^2) Scorer - Evaluation program for grammatical error correction systems. ☆155 Updated 3 years ago
- Workshop on Language Technologies for Historical and Ancient Language… (https://circse.github.io/LT4HALA/) ☆34 Updated last year
- Bitextor generates translation memories from multilingual websites ☆296 Updated last year
- Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models for paper "Parallel Iterative Edit Models … ☆231 Updated 2 years ago
- ☆120 Updated 5 years ago
- OpusFilter - Parallel corpus processing toolkit ☆112 Updated last week
- Scripts to preprocess training and test data and to run fast_align and giza ☆107 Updated 4 years ago
- Repository of "An Empirical Study of Incorporating Pseudo Data into Grammatical Error Correction" (EMNLP-IJCNLP 2019) ☆68 Updated 5 years ago
- A grammatical error correction reading list maintained by Beijing Language and Culture University Natural Language Processing Group ☆24 Updated 4 years ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆381 Updated 2 years ago
- Transformer based translation quality estimation ☆114 Updated 2 years ago
- Tokenizer POS-tagger and Dependency-parser for Classical Chinese ☆66 Updated last month
- Efficient Low-Memory Aligner ☆146 Updated 10 months ago
- Multilingual sentence alignment using sentence embeddings ☆130 Updated last year
- ☆167 Updated 3 years ago
- We present a list of languages with their codes, families, regions and etc. We also present a list of multi-lingual corpora (with urls). ☆86 Updated 4 years ago
- Simple, fast unsupervised word aligner ☆760 Updated 3 years ago
- Source code for paper: Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data ☆251 Updated 5 years ago