x39826 / Pali_Tripitaka
Pali Buddhist scriptures of 15 countries and their parallel corpus
☆9 Updated 5 years ago
Related projects:
- ✒️ དག་བྱེད། Dakje, improving your spelling and readability☆11Updated 2 years ago
- Tools for extracting parallel corpora from article titles across languages in Wikipedia☆72Updated 9 years ago
- A visual and interactive scoring environment for machine translation systems.☆32Updated 6 years ago
- ReVal: A Simple and Effective Machine Translation Evaluation Metric Based on Recurrent Neural Networks☆9Updated 5 years ago
- Transform TMX to text☆29Updated last year
- Tool to fix bitexts and tag near-duplicates for removal☆29Updated last month
- An Interactive Tool for Annotating Discourse Structure and Text Improvement☆16Updated 3 years ago
- Linguistically analyzed Classical Tibetan texts☆23Updated 3 years ago
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g…☆110Updated 2 months ago
- Exploring the idea of a generic, language agnostic, CEFR level classifier☆20Updated 6 years ago
- A re-implementation of redpony/cdec's tokenize-anything.pl script in python☆8Updated 8 years ago
- Improving Low-Resource Neural Machine Translation of Related Languages by Transfer Learning☆17Updated last year
- Data collection, alignment and TAUS repository☆20Updated 6 years ago
- Sanskrit compound segmentation using seq2seq model☆23Updated 5 years ago
- A BiRNN framework implemented in Python and TensorFlow to extract parallel sentences from aligned comparable corpora.☆33Updated 6 years ago
- Microsoft Speech Language Translation (MSLT) Corpus☆19Updated 7 years ago
- Bilingual sentence similarity classifier using Tensorflow☆19Updated 4 years ago
- Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.☆40Updated 9 months ago
- The zhong [|] Chinese grammars☆13Updated 3 years ago
- A simple neural truecaser written in pytorch and allennlp.☆31Updated 3 months ago
- bilingual dictionary extractor from parallel corpora☆21Updated 10 years ago
- Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa models for Japanese and other languages☆46Updated last week
- A database of number names for 186 languages, locales, and scripts☆66Updated last year
- STREUSLE: a corpus with comprehensive lexical semantic annotation (multiword expressions, supersenses)☆63Updated last year