lothelanor / actib
This repository will soon contain all scripts and links to the annotated corpora of Tibetan.
☆13 Updated 9 months ago
Alternatives and similar repositories for actib
Users who are interested in actib are comparing it to the libraries listed below.
- A neural word aligner based on multilingual BERT ☆359 Updated 3 years ago
- The central repo for Creole based NLU and NLG work ☆18 Updated 6 months ago
- 🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python ☆72 Updated last week
- Facebook Low Resource (FLoRes) MT Benchmark ☆755 Updated 2 years ago
- A Neural Framework for MT Evaluation ☆684 Updated 2 months ago
- Obtain Word Alignments using Pretrained Language Models (e.g., mBERT) ☆382 Updated 2 years ago
- Translation Memory Open-source Purifier ☆35 Updated 3 years ago
- Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks. ☆41 Updated last year
- ☆32 Updated this week
- A multilingual parallel corpus created from translations of the Bible. ☆191 Updated 6 months ago
- A tool for converting TMX files into bilingual corpora ☆18 Updated 5 years ago
- ☆363 Updated last year
- Neural Machine Translation (NMT) tutorial. Data preprocessing, model training, evaluation, and deployment. ☆171 Updated last year
- Linguistically analyzed Classical Tibetan texts ☆26 Updated 4 years ago
- Improved Sentence Alignment in Linear Time and Space ☆185 Updated 2 years ago
- ☆14 Updated last year
- Easier Automatic Sentence Simplification Evaluation ☆162 Updated 2 years ago
- Efficient Low-Memory Aligner ☆146 Updated 10 months ago
- Bitextor generates translation memories from multilingual websites ☆297 Updated last year
- Enterprise Scale NLP with Hugging Face & SageMaker Workshop series ☆242 Updated 2 years ago
- Machine Translation (MT) Preparation Scripts ☆33 Updated 6 months ago
- A tool that locates, downloads, and extracts machine translation corpora ☆159 Updated 2 months ago
- Resources and tools for Indian language Natural Language Processing ☆618 Updated last year
- A character-wise tokenizer for morphologically rich languages ☆29 Updated 2 months ago
- Data for the quantitative study of (Vedic) Sanskrit ☆141 Updated 3 months ago
- An initiative to collect and distribute resources for co-reference resolution in a unified standard. ☆25 Updated last year
- Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus. ☆160 Updated last year
- Lucene analyzer for Tibetan ☆12 Updated last month
- Yet Another Neural Machine Translation Toolkit ☆180 Updated 8 months ago
- ☆14 Updated 4 years ago