lothelanor / actib

This repository will soon contain all scripts and links to the annotated corpora of Tibetan.

☆13

Alternatives and similar repositories for actib

Users who are interested in actib are comparing it to the repositories listed below.

neulab / awesome-align
A neural word aligner based on multilingual BERT
☆359 Updated 3 years ago
hclent / CreoleVal
The central repo for Creole based NLU and NLG work
☆18 Updated 6 months ago
OpenPecha / Botok
🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
☆72 Updated last week
facebookresearch / flores
Facebook Low Resource (FLoRes) MT Benchmark
☆755 Updated 2 years ago
Unbabel / COMET
A Neural Framework for MT Evaluation
☆684 Updated 2 months ago
cisnlp / simalign
Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)
☆382 Updated 2 years ago
hlt-mt / TMOP
Translation Memory Open-source Purifier
☆35 Updated 3 years ago
M4t1ss / parallel-corpora-tools
Tools for filtering and cleaning parallel and monolingual corpora for machine translation and other natural language processing tasks.
☆41 Updated last year
awslabs / llmeter
☆32 Updated this week
christos-c / bible-corpus
A multilingual parallel corpus created from translations of the Bible.
☆191 Updated 6 months ago
amake / TMX2Corpus
A tool for converting TMX files into bilingual corpora
☆18 Updated 5 years ago
amazon-science / alexa-teacher-models
☆363 Updated last year
ymoslem / OpenNMT-Tutorial
Neural Machine Translation (NMT) tutorial. Data preprocessing, model training, evaluation, and deployment.
☆171 Updated last year
tibetan-nlp / classical-tibetan-corpus
Linguistically analyzed Classical Tibetan texts
☆26 Updated 4 years ago
thompsonb / vecalign
Improved Sentence Alignment in Linear Time and Space
☆185 Updated 2 years ago
aws-samples / sagemaker-bencher
☆14 Updated last year
feralvam / easse
Easier Automatic Sentence Simplification Evaluation
☆162 Updated 2 years ago
robertostling / eflomal
Efficient Low-Memory Aligner
☆146 Updated 10 months ago
bitextor / bitextor
Bitextor generates translation memories from multilingual websites
☆297 Updated last year
philschmid / huggingface-sagemaker-workshop-series
Enterprise Scale NLP with Hugging Face & SageMaker Workshop series
☆242 Updated 2 years ago
ymoslem / MT-Preparation
Machine Translation (MT) Preparation Scripts
☆33 Updated 6 months ago
thammegowda / mtdata
A tool that locates, downloads, and extracts machine translation corpora
☆159 Updated 2 months ago
anoopkunchukuttan / indic_nlp_library
Resources and tools for Indian language Natural Language Processing
☆618 Updated last year
amir-zeldes / RFTokenizer
A character-wise tokenizer for morphologically rich languages
☆29 Updated 2 months ago
OliverHellwig / sanskrit
Data for the quantitative study of (Vedic) Sanskrit
☆141 Updated 3 months ago
UniversalAnaphora / UniversalAnaphora
An initiative to collect and distribute resources for co-reference resolution in a unified standard.
☆25 Updated last year
bitextor / bicleaner
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
☆160 Updated last year
buda-base / lucene-bo
Lucene analyzer for Tibetan
☆12 Updated last month
prajdabre / yanmtt
Yet Another Neural Machine Translation Toolkit
☆180 Updated 8 months ago
nlpcuom / English-Tamil-Parallel-Corpus
☆14 Updated 4 years ago