AAUThematic4LT / Parallel-Corpora-for-Ethiopian-Languages
☆15 Updated 5 years ago
Alternatives and similar repositories for Parallel-Corpora-for-Ethiopian-Languages
Users who are interested in Parallel-Corpora-for-Ethiopian-Languages are comparing it to the repositories listed below.
- Pre-process Arabic text (remove diacritics, punctuation and repeating characters) ☆107 Updated 8 years ago
- Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities ☆14 Updated 3 months ago
- Spanish Billion Word Corpus and Embeddings ☆48 Updated 2 years ago
- BERT for Arabic Topic Modeling: An Experimental Study on BERTopic Technique ☆28 Updated 4 years ago
- Arabic support for textblob ☆85 Updated 3 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. ☆109 Updated last year
- ☆45 Updated 3 years ago
- Use Python and NLTK to build out your own text classifiers and solve common NLP problems ☆50 Updated 5 years ago
- Arabic data ☆14 Updated last month
- ☆12 Updated 6 years ago
- ☆14 Updated 2 years ago
- Code and models for "The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models". EACL 2021, WANLP. ☆50 Updated last year
- Automatically extract grammatical edits from parallel original and corrected sentences. ☆11 Updated 8 years ago
- Dutch word embeddings, trained on a large collection of Dutch social media messages and news/blog/forum posts. ☆45 Updated 3 years ago
- Interactive Jupyter Notebooks for learning materials ☆53 Updated 3 years ago
- Machine Translation for Africa ☆295 Updated 3 years ago
- Jojajovai Guarani-Spanish Parallel Corpus ☆16 Updated 3 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa. ☆40 Updated 2 years ago
- A sentence segmenter that actually works! ☆305 Updated 5 years ago
- Arabic edition of BERT pretrained language models ☆131 Updated 4 years ago
- Applying BERT to named entity recognition in English and Russian. ☆162 Updated 2 years ago
- 🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪 ☆77 Updated 3 years ago
- Python version for Doug Biber's Multidimensional Analysis (MDA) ☆36 Updated 6 months ago
- Advanced NLP Workshop: word-sense disambiguation with RoBERTa and text summarization with BART (Machine Learning Milan) ☆27 Updated 5 years ago
- Open datasets for sentiment analysis based on tweets in English/Spanish/French/German/Italian ☆73 Updated 2 years ago
- Hindi POS Tags and keywords using TNT model. Created Date: 28 Sept 2018 ☆25 Updated 5 years ago
- Indian Language Tagger and Chunker (Hindi, Telugu, Tamil, Marathi, Punjabi, Kannada, Malayalam, Urdu, Bengali) ☆41 Updated 2 years ago
- Data-driven projects repo ☆74 Updated 6 years ago
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested Python dictionary. ☆317 Updated last month
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ☆113 Updated last year