masakhane-io / masakhanePreprocessor
Building an effective preprocessing tool for African languages
☆13Updated last year
Alternatives and similar repositories for masakhanePreprocessor
Users who are interested in masakhanePreprocessor are comparing it to the libraries listed below.
- Data, Embeddings, Stopword lists, code, and baselines for COLING 2020 paper titled "KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text …☆13Updated last year
- MasakhaNEWS: News Topic Classification for African Languages☆23Updated last year
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages☆74Updated 3 years ago
- Crosslingual Question Answering for African Languages☆30Updated 8 months ago
- A simple library for segmenting legal texts☆17Updated 2 years ago
- ☆110Updated last year
- ☆12Updated 8 months ago
- MAFAND-MT☆55Updated 10 months ago
- A collection of textual datasets in Hausa language and the corresponding translation in English language.☆15Updated 4 years ago
- 🤗 Push your spaCy pipelines to the Hugging Face Hub☆44Updated last year
- spaCy match and replace, maintaining conjugation☆35Updated 2 years ago
- Named entity recognition for the legal domain☆42Updated 4 years ago
- ☆22Updated last year
- Next-generation Punkt sentence boundary detection with zero dependencies☆17Updated 2 months ago
- ☆22Updated 3 years ago
- This is a repository for NaijaSenti. A Lacuna Funded Project for the development of sentiment corpus for four Nigerian languages: Igbo, H…☆32Updated last year
- ☆18Updated last year
- A collection of notebooks for Natural Language Processing☆26Updated 4 months ago
- POS for African languages☆17Updated last year
- ☆14Updated 3 years ago
- COMET for African languages☆10Updated 4 months ago
- List of all the resources I developed in collaboration with LSV and Masakhane during my doctoral studies and beyond☆12Updated 2 years ago
- Using short models to classify long texts☆21Updated 2 years ago
- A Python package to get useful information from documents using TopicRank Algorithm.☆16Updated last year
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages.☆31Updated 2 months ago
- A BERT-based application for reusable text classification at scale☆38Updated last year
- NeatText a simple NLP package for cleaning textual data and text preprocessing☆72Updated last year
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy☆63Updated last year
- All our community docs! Start here! Let's put Africa on the NLP Map☆60Updated last year
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages : https://afrisenti-semeval.github.io/☆48Updated last year