☆118Oct 15, 2025Updated 4 months ago
Alternatives and similar repositories for masakhane-ner
Users that are interested in masakhane-ner are comparing it to the libraries listed below
- Data, Embeddings, Stopword lists, code, and baselines for COLING 2020 paper titled "KINNEWS and KIRNEWS: Benchmarking Cross-Lingual Text …☆14Apr 26, 2024Updated last year
- POS for African languages☆19Jun 25, 2025Updated 8 months ago
- ☆17Jan 12, 2023Updated 3 years ago
- This is a repository for NaijaSenti. A Lacuna Funded Project for the development of sentiment corpus for four Nigerian languages: Igbo, H…☆36Oct 14, 2025Updated 4 months ago
- AfriBERTa: Exploring the Viability of Pretrained Multilingual Language Models for Low-resourced Languages☆82May 31, 2022Updated 3 years ago
- ☆12Mar 7, 2022Updated 4 years ago
- MasakhaNEWS: News Topic Classification for African Languages☆25May 12, 2024Updated last year
- Crosslingual Question Answering for African Languages☆30Sep 27, 2024Updated last year
- This repository contains multi-modal speech data for African languages that can be used to train ASR and NLP models☆17Aug 31, 2022Updated 3 years ago
- COMET for African languages☆10Jan 24, 2025Updated last year
- KnowMAN: Weakly Supervised Multinomial Adversarial Networks☆12Nov 9, 2021Updated 4 years ago
- Meta Representation Transformation for Low-resource Cross-lingual Learning☆41May 5, 2021Updated 4 years ago
- Machine Translation for Africa☆312Jun 14, 2022Updated 3 years ago
- MENYO-20k Corpus in "The Effect of Domain and Diacritics in Yorùbá-English Neural Machine Translation" in MT Summit 2021☆13Jan 16, 2023Updated 3 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.☆114Apr 26, 2024Updated last year
- Code for "BERTifying the Hidden Markov Model for Multi-Source Weakly Supervised Named Entity Recognition"☆32Jun 20, 2023Updated 2 years ago
- Code for ACL 2022 paper "Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation"☆30Apr 2, 2022Updated 3 years ago
- AfriSenti-SemEval Shared Task 12: Sentiment Analysis for African languages: https://afrisenti-semeval.github.io/☆49Jan 10, 2024Updated 2 years ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa.☆42Oct 13, 2022Updated 3 years ago
- All our community docs! Start here! Let's put Africa on the NLP Map☆67Apr 16, 2024Updated last year
- SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects☆23Jan 26, 2025Updated last year
- Multilingual Open Text☆25May 8, 2025Updated 10 months ago
- ☆43Jan 3, 2022Updated 4 years ago
- ☆11Jul 12, 2021Updated 4 years ago
- A parameter-efficient compression model architecture for a variety of NLP tasks at BERT level performance at a fraction of the computatio…☆10Jan 25, 2026Updated last month
- This repository contains the corpora and supplementary data, along with instructions for recreating the experiments, for our paper: "End-…☆90Feb 14, 2020Updated 6 years ago
- Load What You Need: Smaller Multilingual Transformers for PyTorch and TensorFlow 2.0.☆105May 20, 2022Updated 3 years ago
- Towards developing a Robust Translation Model for African languages: Pilot Project FFR v1.0.☆44May 12, 2024Updated last year
- [Konvens21] This repository contains the DFKI MobIE Corpus, a dataset of 3,232 German-language documents that have been annotated with fi…☆12Sep 17, 2024Updated last year
- Getting Great Expectations setup to run on DataBricks with Spark Dataframes.☆13Jun 2, 2022Updated 3 years ago
- Crawling engine that crawls a set of top-level domains looking for documents in a list of languages☆11Feb 6, 2024Updated 2 years ago
- 🕸 GlotWeb: Web Indexing for Minority Languages (WWW 2026)☆17Feb 27, 2026Updated last week
- Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.☆88Sep 12, 2024Updated last year
- Source code for ACL 2022 paper "Self-contrastive Decorrelation for Sentence Embeddings".☆26Mar 10, 2025Updated 11 months ago
- Code for our paper accepted at EMNLP 2023 (Findings)☆14Jan 5, 2024Updated 2 years ago
- ☆12Dec 6, 2024Updated last year
- AllenNLP integration for Shiba: Japanese CANINE model☆12Jun 26, 2021Updated 4 years ago
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way☆18Nov 4, 2025Updated 4 months ago
- Notes and assignments from the Self-Driving Cars Specialization from the University of Toronto on Coursera.☆17Dec 15, 2024Updated last year