UBC-NLP / afrolid
AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.
★32 · Updated 7 months ago
Alternatives and similar repositories for afrolid
Users who are interested in afrolid are comparing it to the libraries listed below.
- Resource and Tool for Writing System Identification -- LREC 2024 ★20 · Updated last year
- ★49 · Updated last year
- NTREX -- News Test References for MT Evaluation ★85 · Updated last year
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ★52 · Updated 3 weeks ago
- 🕸 GlotWeb: Web Indexing for Low-Resource Languages -- under construction. ★15 · Updated 2 months ago
- Pipeline component for spaCy (and other spaCy-wrapped parsers such as spacy-stanza and spacy-udpipe) that adds CoNLL-U properties to a Do… ★82 · Updated last year
- Creating super-parallel corpora of more than 1,500 unique languages for NLP research ★34 · Updated 2 years ago
- ParaNames: A multilingual resource for parallel names ★37 · Updated last year
- A survey of corpora for Germanic low-resource languages and dialects ★25 · Updated 10 months ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ★75 · Updated 6 months ago
- Multilingual Open Text ★25 · Updated 5 months ago
- ★113 · Updated 2 weeks ago
- Curriculum training ★18 · Updated 4 months ago
- Augmenty is an augmentation library based on spaCy for augmenting texts. ★156 · Updated last year
- Code and models used in "MUSS Multilingual Unsupervised Sentence Simplification by Mining Paraphrases". ★99 · Updated 2 years ago
- Statistics on multilingual datasets ★17 · Updated 3 years ago
- These are lists for a variety of languages containing words that are distinctive to each language. ★38 · Updated 3 years ago
- A tiny BERT for low-resource monolingual models ★31 · Updated 3 weeks ago
- 🧪 Cutting-edge experimental spaCy components and features ★102 · Updated last year
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa. ★40 · Updated 3 years ago
- OpusFilter - Parallel corpus processing toolkit ★110 · Updated 3 weeks ago
- ★64 · Updated 2 years ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages. ★111 · Updated last year
- Bilingual term extractor ★58 · Updated this week
- Python Finite-State Toolkit ★58 · Updated last week
- A module to compute textual lexical richness (aka lexical diversity). ★111 · Updated 2 years ago
- This repository contains the HiNER dataset released with our paper at LREC 2022 ★15 · Updated 2 years ago
- ★45 · Updated 3 years ago
- ★20 · Updated 3 years ago
- A Word Sense Disambiguation system integrating implicit and explicit external knowledge. ★69 · Updated 4 years ago