AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.
☆39Feb 5, 2026Updated 4 months ago
Alternatives and similar repositories for afrolid
Users that are interested in afrolid are comparing it to the libraries listed below.
- [WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages☆17Apr 14, 2026Updated 2 months ago
- COMET for African languages☆11Jan 24, 2025Updated last year
- Bayesian Assessment of Hypotheses☆26Jul 6, 2023Updated 2 years ago
- Repository for "Self-Distillation for Model Stacking Unlocks Cross-Lingual NLU in 200+ Languages"☆15Oct 4, 2024Updated last year
- Neural Machine Translation for South African Languages☆40Dec 8, 2022Updated 3 years ago
- 🔢 Work with static vector models☆39Apr 21, 2025Updated last year
- Targeted language identifier, based on FastText and Hunspell.☆38Sep 4, 2025Updated 10 months ago
- Statistics on multilingual datasets☆17Jul 12, 2022Updated 3 years ago
- Evaluate language models using multiple choice items☆13Mar 6, 2026Updated 3 months ago
- ☆12Jan 2, 2024Updated 2 years ago
- ☆10May 11, 2024Updated 2 years ago
- NTREX -- News Test References for MT Evaluation☆87Jun 5, 2024Updated 2 years ago
- Semantically Search Emojis From the Command Line!☆13Nov 26, 2023Updated 2 years ago
- LLM-only topic extraction and classification☆11Jun 3, 2026Updated last month
- TURJUMAN, a neural toolkit for translating from 20 languages into Modern Standard Arabic (MSA).☆58Apr 9, 2023Updated 3 years ago
- A fast python implementation of the SimHash algorithm.☆27Oct 27, 2021Updated 4 years ago
- Code and data related to "Efficient, Compositional, Order-Sensitive n-gram Embeddings" (EACL 2017)☆15Apr 6, 2017Updated 9 years ago
- Data Collection System For NLP/Speech Recognition☆25Apr 20, 2021Updated 5 years ago
- A flexible chatting tool like slack☆17Aug 21, 2021Updated 4 years ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023)☆77Apr 1, 2025Updated last year
- Simple tool for generating tokens with open source transformers and/or calculate per-token surprisal.☆14Apr 15, 2026Updated 2 months ago
- [LREC 2024] 🖋 Resource and Tool for Writing System Identification☆22Mar 29, 2026Updated 3 months ago
- Hieroglyphs Everywhere fonts☆25Nov 28, 2021Updated 4 years ago
- Source stories from the African Storybook Project in Markdown format☆22Jan 25, 2026Updated 5 months ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models.☆58Feb 3, 2026Updated 5 months ago
- Morpha lex stemmer converted using jflex.☆24Oct 12, 2020Updated 5 years ago
- Benchmark Arabic text diacritization dataset☆78Apr 7, 2026Updated 2 months ago
- ☆14Jun 25, 2024Updated 2 years ago
- A library for data streaming and augmentation☆22May 5, 2025Updated last year
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r…☆19Mar 23, 2024Updated 2 years ago
- scripts for working with open.bible data☆26Jan 24, 2022Updated 4 years ago
- [ACL 2023] Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages☆107Apr 14, 2026Updated 2 months ago
- A reordering tool for machine translation.☆15May 3, 2019Updated 7 years ago
- Translation of query languages to serialized KoralQuery protocol☆15Jun 4, 2026Updated last month
- Meedan's Open Source Arabic/English Translation Memory☆33Nov 4, 2009Updated 16 years ago
- ☆17Dec 11, 2024Updated last year
- ☆13Jul 25, 2024Updated last year
- Datasets with text data for use in NLP, text analysis, information extraction, ML research.☆16Feb 1, 2019Updated 7 years ago
- A python library for constructing, modifying and publishing scientific workflows described using semantic technologies.☆15Aug 6, 2025Updated 10 months ago