AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.
☆36 · Feb 5, 2026 · Updated last month
Alternatives and similar repositories for afrolid
Users that are interested in afrolid are comparing it to the libraries listed below.
- SERENGETI: Massively Multilingual Language Models for Africa ☆17 · Oct 26, 2023 · Updated 2 years ago
- 🕸 GlotWeb: Web Indexing for Minority Languages (WWW 2026) ☆17 · Feb 27, 2026 · Updated 3 weeks ago
- COMET for African languages ☆11 · Jan 24, 2025 · Updated last year
- Bayesian Assessment of Hypotheses ☆26 · Jul 6, 2023 · Updated 2 years ago
- 🔢 Work with static vector models ☆38 · Apr 21, 2025 · Updated 11 months ago
- Neural Machine Translation for South African Languages ☆40 · Dec 8, 2022 · Updated 3 years ago
- Evaluate language models using multiple choice items ☆13 · Mar 6, 2026 · Updated 2 weeks ago
- Auto-generated trivia questions based on DBPedia data. ☆15 · Feb 26, 2017 · Updated 9 years ago
- The easiest way to update static sites hosted on GitHub Pages with a visual editor ☆11 · Mar 28, 2018 · Updated 7 years ago
- ☆12 · Mar 7, 2022 · Updated 4 years ago
- Library for fast text representation and classification. ☆31 · Jan 9, 2024 · Updated 2 years ago
- ☆12 · Jan 2, 2024 · Updated 2 years ago
- English-Myanmar dictionary data ☆14 · Aug 23, 2016 · Updated 9 years ago
- ☆10 · May 11, 2024 · Updated last year
- NTREX -- News Test References for MT Evaluation ☆88 · Jun 5, 2024 · Updated last year
- Semantically Search Emojis From the Command Line! ☆13 · Nov 26, 2023 · Updated 2 years ago
- LLM-only topic extraction and classification ☆11 · Sep 20, 2024 · Updated last year
- TURJUMAN, a neural toolkit for translating from 20 languages into Modern Standard Arabic (MSA). ☆56 · Apr 9, 2023 · Updated 2 years ago
- A framework for overviewing the performance of F0 estimators ☆19 · Sep 10, 2016 · Updated 9 years ago
- A fast python implementation of the SimHash algorithm. ☆27 · Oct 27, 2021 · Updated 4 years ago
- 🖋 Resource and Tool for Writing System Identification (Unicode 17.0) -- LREC 2024 ☆21 · Feb 17, 2026 · Updated last month
- Source stories from the African Storybook Project in Markdown format ☆22 · Jan 25, 2026 · Updated 2 months ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆58 · Feb 3, 2026 · Updated last month
- Generating text from RDF data with sequence to sequence models ☆12 · Jul 25, 2018 · Updated 7 years ago
- A model for unsupervised morphological analysis that integrates orthographic and semantic views of words. ☆13 · Oct 10, 2023 · Updated 2 years ago
- A Directory of Online Newspaper Sources for 70+ Languages ☆31 · Apr 15, 2021 · Updated 4 years ago
- Script to convert all MP4 videos in a zip archive to JPG frames at a desired FPS with unique names. It will then retrain the top layers o… ☆12 · Jul 6, 2016 · Updated 9 years ago
- Experimenting with Hierarchical Attention Networks from https://arxiv.org/abs/1606.02393 in Keras ☆13 · Oct 12, 2016 · Updated 9 years ago
- A library for data streaming and augmentation ☆21 · May 5, 2025 · Updated 10 months ago
- Showcase how mxbai-embed-large-v1 can be used to produce binary embedding. Binary embeddings enabled 32x storage savings and 40x faster r… ☆19 · Mar 23, 2024 · Updated 2 years ago
- ☆20 · Jul 22, 2022 · Updated 3 years ago
- Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023 ☆106 · Apr 20, 2024 · Updated last year
- This is an analytical project done using python to process and extract valuable insights from WhatsApp text file, deployed as a webapp us… ☆19 · Dec 8, 2023 · Updated 2 years ago
- ☆17 · Dec 11, 2024 · Updated last year
- Assignments for AML course @ UvA. Fall 2017 ☆14 · Nov 22, 2017 · Updated 8 years ago
- ☆13 · Jul 25, 2024 · Updated last year
- datasets with text data for use in NLP, Text analysis, information extraction, ML research. ☆16 · Feb 1, 2019 · Updated 7 years ago
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German ☆12 · Feb 27, 2023 · Updated 3 years ago
- Source code for ASRU 2019 paper "Adapting Pretrained Transformer to Lattices for Spoken Language Understanding" ☆10 · Jul 8, 2020 · Updated 5 years ago