AfroLID, a powerful neural toolkit for African language identification that covers 517 African languages.
☆35 · Feb 5, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for afrolid
Users who are interested in afrolid are comparing it to the libraries listed below.
- SERENGETI: Massively Multilingual Language Models for Africa ☆17 · Oct 26, 2023 · Updated 2 years ago
- GlotWeb: Web Indexing for Minority Languages (WWW 2026) ☆17 · Updated this week
- COMET for African languages ☆10 · Jan 24, 2025 · Updated last year
- The easiest way to update static sites hosted on GitHub Pages with a visual editor ☆11 · Mar 28, 2018 · Updated 7 years ago
- Work with static vector models ☆37 · Apr 21, 2025 · Updated 10 months ago
- Evaluate language models using multiple choice items ☆13 · Updated this week
- Targeted language identifier, based on FastText and Hunspell. ☆38 · Sep 4, 2025 · Updated 6 months ago
- Statistics on multilingual datasets ☆17 · Jul 12, 2022 · Updated 3 years ago
- Repository accompanying "An Open Dataset and Model for Language Identification" (Burchell et al., 2023) ☆74 · Apr 1, 2025 · Updated 11 months ago
- Data Collection System For NLP/Speech Recognition ☆25 · Apr 20, 2021 · Updated 4 years ago
- Source stories from the African Storybook Project in Markdown format ☆22 · Jan 25, 2026 · Updated last month
- Bayesian Assessment of Hypotheses ☆26 · Jul 6, 2023 · Updated 2 years ago
- OpusCleaner is a web interface that helps you select, clean and schedule your data for training machine translation models. ☆57 · Feb 3, 2026 · Updated last month
- Library for fast text representation and classification. ☆31 · Jan 9, 2024 · Updated 2 years ago
- A Directory of Online Newspaper Sources for 70+ Languages ☆31 · Apr 15, 2021 · Updated 4 years ago
- NTREX -- News Test References for MT Evaluation ☆88 · Jun 5, 2024 · Updated last year
- Repository for React Fundamentals classroom demonstration contacts app ☆11 · Nov 19, 2024 · Updated last year
- A parallel evaluation data set of SAP software documentation with document structure annotation ☆14 · Jul 30, 2025 · Updated 7 months ago
- Machine translation (MT) benchmark dataset for languages in the Horn of Africa. ☆42 · Oct 13, 2022 · Updated 3 years ago
- Creating super-parallel corpora of more than 1500+ unique languages for NLP research ☆34 · Dec 8, 2022 · Updated 3 years ago
- Facebook post remake with HTML, CSS, and JS. No frameworks or dependencies. If you're a new web learner then you must explore this open r… ☆10 · Jul 19, 2021 · Updated 4 years ago
- Utilities to gather software metrics from tools (SONAR, etc) and store them into ElasticSearch for later display using Kibana. ☆11 · Dec 31, 2017 · Updated 8 years ago
- Romanian Word Embeddings. Here you can find pre-trained corpora of word embeddings. Current methods: CBOW, Skip-Gram, Fast-Text (from Gen… ☆13 · Oct 6, 2025 · Updated 4 months ago
- Curated list of Moroccans publishing in the most prestigious AI conferences ☆10 · Oct 14, 2024 · Updated last year
- Curated list of awesome datasets for various table understanding tasks ☆18 · Sep 5, 2025 · Updated 5 months ago
- Basis of FragDenStaat.de's „Koalitionstracker“ ☆15 · Jul 14, 2025 · Updated 7 months ago
- Auto-generated trivia questions based on DBPedia data. ☆15 · Feb 26, 2017 · Updated 9 years ago
- Token-free Language Modeling with ByGPT5 & Friends! ☆12 · Jul 18, 2025 · Updated 7 months ago
- Automatic subtitles in your videos ☆11 · Mar 24, 2024 · Updated last year
- A repository for resources relating to NLP in the Balochi language ☆19 · Jun 3, 2023 · Updated 2 years ago
- A lightweight neural network library in javascript ☆12 · Dec 2, 2021 · Updated 4 years ago
- ☆38 · Apr 17, 2024 · Updated last year
- Residual Quantization Autoencoder, used for interpreting LLMs ☆14 · Jan 1, 2025 · Updated last year
- The pipeline for the OSCAR corpus ☆176 · Nov 9, 2025 · Updated 3 months ago
- Exercises for the CERN Openlab GPU lecture ☆12 · Jul 22, 2025 · Updated 7 months ago
- Web archiving utility library ☆11 · Dec 3, 2025 · Updated 3 months ago
- Moral Machine Experiment on LLMs ☆11 · Feb 2, 2026 · Updated last month
- Persian Datasets including: Wikipedia, Twitter, Hamshahri, Hellokish, NSURL'19, Peyma, Text_mining.ir ☆11 · Oct 6, 2023 · Updated 2 years ago
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆13 · Aug 10, 2023 · Updated 2 years ago