A small Python script that transliterates Arabic text using the Buckwalter transliteration scheme. Options control whether each type of diacritic and character is included in the output or ignored. Useful for NLP experiments where you may want to normalize text.
☆26 · Apr 3, 2014 · Updated 11 years ago
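The description above can be illustrated with a minimal sketch. This is not the repository's code: the names `LETTERS`, `DIACRITICS`, `transliterate`, and the `keep_diacritics` flag are hypothetical, chosen only to show how a Buckwalter mapping with an optional diacritic filter might look. The table follows the commonly published Buckwalter scheme, and Arabic characters are written as Unicode escapes to avoid bidirectional rendering issues.

```python
# Illustrative sketch only; names and structure are assumptions,
# not the API of the Buckwalter repository.

# Consonants, hamza forms, and other base characters.
LETTERS = {
    "\u0621": "'", "\u0622": "|", "\u0623": ">", "\u0624": "&",
    "\u0625": "<", "\u0626": "}", "\u0627": "A", "\u0628": "b",
    "\u0629": "p", "\u062A": "t", "\u062B": "v", "\u062C": "j",
    "\u062D": "H", "\u062E": "x", "\u062F": "d", "\u0630": "*",
    "\u0631": "r", "\u0632": "z", "\u0633": "s", "\u0634": "$",
    "\u0635": "S", "\u0636": "D", "\u0637": "T", "\u0638": "Z",
    "\u0639": "E", "\u063A": "g", "\u0640": "_", "\u0641": "f",
    "\u0642": "q", "\u0643": "k", "\u0644": "l", "\u0645": "m",
    "\u0646": "n", "\u0647": "h", "\u0648": "w", "\u0649": "Y",
    "\u064A": "y", "\u0671": "{",
}

# Short vowels, tanween, shadda, sukun, and dagger alif.
DIACRITICS = {
    "\u064B": "F", "\u064C": "N", "\u064D": "K", "\u064E": "a",
    "\u064F": "u", "\u0650": "i", "\u0651": "~", "\u0652": "o",
    "\u0670": "`",
}


def transliterate(text, keep_diacritics=True):
    """Map Arabic characters to Buckwalter ASCII symbols.

    Characters outside both tables (spaces, digits, Latin text)
    pass through unchanged. When keep_diacritics is False,
    diacritics are dropped instead of transliterated.
    """
    out = []
    for ch in text:
        if ch in LETTERS:
            out.append(LETTERS[ch])
        elif ch in DIACRITICS:
            if keep_diacritics:
                out.append(DIACRITICS[ch])
        else:
            out.append(ch)
    return "".join(out)


# The vocalized word "kitab" (book): kaf, kasra, ta, fatha, alif, ba.
word = "\u0643\u0650\u062A\u064E\u0627\u0628"
print(transliterate(word))                         # kitaAb
print(transliterate(word, keep_diacritics=False))  # ktAb
```

Dropping diacritics before transliteration is one simple form of the normalization the description mentions: vocalized and unvocalized spellings of the same word collapse to the same ASCII string.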
Alternatives and similar repositories for Buckwalter
Users that are interested in Buckwalter are comparing it to the libraries listed below.
- Arabic vocalized text corpus ☆14 · Jan 2, 2015 · Updated 11 years ago
- Comparable documents miner: Arabic-English morphological analysis, text processing, n-gram features extraction, POS tagging, dictionary t… ☆35 · Apr 24, 2017 · Updated 8 years ago
- All resources created and used in Arabic Sentiment Analysis of Arabic Tweets. Includes Sentiment lexicon generated from Arabic tweets and… ☆14 · Dec 21, 2021 · Updated 4 years ago
- A Ruby gem that contains Natural Language Processing tools for Arabic. ☆11 · May 11, 2015 · Updated 10 years ago
- Arabic inflectional morphology generator ☆35 · Aug 28, 2024 · Updated last year
- Convert Arabic diacritised text to a sequence of phonemes and create a pronunciation dictionary from them for alignment using HTK ☆63 · Jun 14, 2017 · Updated 8 years ago
- ☆30 · Feb 1, 2020 · Updated 6 years ago
- A desktop version of Edward Lane's Arabic-English Lexicon ☆21 · Apr 21, 2018 · Updated 7 years ago
- Repository for the project of building a large Arabic multidomain lexicon for sentiment analysis using feature selection from multiple reso… ☆16 · Jan 21, 2015 · Updated 11 years ago
- Python (Cython) binding for HarfBuzz, an OpenType text shaping engine. ☆19 · Aug 24, 2018 · Updated 7 years ago
- TEAD: Large Scale Arabic Dataset for Sentiment Analysis ☆12 · Oct 16, 2018 · Updated 7 years ago
- Experimenting with Sentiment Analysis in Arabic ☆10 · Aug 31, 2014 · Updated 11 years ago
- Dictionary app that allows you to look up Arabic words in transliteration ☆62 · Feb 17, 2026 · Updated last month
- ElixirFM Functional Arabic Morphology ☆45 · Mar 15, 2023 · Updated 3 years ago
- Arabic Phonetic Dictionary Generator Tool for Automatic Speech Recognition Applications ☆12 · Oct 27, 2021 · Updated 4 years ago
- Nile University's Arabic sentiment Lexicon ☆17 · Nov 24, 2016 · Updated 9 years ago
- Arabic Dialect Identification on AOC data. ☆24 · Mar 2, 2019 · Updated 7 years ago
- Jabalín is an application for generating verbs in Modern Standard Arabic. The application is implemented in Python 3. Th… ☆12 · Jul 12, 2015 · Updated 10 years ago
- YaraSpell is a simplified Arabic spell checker ☆46 · Feb 20, 2017 · Updated 9 years ago
- WAZEN is an Arabic NLP text utility to find word variation patterns. ☆15 · Sep 18, 2021 · Updated 4 years ago
- This buckwalter2unicode script is designed to convert Arabic text that has been transliterated to ASCII symbols using the Buckwalter Tran… ☆13 · Sep 30, 2012 · Updated 13 years ago
- Benchmark Arabic text diacritization dataset ☆77 · Jul 26, 2019 · Updated 6 years ago
- The Arabic NLP Python Library (Archived in favor of Matn library) ☆11 · Apr 28, 2017 · Updated 8 years ago
- ☆12 · May 21, 2020 · Updated 5 years ago
- Extract dates from text ☆66 · Jan 27, 2021 · Updated 5 years ago
- Arabic support for textblob ☆86 · Oct 21, 2021 · Updated 4 years ago
- Python transliteration library (mostly from non-Latin scripts, such as Arabic, Japanese, etc.) ☆20 · Dec 31, 2018 · Updated 7 years ago
- Arabic NLP tool used to perform Text Search, POS tagging, Translation, auto-diacritization, etc. ☆90 · Feb 7, 2021 · Updated 5 years ago
- JavaScript Arabic Stemmer ☆26 · Dec 1, 2012 · Updated 13 years ago
- This repository ☆30 · Nov 13, 2022 · Updated 3 years ago
- Arabic Dialectal Offensive Language dataset from social media comments on news posts from Facebook, Twitter, and YouTube ☆18 · Sep 25, 2020 · Updated 5 years ago
- This is a diacritization model for the Arabic language. This model was built/trained using the Tashkeela: the Arabic diacritization corpus on… ☆45 · Sep 10, 2023 · Updated 2 years ago
- Mono-width companion to Amiri font family ☆31 · Jul 29, 2025 · Updated 7 months ago
- The first Dialectal Arabic Code Switching - DACS corpus from broadcast speech. Annotated at the token-level, considering both the linguis… ☆15 · Apr 3, 2022 · Updated 3 years ago
- Contextualised Word Representations for Lexical Semantic Change Analysis ☆32 · Jul 17, 2020 · Updated 5 years ago
- Pronounce Arabic words ☆19 · May 27, 2019 · Updated 6 years ago
- Arabic Parser Using Stanford API ☆12 · Nov 11, 2017 · Updated 8 years ago
- A JavaScript library that extends the native String object with methods to help when dealing with Arabic strings for node and the browser… ☆56 · Sep 12, 2018 · Updated 7 years ago
- YouTube comment topic modeling and sentiment analyzer ☆16 · Oct 25, 2022 · Updated 3 years ago