zeerakahmed / makhzan
An Urdu text corpus
☆70Updated last year
Alternatives and similar repositories for makhzan:
Users that are interested in makhzan are comparing it to the libraries listed below
- 📖 A curated list of resources dedicated to Urdu language.☆62Updated 3 years ago
- An NLP library for the Urdu language. It comes with a lot of battery included features to help you process Urdu data in the easiest way p…☆286Updated last year
- Collection of Urdu datasets for POS, NER, Sentiment, Summarization and NLP tasks.☆72Updated 5 months ago
- 📄 Complete collection of Urdu language characters & unicode code points.☆39Updated last year
- 📝 A text file containing 150,000 Urdu words for all your dictionary/word-based projects, e.g. auto-completion / autosuggestion.☆44Updated 4 years ago
- Compilation of Manually Tagged Roman Urdu Dataset (Urdu written in Latin/Roman Script), along with other helpful Roman Urdu NLP resources☆31Updated 4 years ago
- Dataset for Urdu Ghazals☆13Updated last year
- The Dakshina dataset is a collection of text in both Latin and native scripts for 12 South Asian languages. For each language, the datase…☆192Updated 4 years ago
- A LaTeX package to typeset Nepali LaTeX documents using LuaLaTeX.☆49Updated 8 months ago
- State of the art open-source translation for Indic languages.☆5Updated 3 years ago
- ☆14Updated 4 years ago
- A collaborative catalog of NLP resources for Indic languages☆567Updated last month
- A list of data sources for Nepal related data.☆28Updated 8 years ago
- Pre-processing and training scripts for the Tarteel Dataset☆193Updated 3 years ago
- Awesome List of Tamil NLP & AI Resources☆105Updated last year
- Python package for indic script transliteration☆170Updated 2 weeks ago
- Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/…☆60Updated 3 years ago
- Quran, Hadith, Translations, Tafaseer, Corpus Linguistics. Everything for NLP☆69Updated 9 months ago
- A repository for publicly/freely available Natural Language Processing (NLP) datasets for African languages.☆98Updated 8 months ago
- Describes the IndicNLP corpus and associated datasets☆158Updated last year
- The project aims to add a state-of-the-art transliteration module for cross transliterations among all Indian languages including Engl…☆261Updated 2 years ago
- Shami Dialect Corpus (SDC)☆25Updated 6 years ago
- Machine Learning datasets for Nepal☆185Updated last year
- Our submission for the Quran QA shared task. This work achieved first place among accepted papers.☆18Updated 2 weeks ago
- Xlit-Crowd: Hindi-English Transliteration Corpus☆37Updated 9 years ago
- Madina OpenType variable font☆16Updated this week
- State of the Art Language models and Classifier for Kannada, which is spoken predominantly by Kannada people in India, mainly in the stat…☆31Updated 4 years ago
- ThamizhiMorph: A Tamil Morphological Analyser and Generator☆16Updated last year
- Code, tools, and data for natural language processing in Tamil☆71Updated last week