Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions.
☆45Apr 3, 2025Updated 10 months ago
Alternatives and similar repositories for CIDAR
Users who are interested in CIDAR are comparing it to the repositories listed below
- Arabic Tokenization Library. It provides many tokenization algorithms.☆110Jan 4, 2024Updated 2 years ago
- Arabic poetry analysis and generation.☆23Jul 23, 2023Updated 2 years ago
- ArabicaQA: Comprehensive Dataset for Arabic Question Answering accepted at SIGIR 2024☆18Jul 28, 2024Updated last year
- Python interface for evaluation on ChatGPT models☆19Feb 13, 2024Updated 2 years ago
- A PyTorch RNN model to generate Arabic poems☆17May 30, 2025Updated 9 months ago
- Aranizer: A Custom Tokenizer based on SentencePiece and BPE tailored for Arabic Language Modeling☆21Aug 4, 2024Updated last year
- Explore the content of Arabic text datasets.☆19May 23, 2022Updated 3 years ago
- ☆40Dec 25, 2022Updated 3 years ago
- A simple strategy for training and finetuning NLP models for Arabic. Specify the parameters and just wait for the results. A simple desig…☆22Jan 27, 2024Updated 2 years ago
- This repository contains copies of the major repositories that contain Arabic QA datasets that follow the SQuAD format☆12Sep 2, 2020Updated 5 years ago
- This is the official repository for Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks.☆26Dec 9, 2024Updated last year
- The largest public catalogue for Arabic NLP and speech datasets. There are +500 datasets annotated with more than 25 attributes.☆194Jan 30, 2026Updated last month
- A data preprocessor for the Quranic Treebank using neural networks. Divides longer verses into smaller chunks.☆12Jul 4, 2023Updated 2 years ago
- Code and models for "The Interplay of Variant, Size, and Task Type in Arabic Pre-trained Language Models". EACL 2021, WANLP.☆55Jun 21, 2024Updated last year
- Implementation of many Arabic NLP and CV projects. Providing real time experience using many interfaces like web, command line and notebo…☆422Mar 1, 2024Updated 2 years ago
- A neural and statistical engine for accurately adding diacritics (Tashkeel) to Arabic text. First-place winner on Kaggle 🥇☆18May 29, 2025Updated 9 months ago
- CAMeL Dataset☆15Apr 15, 2025Updated 10 months ago
- Arabic Art using GANs☆17Aug 3, 2022Updated 3 years ago
- A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.☆529Feb 11, 2026Updated 2 weeks ago
- Pre-trained Transformers for Arabic Language Understanding and Generation (Arabic BERT, Arabic GPT2, Arabic ELECTRA)☆711Oct 17, 2022Updated 3 years ago
- ☆40Apr 20, 2019Updated 6 years ago
- Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation". EMNLP 2023.☆17Aug 29, 2024Updated last year
- This is a repository of the Multi-dialect Arabic BERT model.☆38Jul 14, 2020Updated 5 years ago
- A Python package made to generate sequences (greedy and beam-search) from PyTorch (not necessarily HF transformers) models.☆18Dec 12, 2025Updated 2 months ago
- Arabic speech recognition, classification and text-to-speech.☆424Sep 30, 2023Updated 2 years ago
- Arabic Language Model based on BERT☆19Mar 22, 2020Updated 5 years ago
- ☆55Jul 21, 2024Updated last year
- pyarabic☆478Jan 16, 2026Updated last month
- This repository contains the Arabic sarcasm dataset (ArSarcasm)☆26Feb 18, 2021Updated 5 years ago
- ☆65Jul 10, 2025Updated 7 months ago
- TURJUMAN, a neural toolkit for translating from 20 languages into Modern Standard Arabic (MSA).☆56Apr 9, 2023Updated 2 years ago
- Python package for Arabic natural language processing☆28Jun 12, 2019Updated 6 years ago
- SnapDocs - A Modern, Open-Source Document Workspace☆24Sep 7, 2025Updated 5 months ago
- Seq2Seq-based open domain empathetic conversational model for Arabic: Dataset & Model☆59Feb 25, 2025Updated last year
- Tashaphyne: Arabic Light Stemmer☆103Sep 2, 2024Updated last year
- Arabic Open Domain Question Answering System using Neural Reading Comprehension☆165Aug 4, 2023Updated 2 years ago
- Ghalatawi: Arabic Autocorrect library☆30May 3, 2024Updated last year
- ☆25Nov 13, 2022Updated 3 years ago
- UBC ARBERT and MARBERT Deep Bidirectional Transformers for Arabic☆114Sep 2, 2021Updated 4 years ago