Aranizer: A Custom Tokenizer based on SentencePiece and BPE tailored for Arabic Language Modeling
☆21Aug 4, 2024Updated last year
Alternatives and similar repositories for aranizer
Users who are interested in aranizer are comparing it to the libraries listed below
- Arabic News Stance Corpus☆11Feb 5, 2021Updated 5 years ago
- ArSarcasm-v2 is an extension to the original ArSarcasm dataset. It was used for the shared task on sarcasm detection and sentiment analys…☆12Jan 26, 2022Updated 4 years ago
- This code accompanies the ACL conference paper "An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering"☆17Apr 22, 2021Updated 4 years ago
- ☆12Jun 6, 2020Updated 5 years ago
- Arabic edition of ALBERT pretrained language models☆16Apr 25, 2021Updated 4 years ago
- Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions.☆45Apr 3, 2025Updated 11 months ago
- Python interface for evaluation on ChatGPT models☆19Feb 13, 2024Updated 2 years ago
- UBC ARBERT and MARBERT Deep Bidirectional Transformers for Arabic☆114Sep 2, 2021Updated 4 years ago
- This is a repository of the Multi-dialect Arabic BERT model.☆38Jul 14, 2020Updated 5 years ago
- Arabic-language questions focused on Saudi culture, tested on a number of large language models (LLMs)☆17Jan 22, 2025Updated last year
- ☆55Jul 21, 2024Updated last year
- Code for ACL 2021 paper: Accelerating BERT Inference for Sequence Labeling via Early-Exit☆28Aug 19, 2022Updated 3 years ago
- ☆10Nov 27, 2023Updated 2 years ago
- End to end Arabic TTS system based on tacotron☆126Apr 5, 2024Updated last year
- ☆39Feb 1, 2025Updated last year
- (أسبر) A repository for survey and review papers in Arabic Natural Language Processing (AN…☆85Feb 22, 2026Updated last week
- This is the second part of the Deep Learning Course for the Master in High-Performance Computing (SISSA/ICTP).☆33Sep 15, 2020Updated 5 years ago
- ☆35Dec 14, 2023Updated 2 years ago
- Official code for PLoP☆17Jun 30, 2025Updated 8 months ago
- One of the problems faced concerning Arabic fake news detection is the scarcity of Arabic datasets. We believe it is important to availab…☆10Jun 13, 2022Updated 3 years ago
- Conversion of audio files to text using whisper from OpenAI with a simple tkinter GUI☆10Apr 13, 2023Updated 2 years ago
- LLM Building Blocks for Python Course☆16Nov 17, 2025Updated 3 months ago
- ☆11Sep 27, 2024Updated last year
- Generative AI in Arabic☆38Aug 7, 2024Updated last year
- ☆36Jul 16, 2021Updated 4 years ago
- ☆18Jun 25, 2025Updated 8 months ago
- Sakhi, a mobile-first app tailored for women, encompasses daily journals, safety features, community, and holistic health tools. Elevate …☆11Mar 7, 2024Updated 2 years ago
- Original VinVL visual backbone with simplified APIs to easily extract features, boxes, object detections, in a few lines of Python code.☆11Nov 27, 2022Updated 3 years ago
- AI model for making mazes that extends OpenAI's GPT-2 model☆15Dec 21, 2023Updated 2 years ago
- a blog starter project☆11Oct 29, 2018Updated 7 years ago
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21)☆40Aug 2, 2021Updated 4 years ago
- ☆40Dec 25, 2022Updated 3 years ago
- AIN - The First Arabic Inclusive Large Multimodal Model. It is a versatile bilingual LMM excelling in visual and contextual understanding…☆51Mar 13, 2025Updated 11 months ago
- A comprehensive list of Arabic NLP resources.☆44Sep 7, 2025Updated 5 months ago
- News clustering algorithm. Implementation of the "Multilingual Clustering of Streaming News" paper submitted to EMNLP 2018☆38May 2, 2022Updated 3 years ago
- Character-level Recurrent Neural Network Language Model (rnnlm) implemented in PyTorch.☆12Oct 4, 2020Updated 5 years ago
- Deep Visual Speech Recognition of Arabic words☆16Oct 18, 2023Updated 2 years ago
- ☆11Jul 19, 2018Updated 7 years ago
- Combining encoder-based language models☆11Nov 11, 2021Updated 4 years ago