Aranizer: A Custom Tokenizer based on SentencePiece and BPE tailored for Arabic Language Modeling
☆22Aug 4, 2024Updated last year
Alternatives and similar repositories for aranizer
Users that are interested in aranizer are comparing it to the libraries listed below.
- Arabic News Stance Corpus☆11Feb 5, 2021Updated 5 years ago
- A comprehensive list of Arabic NLP resources.☆46Sep 7, 2025Updated 6 months ago
- Intuitive graphical representation of source code☆14Mar 15, 2023Updated 3 years ago
- Scripts to finetune the official implementation of OpenAI's Whisper model☆24Jul 6, 2025Updated 8 months ago
- A simple semi-supervised approach for creating Hugging Face dataset loading scripts and uploading them to the Hub.☆11Jun 23, 2024Updated last year
- End-to-End Arabic ASR using DeepSpeech engine☆14Nov 2, 2021Updated 4 years ago
- UBC ARBERT and MARBERT Deep Bidirectional Transformers for Arabic☆114Sep 2, 2021Updated 4 years ago
- Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions.☆46Apr 3, 2025Updated 11 months ago
- Comprehensive list of resources for automated processing of Tunisian dialect text.☆19Mar 15, 2024Updated 2 years ago
- ☆10Feb 2, 2024Updated 2 years ago
- LOW-RESOURCE NEURAL MACHINE TRANSLATION: A BENCHMARK FOR FIVE AFRICAN LANGUAGES☆16Jul 27, 2020Updated 5 years ago
- Seamlessly integrate IoT data with AI agents, enabling the effortless parsing, processing, and utilization of IoT data streams.☆11Jan 27, 2025Updated last year
- ArSarcasm-v2 is an extension to the original ArSarcasm dataset. It was used for the shared task on sarcasm detection and sentiment analys…☆12Jan 26, 2022Updated 4 years ago
- Files needed to build Linux images for the Fydetab Duo☆16Feb 25, 2026Updated last month
- ☆12Jun 6, 2020Updated 5 years ago
- Code for the ACL conference paper "An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering"☆17Apr 22, 2021Updated 4 years ago
- One of the problems faced concerning Arabic fake news detection is the scarcity of Arabic datasets. We believe it is important to availab…☆10Jun 13, 2022Updated 3 years ago
- Arabic edition of ALBERT pretrained language models☆16Apr 25, 2021Updated 4 years ago
- Arabic Text to Speech☆18Jun 3, 2015Updated 10 years ago
- A neural and statistical engine for accurately adding diacritics (Tashkeel) to Arabic text. First-place winner on Kaggle 🥇☆18May 29, 2025Updated 9 months ago
- The system enables sophisticated coordination of multiple drones through natural language commands, visual inputs, and real-time environm…☆16Dec 15, 2025Updated 3 months ago
- Traditional operating systems are reactive - they wait for user input or system events before taking action. SwarmOS breaks this paradigm…☆15Dec 6, 2024Updated last year
- The dataset for the paper "Machamp: A Generalized Entity Matching Benchmark" published in CIKM 2021☆21Oct 18, 2021Updated 4 years ago
- Multi-threading, Concurrency, Asynchrony, and various Execution Methods implemented in a Rust backend for bleeding edge performance.☆20Nov 11, 2024Updated last year
- Simple GUI to batch create DVHs from multiple dicom-RT dose files and a single Dicom-RT contour file☆12Aug 5, 2021Updated 4 years ago
- An app that extends the bluetooth comms☆11May 31, 2023Updated 2 years ago
- ☆11Sep 27, 2024Updated last year
- OdooSense is an AI tool that optimizes business processes in Odoo ERP. It's a virtual AI boss providing real-time data-driven insights an…☆15Apr 26, 2023Updated 2 years ago
- This comprehensive course teaches students how to build, deploy, and manage autonomous agents for enterprise workflows using the Swarms l…☆17Dec 22, 2025Updated 3 months ago
- Resk is a robust Python library designed to enhance security and manage context when interacting with LLMs. It provides a protective …☆16Dec 19, 2025Updated 3 months ago
- ☆11Apr 26, 2023Updated 2 years ago
- Arabic-language questions focused on Saudi culture, tested on a number of large language models (LLMs)☆18Jan 22, 2025Updated last year
- Using reinforcement learning to minimize fuel consumption when landing a rover on Mars☆12Mar 21, 2022Updated 4 years ago
- Implementation of DVH prediction from the (contoured) anatomical scans ...☆11Jun 20, 2016Updated 9 years ago
- AIN - The First Arabic Inclusive Large Multimodal Model. It is a versatile bilingual LMM excelling in visual and contextual understanding…☆52Mar 13, 2025Updated last year
- A remote controller app for controlling a Pixhawk-based drone over LTE☆11Jan 4, 2025Updated last year
- The official implementation of CATT Arabic diacritization models.☆67Jul 18, 2025Updated 8 months ago
- CAMeL Dataset☆15Apr 15, 2025Updated 11 months ago
- Official code for PLoP☆17Mar 6, 2026Updated 2 weeks ago