Aranizer: A Custom Tokenizer based on SentencePiece and BPE tailored for Arabic Language Modeling
☆22Aug 4, 2024Updated last year
Alternatives and similar repositories for aranizer
Users that are interested in aranizer are comparing it to the libraries listed below.
- Arabic News Stance Corpus☆11Feb 5, 2021Updated 5 years ago
- A comprehensive list of Arabic NLP resources.☆46Sep 7, 2025Updated 7 months ago
- ☆56Jul 21, 2024Updated last year
- Intuitive graphical representation of source code☆14Mar 15, 2023Updated 3 years ago
- Exploratory Data Analysis in Scala☆11Sep 25, 2020Updated 5 years ago
- Scripts to finetune the official implementation of OpenAI's Whisper model☆25Apr 14, 2026Updated 3 weeks ago
- A simple semi-supervised approach for creating Hugging Face data loader scripts and uploading them to the Hub.☆11Jun 23, 2024Updated last year
- UBC ARBERT and MARBERT Deep Bidirectional Transformers for Arabic☆117Sep 2, 2021Updated 4 years ago
- Instruction dataset for Arabic with 10,000 instruction and output pairs. CIDAR can be used to fine-tune LLMs to follow instructions.☆46Apr 3, 2025Updated last year
- Get the Tunisian translation, audio, and a sample sentence for the 20,000 most common English words☆13Jan 20, 2024Updated 2 years ago
- ☆39Feb 1, 2025Updated last year
- A zero-config OpenAI client with support for 20+ providers, API key rotation, rate limits, optional LangChain integration and more.☆19Dec 11, 2025Updated 4 months ago
- ☆10Feb 2, 2024Updated 2 years ago
- Low-Resource Neural Machine Translation: A Benchmark for Five African Languages☆16Jul 27, 2020Updated 5 years ago
- Seamlessly integrate IoT data with AI agents, enabling the effortless parsing, processing, and utilization of IoT data streams.☆11Jan 27, 2025Updated last year
- ☆16Jun 28, 2025Updated 10 months ago
- ArSarcasm-v2 is an extension to the original ArSarcasm dataset. It was used for the shared task on sarcasm detection and sentiment analys…☆12Jan 26, 2022Updated 4 years ago
- Python interface for evaluation on ChatGPT models☆19Feb 13, 2024Updated 2 years ago
- ☆12Jun 6, 2020Updated 5 years ago
- Code for the ACL conference paper "An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering"☆17Apr 22, 2021Updated 5 years ago
- Building a verification system based on the finger vein image is a hard problem since it goes through many steps including vein detection…☆10Jan 6, 2020Updated 6 years ago
- ☆18Mar 12, 2024Updated 2 years ago
- One of the problems faced concerning Arabic fake news detection is the scarcity of Arabic datasets. We believe it is important to availab…☆11Jun 13, 2022Updated 3 years ago
- Arabic edition of ALBERT pretrained language models☆16Apr 25, 2021Updated 5 years ago
- ☆12Dec 29, 2016Updated 9 years ago
- The official submission from Speech Squad team for the MTC-AIC 2 competition of 2024 where an ASR model is developed tailored for the Egy…☆18Mar 9, 2026Updated last month
- Arabic Text to Speech☆18Jun 3, 2015Updated 10 years ago
- A neural and statistical engine for accurately adding diacritics (Tashkeel) to Arabic text. First-place winner on Kaggle 🥇☆18May 29, 2025Updated 11 months ago
- The system enables sophisticated coordination of multiple drones through natural language commands, visual inputs, and real-time environm…☆16Dec 15, 2025Updated 4 months ago
- Deep visual speech recognition for Arabic words☆16Oct 18, 2023Updated 2 years ago
- ☆20May 25, 2024Updated last year
- The dataset for the paper "Machamp: A Generalized Entity Matching Benchmark" published in CIKM 2021☆21Oct 18, 2021Updated 4 years ago
- Communication Relay by creating a WiFi Mesh Network using ROS, and using that network for Data Telemetry, with Telemetry radios ( Ubiquit…☆11Dec 18, 2018Updated 7 years ago
- Simple GUI to batch create DVHs from multiple dicom-RT dose files and a single Dicom-RT contour file☆12Aug 5, 2021Updated 4 years ago
- An app that extends the bluetooth comms☆11May 31, 2023Updated 2 years ago
- ☆11Sep 27, 2024Updated last year
- The RobEn AI team's homemade Discord bot, custom-made for the AI team's Discord server.☆16May 26, 2021Updated 4 years ago
- OdooSense is an AI tool that optimizes business processes in Odoo ERP. It's a virtual AI boss providing real-time data-driven insights an…☆15Apr 26, 2023Updated 3 years ago
- ☆11Apr 26, 2023Updated 3 years ago