A character-wise tokenizer for morphologically rich languages
☆31Sep 28, 2025Updated 8 months ago
Alternatives and similar repositories for RFTokenizer
Users interested in RFTokenizer are comparing it to the libraries listed below.
- An NLP pipeline for Hebrew☆41Jun 16, 2025Updated last year
- Named Entity (NER) annotations of the Hebrew Treebank (Haaretz newspaper) corpus, including: morpheme and token level NER labels, nested …☆11Dec 27, 2021Updated 4 years ago
- A simple configurable tool for manipulating dependency trees.☆14Dec 25, 2024Updated last year
- Arabic inflectional morphology generator☆35Aug 28, 2024Updated last year
- A fork of languagetool to maintain Arabic☆18Mar 22, 2025Updated last year
- Repository for DISRPT2023 shared task☆17Jul 26, 2024Updated last year
- Public repository for Coptic SCRIPTORIUM Corpora Releases☆45Jun 10, 2026Updated last week
- Scripts for compatibilitising between VISL-CG3, Apertium, CoNLL-X and Universal Dependencies☆17Mar 4, 2020Updated 6 years ago
- Dataset of the Samaritan Pentateuch☆13May 13, 2026Updated last month
- Fast corpus search engine originally made for the Corpus of Written Tatar language☆17Nov 9, 2019Updated 6 years ago
- Repository for DISRPT2019 shared task☆12Sep 5, 2022Updated 3 years ago
- A very simple Python tokenizer for Hebrew text.☆26Nov 13, 2021Updated 4 years ago
- Tools for splitting, normalizing, text-shaping Arabic script☆12Jun 23, 2024Updated last year
- Arabic named entity recognition using the AnerCorp corpus (location, organisation, person, miscellaneous word)☆37Jul 28, 2017Updated 8 years ago
- Pure Python, embedded, fast, schema-less NoSQL database☆12Aug 1, 2020Updated 5 years ago
- Sentiment analysis models for Arabic tweets that classify Twitter comments as positive, negative, or neutral.☆13Mar 17, 2018Updated 8 years ago
- Yaziji : Arabic phrase generator☆17Jan 2, 2025Updated last year
- A memory-based morphological parser for Python☆16Oct 12, 2012Updated 13 years ago
- Repository for rstWeb, a browser based annotation interface for Rhetorical Structure Theory☆47Aug 15, 2025Updated 10 months ago
- eXternally configurable REference and Non Named Entity Recognizer☆17Jun 17, 2024Updated 2 years ago
- Dead Sea Scrolls in TF format based on Abegg's data☆31Apr 22, 2026Updated last month
- A JavaScript library that replaces Latin characters with Arabic characters as you type (and vice versa), with a flexible API☆41Oct 22, 2019Updated 6 years ago
- The complete [1 to 5]-gram Gumar Corpus in the style of Google n-grams.☆12Feb 5, 2020Updated 6 years ago
- A field-tested Hebrew tokenizer for dirty texts (ben-yehuda project, bible, cc100, mc4, opensubs, oscar, twitter) focused on multi-word e…☆23Aug 13, 2022Updated 3 years ago
- MetaC provides a read-eval-print loop (a REPL) and notebook interactive development environment (a NIDE) for C programming. MetaC also …☆12Mar 29, 2026Updated 2 months ago
- Repository for the Georgetown University Multilayer Corpus (GUM)☆109Jun 8, 2026Updated last week
- Firefox and Chrome compatible extension that acts as annotation tool for websites (Named Entity Recognition)☆10Feb 17, 2019Updated 7 years ago
- Do you even science, bro? Using RNNs to predict scientific titles.☆14Jun 5, 2017Updated 9 years ago
- Ya (ي) is an open-source programming language in which you can write Python code in Arabic.☆43Jan 31, 2019Updated 7 years ago
- ORSH is an Oranios simple shell, written to understand how shells work.☆12Jun 12, 2024Updated 2 years ago
- AlephBertGimmel - Modern Hebrew pretrained BERT model with a 128K token vocabulary.☆25Dec 1, 2022Updated 3 years ago
- Support for linguistics-style examples in Org mode☆10Dec 9, 2022Updated 3 years ago
- Convert Transkribus PAGE-XML to standard PAGE-XML☆12Dec 10, 2025Updated 6 months ago
- ☆13Dec 28, 2022Updated 3 years ago
- ☆30Feb 1, 2020Updated 6 years ago
- Social Context Analysis aNd Emotion Recognition☆12Jul 11, 2017Updated 8 years ago
- collection of code for helping me get things done☆16Feb 21, 2022Updated 4 years ago
- This dataset contains naturally-occurring English sentences that feature non-trivial noun-verb ambiguity.☆38Apr 26, 2019Updated 7 years ago
- Character-level conversion between Hebrew text and Latin transliteration using deep learning - a demonstration of seq2seq training.☆15Jun 27, 2023Updated 2 years ago