A fast, simple, multilingual tokenizer
☆29May 24, 2017Updated 9 years ago
Alternatives and similar repositories for tok-tok
Users who are interested in tok-tok are comparing it to the libraries listed below.
- Command-line corpus tools☆12May 15, 2017Updated 9 years ago
- Persian stemmer and morphological analyzer☆19Mar 30, 2016Updated 10 years ago
- Fast Word Clustering Software☆79Feb 8, 2025Updated last year
- Easy language identification of 380 languages☆17Dec 2, 2019Updated 6 years ago
- MIZAN: a large persian-english parallel corpus☆29Sep 15, 2020Updated 5 years ago
- 🤗 ParsBERT Persian NER Tasks☆18Jun 17, 2021Updated 5 years ago
- Accompanying code for our EMNLP 2017 publication "GraphDocExplore: A Framework for the Experimental Comparison of Graph-based Document Ex…☆27May 27, 2023Updated 3 years ago
- List of text corpora (text dataset in Persian) that we used in FarsiYar text-mining tools☆18Jul 16, 2019Updated 6 years ago
- A repository for the "Combining DBpedia and Topic Modeling" GSoC 2016 idea☆13Sep 1, 2016Updated 9 years ago
- Literate programming for any language. It's 🔥.☆17Jan 18, 2019Updated 7 years ago
- Stochastic poetry generation, using a trigram backoff model.☆31Mar 20, 2015Updated 11 years ago
- Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipg…☆130Feb 5, 2026Updated 4 months ago
- The Importance of Being Recurrent for Modeling Hierarchical Structure☆25Jun 27, 2018Updated 7 years ago
- Python package to augment multilingual data☆15Feb 15, 2023Updated 3 years ago
- Demo page of our paper Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks With Guided Attention, ICASSP 201…☆15May 30, 2021Updated 5 years ago
- Use spaCy for NLP and output to the FoLiA XML format.☆12Feb 27, 2024Updated 2 years ago
- Python evaluation scripts for AIDA-formatted CoNLL data☆20Aug 4, 2014Updated 11 years ago
- Tools for extracting parallel corpora from article titles across languages in Wikipedia☆74Feb 25, 2015Updated 11 years ago
- maximum entropy based part-of-speech tagger for NLTK☆45Dec 8, 2016Updated 9 years ago
- MashhadLUG is a free community of GNU/Linux enthusiasts, most of whom live in the city of Mashhad. To those who wish to … our community, we warmly…☆10Oct 24, 2025Updated 7 months ago
- Persian for LaTeX, using XeTeX☆11May 13, 2020Updated 6 years ago
- tools to analyze a collection of texts and identify relevant words☆12May 20, 2018Updated 8 years ago
- ChatGPT plugin for Singapore HDB car park availability☆19Jun 7, 2023Updated 3 years ago
- Beheshti-NER: Persian named entity recognition Using BERT☆14May 16, 2021Updated 5 years ago
- ☆10Jun 18, 2021Updated 5 years ago
- Android library providing components for capturing, reviewing and analyzing photos of invoices and remittance slips.☆11Jun 7, 2022Updated 4 years ago
- Code samples from my blog, https://poanchen.github.io/blog/☆12Oct 28, 2020Updated 5 years ago
- Fine-grained sentiment annotations of NoReC☆20Aug 1, 2022Updated 3 years ago
- Pyramidal Recurrent Units (PRUs): A New LSTM Unit☆10Aug 29, 2018Updated 7 years ago
- Analyzing Uncertainty in Neural Machine Translation☆36Sep 15, 2021Updated 4 years ago
- NDSLICE wrapper for LAPACK☆12Dec 19, 2023Updated 2 years ago
- A tutorial about DBpedia and Linked Data in general☆24Nov 7, 2014Updated 11 years ago
- A chronological (up to the century in which the poet has lived) of Persian poetry, extracted from the brilliant Ganjoor database☆18Jan 31, 2021Updated 5 years ago
- Persian text -> integer, integer -> text converter☆13Nov 14, 2020Updated 5 years ago
- A list of Neural MT implementations☆364Jul 27, 2022Updated 3 years ago
- An ImageMagick binding for the D Programming Language.☆17Sep 6, 2020Updated 5 years ago
- Repository for "Towards Robust Named Entity Recognition for Historic German"☆18Dec 11, 2020Updated 5 years ago
- Repository for ACL 2019 paper☆75Jun 30, 2019Updated 6 years ago
- BlackboxNLP 2019: Analyzing and interpreting neural networks for NLP☆18Aug 1, 2019Updated 6 years ago