This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is a regular-expression based, extensible, and advanced tokeniser writt…
☆31 · Feb 2, 2026 · Updated 3 weeks ago
Alternatives and similar repositories for python-ucto
Users that are interested in python-ucto are comparing it to the libraries listed below
- python-timbl, originally developed by Sander Canisius, is a Python extension module wrapping the full TiMBL C++ programming interface. Wi… ☆18 · May 2, 2025 · Updated 9 months ago
- Python bindings to the Dutch NLP tool Frog (pos tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser… ☆49 · Feb 2, 2026 · Updated 3 weeks ago
- A multi-level marketing web application -- matrix type ☆16 · Aug 20, 2015 · Updated 10 years ago
- ☆16 · Sep 6, 2012 · Updated 13 years ago
- Yet Another Sequence Encoder - Encode sequences to a vector of vectors in Python! ☆13 · May 15, 2017 · Updated 8 years ago
- A simple and fast search engine ☆70 · Jun 21, 2022 · Updated 3 years ago
- Source code of http://howihacked.info ☆16 · Jan 28, 2016 · Updated 10 years ago
- Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipg… ☆129 · Feb 5, 2026 · Updated 3 weeks ago
- C++ implementation of Generalised Brown clustering and Python scripts for feature generation ☆41 · Apr 8, 2016 · Updated 9 years ago
- Tools, wrappers, etc. for data science with a concentration on text processing ☆207 · Nov 9, 2022 · Updated 3 years ago
- Scalding-powered machine learning ☆109 · Nov 18, 2014 · Updated 11 years ago
- list of anything (Community driven list of anything) text :) ☆30 · Feb 6, 2017 · Updated 9 years ago
- A method to mine beyond-pairwise relationships using Min-Hashing for large-scale pattern discovery ☆28 · Oct 10, 2021 · Updated 4 years ago
- A data management tool for humans ☆119 · Oct 31, 2016 · Updated 9 years ago
- Naive Bayesian Classifier written in APL ☆24 · Jan 21, 2018 · Updated 8 years ago
- Curated set of transformers that make your work with steppy faster and more effective ☆22 · Nov 22, 2018 · Updated 7 years ago
- ☆31 · Feb 13, 2026 · Updated 2 weeks ago
- A plotting library in Ruby built on top of Vega and D3. ☆43 · Jun 22, 2025 · Updated 8 months ago
- Simple vim plugin to handle file transformations (e.g. auto gpg/aes/etc encrypt/decrypt, base64 encode/decode, etc) ☆30 · Apr 5, 2019 · Updated 6 years ago
- A Clojure library that lets you chain processes and threads via pipes. ☆33 · Feb 7, 2016 · Updated 10 years ago
- Clojure based LXS scene graph compiler, generator & mesh exporter for Luxrender ☆76 · May 1, 2025 · Updated 9 months ago
- Python tool for normalizing text and text canonicalization (DISCONTINUED) ☆41 · Sep 3, 2013 · Updated 12 years ago
- Show info about the author by Facebook photo URL ☆40 · Apr 7, 2017 · Updated 8 years ago
- Chatbot for voice-enabled conversations ☆10 · May 23, 2025 · Updated 9 months ago
- This repository contains the code of the Rasa workshop at PyData NYC 2018 ☆12 · Oct 19, 2018 · Updated 7 years ago
- flutter.io Demo Flutter eBay search ☆12 · Mar 15, 2018 · Updated 7 years ago
- Generic CRUD for MyBatis: no inheritance required, call it directly. ☆10 · Apr 25, 2017 · Updated 8 years ago
- A WeChat (and Weixin) chatbot skeleton in Python with queue/delayed messages support. ☆12 · Jan 12, 2026 · Updated last month
- vertical search crawler ☆38 · Jan 9, 2012 · Updated 14 years ago
- ☆18 · Jul 23, 2016 · Updated 9 years ago
- Extract annotated misspellings from MIMIC-III. ☆13 · Dec 17, 2020 · Updated 5 years ago
- The official repo of BSIS ☆14 · Feb 16, 2012 · Updated 14 years ago
- Python bindings for the NVML. Non-volatile memory for Python. ☆12 · May 23, 2016 · Updated 9 years ago
- collection of modules to build distributed and reliable concurrent systems in Python. ☆206 · Sep 14, 2013 · Updated 12 years ago
- Sourcecode & CAD drawings of NimbRo-OP ☆27 · Oct 30, 2012 · Updated 13 years ago
- Automatic .gif creation from Youtube videos! ☆56 · Dec 5, 2014 · Updated 11 years ago
- Translation of query languages to serialized KoralQuery protocol ☆13 · Updated this week
- Tool for slot extraction from text ☆15 · Oct 23, 2022 · Updated 3 years ago
- ERC20 Token for BitSong - https://bitsong.io ☆10 · Jun 1, 2019 · Updated 6 years ago