LanguageMachines / ucto
Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation and splits sentences. It also offers other basic preprocessing steps, such as changing case, all of which you can use to make your text suitable for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules…
☆70 · Updated last week
Alternatives and similar repositories for ucto
Users who are interested in ucto are comparing it to the libraries listed below.
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆113 · Updated 9 months ago
- Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipg… ☆129 · Updated 11 months ago
- Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl,… ☆79 · Updated 3 months ago
- Ukb: graph-based WSD and similarity ☆106 · Updated last year
- Various utilities for processing the data. ☆213 · Updated last week
- Named Entity Recognition data for Europeana Newspapers ☆173 · Updated 2 years ago
- TiMBL implements several memory-based learning algorithms. ☆53 · Updated this week
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆66 · Updated last year
- FreeLing project source code ☆260 · Updated 2 years ago
- Learning by Reading pipeline of NLP and Entity Linking tools ☆85 · Updated 2 years ago
- 🆕 Work continues on INCEpTION 👉 https://github.com/inception-project/inception 👈 -- ⚠️ The official WebAnno repository has reached the… ☆249 · Updated 2 years ago
- Universal Dependencies online documentation ☆287 · Updated this week
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ☆197 · Updated 5 years ago
- GermaNER: Free Open German Named Entity Recognition Tool ☆36 · Updated last year
- Software and resources for natural language processing. ☆131 · Updated 9 years ago
- General-Purpose Neural Networks for Sentence Boundary Detection ☆73 · Updated 2 years ago
- Python bindings to the Dutch NLP tool Frog (pos tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser… ☆49 · Updated 8 months ago
- ConllEditor is a tool to edit dependency syntax trees in CoNLL-U format. ☆57 · Updated this week
- German Morphological Analyzer ☆50 · Updated 4 years ago
- Named Entities Recognition Annotator Tool for Europeana Newspapers ☆61 · Updated 7 years ago
- Multi Tier Annotation Search ☆26 · Updated 4 years ago
- spaCy-to-naf converter ☆21 · Updated 5 months ago
- Thot toolkit for statistical machine translation ☆53 · Updated 3 years ago
- BLLIP reranking parser (also known as Charniak-Johnson parser, Charniak parser, Brown reranking parser) See http://pypi.python.org/pypi/b… ☆228 · Updated 4 years ago
- A simple configurable tool for manipulating dependency trees. ☆14 · Updated 10 months ago
- Federated Knowledge Extraction Framework ☆193 · Updated 2 years ago
- Normalizes lexically ill-formed text to its most likely clean text, e.g. "c u thr 2nite!" -> "see you there tonight!". ☆63 · Updated 10 years ago
- A multilingual dependency parser based on linear programming relaxations. ☆115 · Updated 6 years ago
- This repository contains the Framester resource, the main outcome of the framester project. ☆33 · Updated 3 weeks ago
- Specification of NAF, the NLP annotation format ☆21 · Updated 4 years ago