morfologik / polimorfologik
Scripts for preprocessing morfologik data.
☆40Updated 7 years ago
Alternatives and similar repositories for polimorfologik
Users that are interested in polimorfologik are comparing it to the libraries listed below
- Tools for finite state automata construction and dictionary-based morphological dictionaries. Includes Polish stemming dictionary.☆196Updated 2 years ago
- Stanford Tregex-inspired language for rule-based dependency tree manipulation.☆21Updated 8 years ago
- NER tagger for English, Spanish, Dutch, Italian, German, and French.☆35Updated 9 years ago
- Zurich Morphological Lexicon for German: a tool to extract a morphological lexicon from Wiktionary☆11Updated 2 years ago
- Some convenient natural language tools that build on NLTK.☆85Updated 11 years ago
- A Corpus Data Retrieval Index using Lucene for Look-Ups☆17Updated this week
- Basic dataset for the linguistic data collection.☆15Updated 8 years ago
- Grapheme-to-phoneme toolkit using joint modelling + CRFs in Java☆14Updated 7 years ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu…☆65Updated last year
- NameTag: Named Entity Tagger☆37Updated last year
- SMOR (Stuttgart Morphology) with alternative lemmatization component☆12Updated 2 years ago
- Course in Natural Language Processing and Applications☆10Updated 2 years ago
- Open morphology for Finnish☆92Updated last week
- Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl,…☆78Updated last month
- Lightning fast spell correction / fuzzy search library based on SymSpell by Commerce-Experts☆81Updated 7 years ago
- CRF-based Morphological Tagging and Lemmatization☆37Updated 5 years ago
- An alternative approach for probabilistic topic modeling based on agglomerative clustering of topics (not documents)☆12Updated 4 years ago
- GermaNER: Free Open German Named Entity Recognition Tool☆36Updated last year
- Partial Java port of the C++ OpenFST library☆37Updated 3 years ago
- Finite-state toolkit, EM and Bayesian (Gibbs sampling) training for FST and context-free derivation forests☆41Updated 2 years ago
- Automatically exported from code.google.com/p/foma☆122Updated 6 months ago
- Wikipedia Bilingual Reference Data (English)☆15Updated 9 years ago
- ☆18Updated 10 years ago
- Baseform lemmatization for Elasticsearch☆26Updated 6 years ago
- WordNet RDF export☆24Updated 8 years ago
- Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipg…☆129Updated 8 months ago
- A library of examples showing how to use the Common Crawl corpus (2008-2012, ARC format)☆65Updated 9 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr…☆69Updated 2 months ago
- Helsinki Finite-State Technology (library and application suite)☆133Updated 3 months ago
- Unitex/GramLab Language Resources☆18Updated 3 years ago