proycon / LaMachine
LaMachine - A software distribution of our in-house as well as some 3rd party NLP software - Virtual Machine, Docker, or local compilation/installation script
☆69 Updated 2 years ago
Alternatives and similar repositories for LaMachine
Users who are interested in LaMachine are comparing it to the libraries listed below.
- FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.g… ☆113 Updated 10 months ago
- FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (inclu… ☆66 Updated last year
- Python bindings to the Dutch NLP tool Frog (pos tagger, lemmatiser, NER tagger, morphological analysis, shallow parser, dependency parser… ☆49 Updated 8 months ago
- Anonymization of legal cases (Fr) based on Flair embeddings ☆87 Updated 4 years ago
- Code and models for our CLEF-HIPE (Named Entity Processing on Historical Newspapers) submissions ☆19 Updated 2 years ago
- 🚀 GUI for training spaCy models ☆55 Updated 4 years ago
- An intelligent reading agent that understands text and translates it into Wikidata statements. ☆116 Updated 9 years ago
- "Python Rule-based feAture sTructure Analysis" or "Python Rule-bAsed Text Analysis" ☆70 Updated 4 years ago
- Named Entities Recognition Annotator Tool for Europeana Newspapers ☆61 Updated 7 years ago
- A visualisation tool for spaCy using Hierplane. ☆65 Updated 2 years ago
- German Morphological Analyzer ☆51 Updated 4 years ago
- Parser for KAF NAF files written in Python ☆16 Updated 4 years ago
- spaCy-to-naf converter ☆21 Updated 5 months ago
- LanguageTool-style grammar handling with spaCy 2.0 ☆42 Updated 7 years ago
- Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl,… ☆79 Updated 2 weeks ago
- A Named-Entity Recogniser based on Grobid. ☆54 Updated 6 months ago
- spaCy pipeline component for adding text readability meta data to Doc objects. ☆56 Updated 6 years ago
- Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic pr… ☆70 Updated 3 weeks ago
- Linguistic converter / merging tool for multi-level annotated corpora. Graph-based (using Python and NetworkX). ☆50 Updated 3 weeks ago
- Specification of NAF, the NLP annotation format ☆21 Updated 4 years ago
- Experiments to help discussion on Wikipedia talk pages ☆68 Updated last week
- UIMA CAS processing library written in Python ☆90 Updated 3 weeks ago
- Use spaCy for NLP and output to the FoLiA XML format. ☆12 Updated last year
- A set of workflows for corpus building through OCR, post-correction and normalisation ☆49 Updated 3 years ago
- Multi Tier Annotation Search ☆26 Updated 4 years ago
- A simple configurable tool for manipulating dependency trees. ☆14 Updated 11 months ago
- Named Entity Recognition based on dictionaries ☆242 Updated 6 years ago
- Normalizes lexically ill-formed text to its most likely clean text, e.g. "c u thr 2nite!" -> "see you there tonight!". ☆63 Updated 10 years ago
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre… ☆84 Updated 4 years ago
- Command-line tool to extract a ranked list of relevant keywords from a corpus with the option of using either topic modeling or tf-idf sc… ☆40 Updated 8 years ago