ufal / udpipe
UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files
★380 Updated 7 months ago
Alternatives and similar repositories for udpipe
Users who are interested in udpipe are comparing it to the libraries listed below.
- A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary. ★316 Updated last week
- 🔥 Use the latest Stanza (StanfordNLP) research models directly in spaCy ★734 Updated 10 months ago
- spaCy + UDPipe ★161 Updated 3 years ago
- A neural parsing pipeline for segmentation, morphological tagging, dependency parsing and lemmatization with pre-trained models for more … ★113 Updated last year
- English data ★208 Updated last week
- Various utilities for processing the data. ★209 Updated this week
- spacy-wordnet creates annotations that easily allow the use of wordnet and wordnet domains by using the nltk wordnet interface ★259 Updated 9 months ago
- Universal Dependencies online documentation ★285 Updated this week
- FreeLing project source code ★257 Updated 2 years ago
- German Morphological Analyzer ★47 Updated 3 years ago
- Morfessor is a tool for unsupervised and semi-supervised morphological segmentation ★194 Updated 4 years ago
- Neural network models for joint POS tagging and dependency parsing (CoNLL 2017-2018) ★157 Updated 6 years ago
- Automatically exported from code.google.com/p/universal-pos-tags ★129 Updated 3 years ago
- Text tokenization and sentence segmentation (segtok v2) ★205 Updated 3 years ago
- A single model that parses Universal Dependencies across 75 languages. Given a sentence, jointly predicts part-of-speech tags, morphology… ★223 Updated 2 years ago
- Annotated dataset of 100 works of fiction to support tasks in natural language processing and the computational humanities. ★358 Updated 2 years ago
- A minimal, pure Python library to interface with CoNLL-U format files. ★151 Updated 2 years ago
- Russian data from the SynTagRus corpus. ★83 Updated last week
- ConllEditor is a tool to edit dependency syntax trees in CoNLL-U format. ★56 Updated 3 weeks ago
- AmbiverseNLU: A Natural Language Understanding suite by Max Planck Institute for Informatics ★210 Updated last year
- Named Entity Recognition data for Europeana Newspapers ★171 Updated 2 years ago
- A tokenizer and sentence splitter for German and English web and social media texts. ★146 Updated 6 months ago
- Language independent truecaser in Python. ★160 Updated 3 years ago
- Lexicon of frame files used by Propbank annotation. A searchable, readable version of the latest release is here: http://propbank.github… ★100 Updated last week
- Simple multilingual lemmatizer for Python, especially useful for speed and efficiency ★163 Updated 2 weeks ago
- Making sense embedding out of word embeddings using graph-based word sense induction ★213 Updated 4 years ago
- GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors ★509 Updated 5 years ago
- A multilingual parallel corpus created from translations of the Bible. ★181 Updated last month
- An implementation of a full named-entity evaluation metrics based on SemEval'13 Task 9 - not at tag/token level but considering all the t… ★222 Updated 11 months ago
- Quickly extract multi-word phrases from a corpus ★191 Updated 5 years ago