OFAI / million-post-corpus
Annotated data set consisting of user comments posted to a German-language newspaper website
☆17Updated 6 years ago
Alternatives and similar repositories for million-post-corpus
Users who are interested in million-post-corpus are comparing it to the repositories listed below
- The Potsdam Twitter Sentiment Corpus☆17Updated 5 years ago
- KenLM extension for spaCy 2.0.☆16Updated 7 years ago
- Workshop on Noisy User-generated Text (W-NUT)☆30Updated last month
- SMOR (Stuttgart Morphology) with alternative lemmatization component☆12Updated last year
- ☆33Updated 3 years ago
- Corpus of Attribution-Annotated news articles covering the campaigns during the year leading up to the 2016 US Presidential election.☆20Updated 7 years ago
- public repository of the interdisciplinary working group 'Hatespeech' of the research training group UCSM☆17Updated 6 years ago
- numeric fused-head identification and resolution☆33Updated 5 years ago
- Python interface for converting Penn Treebank trees to Stanford Dependencies and Universal Dependencies☆70Updated 6 years ago
- This project hosts an annotated dataset of 39 transcripts of United States presidential election debates annotated with argument compone…☆12Updated 6 years ago
- Coreference resolution for German☆16Updated 8 years ago
- ☆64Updated 2 years ago
- A simple neural truecaser written in pytorch and allennlp.☆33Updated last year
- Jupyter extension to visualize dependency structures☆28Updated 7 years ago
- Processing the MPQA Corpus☆27Updated 6 years ago
- Sume is an implementation of the concept-based ILP model for summarization.☆37Updated 6 years ago
- CoNLL 2018 Shared Task Team UDPipe-Future☆39Updated 4 years ago
- An updated version of the Parser-v1 repo, used for Stanford's submission in the CoNLL17 shared task.☆47Updated 6 years ago
- C++ implementation of Generalised Brown clustering and python scripts for feature generation☆41Updated 9 years ago
- Code for learning geographically-informed word embeddings☆22Updated 3 years ago
- Dict2vec is a framework to learn word embeddings using lexical dictionaries.☆114Updated 4 years ago
- ☆103Updated 6 years ago
- A Large Automatically-Constructed Resource of Predicate Paraphrases☆45Updated 5 years ago
- linguistic converter / merging tool for multi-level annotated corpora. graph-based (using Python and NetworkX).☆51Updated 2 years ago
- several algorithms for converting dependency structures into constituency structures.☆10Updated 3 years ago
- The Attract-Repel algorithm presented in (Mrkšić et al., TACL 2017), with accompanying resources.☆63Updated 7 years ago
- A framework to identify relations between ideas in temporal text corpora.☆28Updated 7 years ago
- Twpipe is a pipeline toolkit that parses raw tweets into universal dependencies.☆28Updated 6 years ago
- ☆54Updated 3 years ago
- Bayesian Skip-gram model☆47Updated 5 years ago