OFAI / million-post-corpus
Annotated data set consisting of user comments posted to a German-language newspaper website
☆17Updated 6 years ago
Alternatives and similar repositories for million-post-corpus
Users who are interested in million-post-corpus compare it to the repositories listed below
- Corpus of Attribution-Annotated news articles covering the campaigns during the year leading up to the 2016 US Presidential election.☆20Updated 6 years ago
- The Potsdam Twitter Sentiment Corpus☆17Updated 5 years ago
- Linguistic converter and merging tool for multi-level annotated corpora, graph-based (using Python and NetworkX).☆51Updated 2 years ago
- Jupyter extension to visualize dependency structures☆28Updated 7 years ago
- Repository for our ACL 2020 paper "Learning and Evaluating Emotion Lexicons for 91 Languages"☆26Updated 2 years ago
- ☆104Updated 6 years ago
- A framework to identify relations between ideas in temporal text corpora.☆28Updated 7 years ago
- The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016…☆68Updated 3 years ago
- Workshop on Noisy User-generated Text (W-NUT)☆30Updated last week
- Toolkit to compile a comparable/parallel corpus from European Parliament proceedings☆16Updated 5 years ago
- CONLL-U to Pandas DataFrame☆31Updated 7 years ago
- KenLM extension for spaCy 2.0.☆16Updated 7 years ago
- Mining Discourse Markers for Unsupervised Sentence Representation Learning☆60Updated last year
- Sume is an implementation of the concept-based ILP model for summarization.☆37Updated 6 years ago
- Public repository of the interdisciplinary working group 'Hatespeech' of the research training group UCSM☆17Updated 6 years ago
- Language Model and Text Classification for German Language using Deep Learning☆18Updated 6 years ago
- The Universal Decompositional Semantics (UDS) dataset and the Decomp toolkit☆57Updated last year
- Metaphor dataset: literal versus non-literal uses of words☆14Updated 9 years ago
- Analyze Argumentation and Rhetorical Aspects in Scientific Writing.☆19Updated 2 years ago
- Training Temporal Word Embeddings with a Compass☆64Updated 2 years ago
- ☆11Updated 5 years ago
- Sentence specificity prediction☆25Updated 6 years ago
- ☆22Updated last year
- Code for learning geographically-informed word embeddings☆22Updated 3 years ago
- Unsupervised method for extracting quotation-speaker pairs from large news corpora.☆29Updated 6 years ago
- Neural topic modeling☆29Updated 4 years ago
- An annotated corpus of argumentative microtexts☆39Updated 2 years ago
- Alignment and annotation for comparable documents.☆22Updated 6 years ago
- Fork of Vanessa Wei Feng's RST-style discourse parser☆13Updated 4 years ago
- Code and data related to "Efficient, Compositional, Order-Sensitive n-gram Embeddings" (EACL 2017)☆14Updated 8 years ago