OFAI / million-post-corpus
Annotated data set consisting of user comments posted to a German-language newspaper website
☆17 Updated 7 years ago
Alternatives and similar repositories for million-post-corpus
Users who are interested in million-post-corpus are comparing it to the repositories listed below
- ☆103 Updated 6 years ago
- Doing things with embeddings ☆66 Updated 2 years ago
- A Dependency Parser for Tweets ☆78 Updated 5 years ago
- Fast supervised sentence boundary detection using the averaged perceptron ☆90 Updated 6 years ago
- Dict2vec is a framework to learn word embeddings using lexical dictionaries. ☆114 Updated 4 years ago
- ☆55 Updated 10 years ago
- KenLM extension for spaCy 2.0. ☆16 Updated 7 years ago
- Tokenizer for Twitter and Reddit data ☆46 Updated 6 years ago
- The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016… ☆68 Updated 3 years ago
- Repository for the word embeddings experiments described in "Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource", pre… ☆83 Updated 4 years ago
- C++ implementation of Generalised Brown clustering and python scripts for feature generation ☆41 Updated 9 years ago
- Mining Argument Structures with Expressive Inference (Linear and LSTM Engines) ☆66 Updated 7 years ago
- Workshop on Noisy User-generated Text (W-NUT) ☆30 Updated 2 months ago
- CONLL-U to Pandas DataFrame ☆31 Updated 7 years ago
- Hierarchical word clustering, following "Brown clustering" (Brown et al., 1992) ☆69 Updated 10 years ago
- PredPatt: Predicate-Argument Extraction from Universal Dependencies ☆112 Updated 4 years ago
- Utility scripts in Python ☆37 Updated last month
- Bidirectional Long-Short Term Memory tagger (bi-LSTM) (in DyNet) -- hierarchical (with word and character embeddings) ☆122 Updated 2 years ago
- ☆25 Updated 5 years ago
- Incremental learning of word embeddings with context informativeness. ☆94 Updated 2 years ago
- LexNET: Integrated Path-based and Distributional Method for Lexical Semantic Relation Classification ☆62 Updated 6 years ago
- Keras implementation of ontology aware token embeddings ☆49 Updated 6 years ago
- Jupyter extension to visualize dependency structures ☆28 Updated 7 years ago
- Sume is an implementation of the concept-based ILP model for summarization. ☆37 Updated 6 years ago
- A natural language processing tool for automatically detecting quotations in text. ☆15 Updated 3 years ago
- Making sense embedding out of word embeddings using graph-based word sense induction ☆213 Updated 4 years ago
- A way to do annotations for NER. TALEN: Tool for Annotation of Low-resource ENtities ☆115 Updated last week
- A collection of English tweets annotated in Universal Dependencies. ☆39 Updated 3 years ago
- Repository for our ACL 2020 paper "Learning and Evaluating Emotion Lexicons for 91 Languages" ☆27 Updated 2 years ago
- Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html ☆139 Updated 2 years ago