erikavaris / tokenizer
Tokenizer for Twitter and Reddit data
☆46Updated 6 years ago
Alternatives and similar repositories for tokenizer
Users who are interested in tokenizer are comparing it to the libraries listed below
- C++ implementation of Generalised Brown clustering and python scripts for feature generation☆41Updated 9 years ago
- A Dependency Parser for Tweets☆78Updated 5 years ago
- Multi-Annotator Competence Estimation tool☆63Updated 5 years ago
- ☆103Updated 6 years ago
- Sparse Additive Generative Model of Text☆87Updated 8 years ago
- The Argument Reasoning Comprehension Task: Source codes & Datasets☆74Updated 3 years ago
- public repository of the interdisciplinary working group 'Hatespeech' of the research training group UCSM☆17Updated 6 years ago
- Code to reproduce experiments from the EMNLP 2015 paper about Rumour Stance Classification with Gaussian Processes.☆37Updated 9 years ago
- A framework to identify relations between ideas in temporal text corpora.☆28Updated 7 years ago
- The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016…☆68Updated 3 years ago
- ☆44Updated 7 years ago
- Code and data related to "Efficient, Compositional, Order-Sensitive n-gram Embeddings" (EACL 2017)☆14Updated 8 years ago
- Utility scripts in Python☆37Updated last week
- Unsupervised method for extracting quotation-speaker pairs from large news corpora.☆29Updated 6 years ago
- Corpus and annotations for the CL-Aff Shared Task from the University of Pennsylvania☆19Updated 3 years ago
- Code for learning geographically-informed word embeddings☆22Updated 3 years ago
- scripts and data for ACL 16 paper☆14Updated 8 years ago
- ☆16Updated 6 years ago
- A hierarchical character-word neural network for language identification☆15Updated 8 years ago
- Code and data for ACL2016 article "Which argument is more convincing? Analyzing and predicting convincingness of Web arguments using bidi…☆28Updated 8 years ago
- A Large Automatically-Constructed Resource of Predicate Paraphrases☆45Updated 5 years ago
- Diverse Natural Language Inference Collection - NLI dataset that can be used to evaluate how well models perform distinct types of reasoning…☆36Updated 4 years ago
- Active Learning for text classification using scikit-learn☆24Updated 6 years ago
- Code and data for the WSDM '19 paper "Crosslingual Document Embedding as Reduced-Rank Ridge Regression (Cr5)"☆30Updated 5 years ago
- Non-distributional linguistic word vector representations.☆62Updated 7 years ago
- Visualize word embeddings of a vocabulary in TensorBoard, including the neighbors☆46Updated 7 years ago
- Python port of the Twokenize class of ark-tweet-nlp☆142Updated 7 years ago
- ☆34Updated 3 years ago
- Sentence specificity prediction☆25Updated 6 years ago
- A set of media framing annotations, along with scripts for obtaining the corresponding news articles☆52Updated 6 years ago