erikavaris / tokenizer
Tokenizer for Twitter and Reddit data
☆47 Updated 6 years ago
Alternatives and similar repositories for tokenizer:
Users who are interested in tokenizer compare it to the libraries listed below:
- A Dependency Parser for Tweets ☆78 Updated 5 years ago
- C++ implementation of Generalised Brown clustering and python scripts for feature generation ☆41 Updated 9 years ago
- A framework to identify relations between ideas in temporal text corpora. ☆28 Updated 7 years ago
- public repository of the interdisciplinary working group 'Hatespeech' of the research training group UCSM ☆17 Updated 6 years ago
- Sparse Additive Generative Model of Text ☆87 Updated 8 years ago
- Unsupervised method for extracting quotation-speaker pairs from large news corpora. ☆29 Updated 6 years ago
- Multi-Annotator Competence Estimation tool ☆63 Updated 5 years ago
- Code for learning geographically-informed word embeddings ☆22 Updated 3 years ago
- Incremental learning of word embeddings with context informativeness. ☆94 Updated last year
- Python port of the Twokenize class of ark-tweet-nlp ☆142 Updated 7 years ago
- Temporal Word Analogies in Python ☆18 Updated 7 years ago
- ☆104 Updated 6 years ago
- The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016… ☆68 Updated 3 years ago
- Visualize word embeddings of a vocabulary in TensorBoard, including the neighbors ☆46 Updated 7 years ago
- Automatic labeling for topic model ☆57 Updated 9 years ago
- A Large Automatically-Constructed Resource of Predicate Paraphrases ☆45 Updated 5 years ago
- An Easy to Use, Accurate Python Geolocation Library ☆41 Updated 2 years ago
- Code to reproduce experiments from the EMNLP 2015 paper about Rumour Stance Classification with Gaussian Processes. ☆37 Updated 8 years ago
- ☆41 Updated 8 years ago
- ☆56 Updated 6 years ago
- Mining Argument Structures with Expressive Inference (Linear and LSTM Engines) ☆65 Updated 7 years ago
- ☆54 Updated 3 years ago
- Preprocessing scripts to read definitions and other information from dictionaries ☆22 Updated 7 years ago
- Diverse Natural Language Inference Collection - NLI dataset that can be used to evaluate how well models perform distinct types of reasoning… ☆36 Updated 4 years ago
- SemEval 2019 Hyperpartisan News Detection - team Bertha von Suttner contribution ☆22 Updated 5 years ago
- Corpus of Attribution-Annotated news articles covering the campaigns during the year leading up to the 2016 US Presidential election. ☆20 Updated 6 years ago
- Code to compute topic coherence for several topic cardinalities and aggregate scores across them ☆21 Updated 2 months ago
- A hierarchical character-word neural network for language identification ☆15 Updated 8 years ago
- Sentence specificity prediction ☆25 Updated 6 years ago
- Training Temporal Word Embeddings with a Compass ☆64 Updated 2 years ago