erikavaris / tokenizer
Tokenizer for Twitter and Reddit data
☆45 Updated 5 years ago
Alternatives and similar repositories for tokenizer:
Users who are interested in tokenizer are comparing it to the libraries listed below.
- A Dependency Parser for Tweets ☆79 Updated 5 years ago
- Sparse Additive Generative Model of Text ☆87 Updated 8 years ago
- ☆104 Updated 6 years ago
- public repository of the interdisciplinary working group 'Hatespeech' of the research training group UCSM ☆17 Updated 6 years ago
- A framework to identify relations between ideas in temporal text corpora. ☆28 Updated 6 years ago
- Unsupervised method for extracting quotation-speaker pairs from large news corpora. ☆29 Updated 6 years ago
- C++ implementation of Generalised Brown clustering and python scripts for feature generation ☆41 Updated 8 years ago
- Sentence specificity prediction ☆25 Updated 6 years ago
- Multi-Annotator Competence Estimation tool ☆63 Updated 5 years ago
- Temporal Word Analogies in Python ☆18 Updated 7 years ago
- Neural topic modeling ☆29 Updated 4 years ago
- Extract all the fields from the NY Times Corpus to a csv ☆27 Updated 2 years ago
- annotated hateful speech ☆25 Updated 5 years ago
- Quick implementation of Monroe et al.'s algorithm for comparing languages ☆53 Updated 4 years ago
- ☆41 Updated 8 years ago
- The Yahoo News Annotated Comments Corpus (YNACC) ☆18 Updated 6 years ago
- Code to compute topic coherence for several topic cardinalities and aggregate scores across them ☆22 Updated last year
- ☆54 Updated 3 years ago
- Repository for code and metadata to support work described in "Authorless Topic Models: Biasing Models Away from Known Structure" ☆28 Updated 4 years ago
- A set of media framing annotations, along with scripts for obtaining the corresponding news articles ☆50 Updated 5 years ago
- A Large Automatically-Constructed Resource of Predicate Paraphrases ☆42 Updated 4 years ago
- The Broad Twitter Corpus, an NER dataset in English stratified for time, location, social media genre, socioeconomic factors (COLING 2016… ☆66 Updated 2 years ago
- The Attract-Repel algorithm presented in (Mrkšić et al., TACL 2017), with accompanying resources. ☆63 Updated 7 years ago
- ☆11 Updated 7 years ago
- ☆22 Updated last year
- Harassment Lexicon and Corpus ☆29 Updated 6 years ago
- An Easy to Use, Accurate Python Geolocation Library ☆40 Updated 2 years ago
- ☆54 Updated 9 years ago
- Utility scripts in Python ☆37 Updated 5 months ago
- Collaborative web framework for analyzing text (e.g., tweets). Supports standard labeling and pairwise comparison. ☆14 Updated 3 years ago