dewarim / data-tools-for-reddit

Tools to work with the big reddit JSON data dump.

☆256

Alternatives and similar repositories for data-tools-for-reddit

Users who are interested in data-tools-for-reddit are comparing it to the libraries listed below.

google-research-datasets / coarse-discourse
A large corpus of discourse annotations and relations on ~10K forum threads.
☆241 Updated 7 years ago
myleott / ark-twokenize-py
Python port of the Twokenize class of ark-tweet-nlp
☆142 Updated 7 years ago
Wordseer / stanford-corenlp-python
Python wrapper for Stanford CoreNLP tools
☆58 Updated 10 years ago
stanfordnlp / stanza-old
Stanford NLP group's shared Python tools.
☆136 Updated 7 years ago
dimazest / google-ngram-downloader
☆98 Updated 4 years ago
davidjurgens / equilid
Socially-Equitable Language Identification
☆78 Updated 2 years ago
brendano / stanford_corenlp_pywrapper
☆151 Updated 6 years ago
pavlobaron / wpcorpus
wpcorpus - NLP corpus based on Wikipedia's full article dump
☆97 Updated 10 years ago
google-research-datasets / wiki-reading
This repository contains the three WikiReading datasets as used and described in WikiReading: A Novel Large-scale Language Understanding …
☆271 Updated 7 years ago
bdhingra / tweet2vec
Twitter hashtag prediction
☆282 Updated 8 years ago
jwieting / charagram
Code to train and use models from "Charagram: Embedding Words and Sentences via Character n-grams".
☆124 Updated 9 years ago
eyaler / word2vec-slim
word2vec Google News model slimmed down to 300k English words
☆215 Updated 8 years ago
biplab-iitb / practNLPTools
Practical Natural Language Processing Tools for Humans. Dependency Parsing, Syntactic Constituent Parsing, Semantic Role Labeling, Named …
☆194 Updated 8 years ago
jaredks / tweetokenize
Tokenization and pre-processing for Twitter data used to train classifiers.
☆72 Updated 9 years ago
travisbrady / word2phrase
Python port of Mikolov's word2phrase.c from the word2vec toolkit
☆111 Updated 5 years ago
Pinafore / qb
QANTA Quiz Bowl AI
☆171 Updated 2 months ago
heerme / twitter-topics
Python code for detecting topics/events from a Twitter stream
☆100 Updated 7 years ago
marcoguerini / DepecheMood
High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian.
☆155 Updated last year
ParakweetLabs / EmailIntentDataSet
Some labeled training and test data for email intent machine learning (based on sentence-level speech acts)
☆115 Updated 11 years ago
explosion / spacy-dev-resources
💫 Scripts, tools and resources for developing spaCy
☆126 Updated 6 years ago
jayantj / w2vec-similarity
Scripts and modules used for creating document clusters from word2vec
☆40 Updated 9 years ago
markriedl / WikiPlots
A dataset containing story plots from Wikipedia (books, movies, etc.) and the code for the extractor.
☆319 Updated 8 years ago
hal3 / vwnlp
Solving NLP problems with Vowpal Wabbit: Tutorial and more
☆183 Updated 9 years ago
bwbaugh / wikipedia-extractor
This is a mirror of the script by Giuseppe Attardi, and contains history before the official repo started: https://github.com/attardi/wik…
☆260 Updated 9 years ago
alvations / awesome-community-curated-nlp
Community Curated NLP List
☆201 Updated 3 years ago
xiaohan2012 / twitter-sent-dnn
Deep Neural Network for Sentiment Analysis on Twitter
☆276 Updated 3 years ago
nik0spapp / usent
Subjectivity and sentiment classification using polarity lexicons
☆91 Updated 4 years ago
jperla / sentiment-data
sentiment analysis datasets
☆98 Updated 13 years ago
chrisjmccormick / wiki-sim-search
Similarity search on Wikipedia using gensim in Python.
☆60 Updated 7 years ago
smilli / py-corenlp
Python wrapper for Stanford CoreNLP
☆355 Updated 5 years ago