dewarim / data-tools-for-reddit
Tools to work with the big reddit JSON data dump.
☆250 · Updated 7 months ago
Alternatives and similar repositories for data-tools-for-reddit:
Users who are interested in data-tools-for-reddit are comparing it to the libraries listed below:
- Python wrapper for Stanford CoreNLP tools ☆58 · Updated 9 years ago
- High-coverage and high-precision lexica of terms annotated with emotion scores for English and Italian. ☆152 · Updated 3 months ago
- 💫 Scripts, tools and resources for developing spaCy ☆125 · Updated 5 years ago
- Similarity search on Wikipedia using gensim in Python. ☆60 · Updated 6 years ago
- Twitter hashtag prediction ☆281 · Updated 7 years ago
- Socially-Equitable Language Identification ☆78 · Updated last year
- Python port of Mikolov's word2phrase.c from the word2vec toolkit ☆111 · Updated 4 years ago
- A large corpus of discourse annotations and relations on ~10K forum threads. ☆238 · Updated 6 years ago
- The repository contains code walkthroughs that introduce Deep Learning in the field of Natural Language Processing. ☆109 · Updated 8 years ago
- ☆151 · Updated 5 years ago
- Practical Natural Language Processing Tools for Humans. Dependency Parsing, Syntactic Constituent Parsing, Semantic Role Labeling, Named … ☆193 · Updated 7 years ago
- A Python library for simple text summarization ☆219 · Updated 9 years ago
- Python wrapper for Stanford CoreNLP ☆354 · Updated 4 years ago
- Python port of the Twokenize class of ark-tweet-nlp ☆141 · Updated 6 years ago
- This repository contains the three WikiReading datasets as used and described in WikiReading: A Novel Large-scale Language Understanding … ☆270 · Updated 6 years ago
- Tokenization and pre-processing for Twitter data used to train classifiers. ☆71 · Updated 8 years ago
- Uses a Recurrent Neural Network (LSTM/GRU/basic_RNN units) for summarization of Amazon reviews ☆132 · Updated 7 years ago
- Twitter named entity extraction for WNUT 2016 http://noisy-text.github.io/2016/ner-shared-task.html ☆138 · Updated 2 years ago
- This is a mirror of the script by Giuseppe Attardi, and contains history before the official repo started: https://github.com/attardi/wik… ☆259 · Updated 8 years ago
- The Berkeley Entity Resolution System jointly solves the problems of named entity recognition, coreference resolution, and entity linking… ☆185 · Updated 5 years ago
- Finding document vectors from pre-trained word2vec word vectors ☆115 · Updated 9 years ago
- Subjectivity and sentiment classification using polarity lexicons ☆88 · Updated 3 years ago
- Word2Vec models with Twitter data using Spark. Blog: ☆65 · Updated 6 years ago
- My personal blog ☆51 · Updated 4 years ago
- Scripts and modules used for creating document clusters from word2vec ☆40 · Updated 8 years ago
- Stanford NLP group's shared Python tools. ☆137 · Updated 6 years ago
- An introduction to using spaCy for NLP and machine learning ☆191 · Updated 2 years ago
- Code for the blog post "Making Sense of Word2vec" ☆113 · Updated 10 years ago
- A Multilingual and Multilevel Representation Learning Toolkit for NLP ☆116 · Updated 7 years ago
- Some labeled training and test data for email intent machine learning (based on sentence-level speech acts) ☆108 · Updated 10 years ago