pushshift / Reddit-Bot-Detector
Script to extract highly probable bots for further analysis
☆12 Updated 7 years ago
Related projects:
- Read compressed NDJSON .zst files easily ☆33 Updated 2 years ago
- Stylometric framework in Python ☆13 Updated 9 years ago
- ☆21 Updated this week
- Package for performing Reddit-based text analysis ☆20 Updated 5 years ago
- A high performance indexing and search system for managing big data ☆17 Updated 5 years ago
- A library to extract a publication date from a web page, along with a measure of the accuracy. ☆42 Updated 5 years ago
- Topic modelling with spaCy, Gensim and Textacy ☆19 Updated 6 years ago
- Calculate readability scores ☆40 Updated 5 years ago
- ☆31 Updated 9 years ago
- A visualisation tool for spaCy using Hierplane. ☆65 Updated last year
- Virtual patent marking crawler at iproduct.epfl.ch ☆14 Updated 7 years ago
- Jupyter notebook + Code for reproducing Reddit Subreddit graphs ☆16 Updated 8 years ago
- Find rss, atom, xml, and rdf feeds on webpages ☆30 Updated last year
- This library was created in order to evaluate the effectiveness of any kind of algorithm used in IR systems and analyze how well they per… ☆15 Updated 4 years ago
- Ensemble topic modeling with matrix factorization ☆23 Updated 6 years ago
- Second project for UW LING 572. Automatic text summarization system. ☆14 Updated 11 years ago
- Python package aiding in entity disambiguation based on string and location matching ☆18 Updated 10 months ago
- An alternative approach for probabilistic topic modeling based on agglomerative clustering of topics (not documents) ☆12 Updated 3 years ago
- Hidden alignment conditional random field for classifying string pairs. ☆25 Updated this week
- This is the text partitioner project for Python. ☆20 Updated 5 years ago
- A browser user interface for manual labeling of record pairs. ☆41 Updated last year
- ☆32 Updated 10 months ago
- Active Learning for text classification using scikit-learn ☆23 Updated 5 years ago
- Project files related to topic modeling of NYT articles regarding mental health ☆17 Updated 6 years ago
- A selection of business datasets ☆17 Updated 5 years ago
- Python client for The Guardian API ☆59 Updated 6 months ago
- Scrape the Twitter Frontend API without authentication. ☆25 Updated last year
- topic model visualization ☆51 Updated 9 years ago
- ☆42 Updated 7 years ago
- OSoMe API mashups ☆11 Updated 5 years ago