ietz / nytimes-scraper
Scrape articles and comments from NYTimes
☆20 · Updated last year
Alternatives and similar repositories for nytimes-scraper:
Users who are interested in nytimes-scraper are comparing it to the libraries listed below:
- A multithreaded Pushshift.io API wrapper for reddit.com comment and submission searches. ☆218 · Updated 2 years ago
- Cleans Reddit Text Data ☆83 · Updated 5 years ago
- Measure the readability of a given text using surface characteristics ☆79 · Updated 3 months ago
- Tag news stories based on models trained on the NYT corpus. ☆42 · Updated 2 years ago
- A TextBlob sentiment analysis pipeline component for spaCy. ☆56 · Updated 6 months ago
- Dataframe Integration with spaCy. ☆103 · Updated 4 years ago
- Target-dependent sentiment classification in news articles reporting on political events. Includes a high-quality data set of over 11k se… ☆151 · Updated last year
- An ongoing dataset consisting of hashtags, n-gram counts and other misc NLP things for covid-19 analysis, stemming from over 100 000 000… ☆57 · Updated 3 years ago
- A Python wrapper around the topic modeling functions of MALLET. ☆101 · Updated 6 months ago
- A Python API to the Internet Archive Wayback Machine ☆71 · Updated 9 months ago
- HDBSCAN Tuning for BERTopic Models ☆45 · Updated last year
- Concept Modeling: Topic Modeling on Images and Text ☆206 · Updated 6 months ago
- Dataset: BuzzFeed News “Trending” Strip, 2018–2023 ☆19 · Updated last year
- Driver for LIWC2015 analysis. LIWC2015 dictionary not included. ☆16 · Updated 2 years ago
- Datasets for fake news and misinformation detection ☆66 · Updated last year
- Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy. ☆118 · Updated last year
- Using the Gmail API to topic model my recommended Medium reads ☆24 · Updated 3 years ago
- Easy PDF to text to spaCy text extraction in Python. ☆39 · Updated 7 months ago
- BirdSpotter is a Python package which provides an influence and bot detection toolkit for Twitter. ☆19 · Updated 4 years ago
- Pushshift Telegram Ingest ☆86 · Updated 5 years ago
- Calculate readability scores ☆41 · Updated 6 years ago
- An affect generator based on TextBlob and the NRC affect lexicon. Note that lexicon license is for research purposes only. ☆71 · Updated 2 years ago
- A module to compute textual lexical richness (aka lexical diversity). ☆106 · Updated last year
- A tool for Semantic Scaling of Political Text (branch of Topfish, a suite of tools for Political Text Analysis) ☆27 · Updated last year
- Fast and robust date extraction from web pages, with Python or on the command-line ☆126 · Updated 4 months ago
- A Python package to enrich Twitter Data ☆75 · Updated last year
- Ethical, legal, and effortless extraction of Reddit data in your database ☆68 · Updated 7 months ago