notnews/nytimes-corpus-extractor

Extract all the fields from the NY Times Corpus to a csv

☆27

Alternatives and similar repositories for nytimes-corpus-extractor

Users who are interested in nytimes-corpus-extractor compare it to the repositories listed below.

outerproduct / nyt-summ
Summarization datasets from the New York Times Annotated Corpus
☆48 • Aug 27, 2020 • Updated 5 years ago
jaeyk / tidyethnicnews
R package for turning Ethnic NewsWatch search results into tidyverse-ready dataframes
☆11 • Dec 7, 2021 • Updated 4 years ago
justingrimmer / TAD
This is the public repository for my quarter long text as data course
☆17 • Mar 5, 2018 • Updated 8 years ago
notnews / good_nyt
Patterns in NYT production from 1987 to 2007
☆11 • Nov 6, 2017 • Updated 8 years ago
kmunger / Topic_Models
Presentation for the NYU Data Lab December 2015
☆14 • Dec 2, 2015 • Updated 10 years ago
srkobakian / taipan
Image Annotations in R
☆32 • Oct 30, 2019 • Updated 6 years ago
MarHai / ScrapeBot
A Selenium-driven tool for automated website interaction and scraping.
☆20 • Sep 1, 2021 • Updated 4 years ago
kbenoit / CSTA-APSR
Replication Materials for "Crowd-Sourced Text Analysis" APSR (2016) 110(2): 278-295.
☆11 • Oct 28, 2017 • Updated 8 years ago
TaddyLab / maptpx
map estimation of topic models
☆19 • May 27, 2020 • Updated 6 years ago
dannguyen / nicar-2019-pdfplumbing
NICAR 2019 workshop on using Python and PDFplumber to extract text from PDFs
☆12 • Mar 9, 2019 • Updated 7 years ago
ccs-amsterdam / compendium
Research compendium for reproducible research
☆12 • Sep 7, 2020 • Updated 5 years ago
cloudyr / aws.alexa
Client Package for the Amazon Alexa Web Information Service
☆13 • May 30, 2022 • Updated 4 years ago
dannguyen / learn-data-csv-cli
A work-in-progress guide showing how and why you should learn command-line tools (xsv, csvkit) to work with data
☆19 • Mar 16, 2019 • Updated 7 years ago
ajparsons / everypoliticianR
R library for accessing data from everypolitician.org
☆20 • Apr 24, 2018 • Updated 8 years ago
mkearney / nyt
📰🗞 New York Times data
☆12 • Aug 4, 2018 • Updated 7 years ago
mimno / PyMallet
Python tools for text
☆16 • May 8, 2020 • Updated 6 years ago
Docma-TU / tosca
Tools for Statistical Content Analysis
☆18 • Apr 22, 2025 • Updated last year
vals / Reading-PCA
☆16 • Jun 11, 2017 • Updated 9 years ago
Shahul-Rahman / SPGD-Search-Party-Gradient-Descent-algorithm
SPGD: Search Party Gradient Descent algorithm, a Simple Gradient-Based Parallel Algorithm for Bound-Constrained Optimization. Link: http…
☆11 • Oct 28, 2023 • Updated 2 years ago
felipebravom / EmoInt
Scripts for WASSA-2017 Shared Task on Emotion Intensity
☆14 • Oct 4, 2017 • Updated 8 years ago
kasperwelbers / corpustools
An R corpus class for tokenized texts
☆32 • Jul 10, 2025 • Updated last year
ftlabs / text-summarization-experiment
Experiment on text summarization techniques and exploring Tensorflow.
☆15 • Apr 25, 2017 • Updated 9 years ago
HassanElmadany / Extract-SVO
A python script to extract subject-predicate-object (SVO) triplets from English sentences using Stanford Parser according to the following…
☆20 • Sep 16, 2017 • Updated 8 years ago
justingrimmer / WUSTL
Text as Data Material for WashU Course
☆15 • Nov 7, 2017 • Updated 8 years ago
mannau / boilerpipeR
Interface to the boilerpipe Java library by Christian Kohlschutter (http://code.google.com/p/boilerpipe/)
☆21 • May 19, 2021 • Updated 5 years ago
notnews / cnn_transcripts
CNN Transcripts 2000--2025
☆26 • May 1, 2025 • Updated last year
krobertslab / pretrained-clinical-embeddings
Resources of pre-trained word representations on clinical texts.
☆12 • Jul 31, 2019 • Updated 6 years ago
hrbrmstr / urlscan
👀 Analyze Websites and Resources They Request
☆24 • Feb 3, 2019 • Updated 7 years ago
dantonnoriega / PubPol590-Sp15
☆19 • Aug 25, 2023 • Updated 2 years ago
ccgilroy / r-estimates-fb-ads
Accessing the Facebook Marketing API using httr in R, for demographic researchers
☆21 • Nov 8, 2017 • Updated 8 years ago
lmcinnes / subreddit_mapping
Notebooks and data associated to constructing and exploring a map of subreddits.
☆56 • Apr 24, 2017 • Updated 9 years ago
gitronald / domains
Repository of data on web domains.
☆19 • May 24, 2023 • Updated 3 years ago
uma-pi1 / lash
Large-Scale Sequence Mining with Hierarchies
☆13 • Mar 13, 2015 • Updated 11 years ago
SMAPPNYU / smapp-toolkit
Python library for interacting with smapp collections
☆19 • May 30, 2016 • Updated 10 years ago
trinker / textplot
Plotting for text data
☆19 • Sep 23, 2017 • Updated 8 years ago
cjbarrie / sicss_23
Repository of materials for SICSS-Edinburgh, 2023.
☆12 • Jun 19, 2023 • Updated 3 years ago
yinleon / LocalNewsDataset
The documentation and scripts for the Local News Dataset
☆25 • Apr 14, 2022 • Updated 4 years ago
JosemyDuarte / gpt-researcher-ollama
Based on assafelovic/gpt-researcher - Modified to support local Ollama models
☆16 • May 15, 2024 • Updated 2 years ago
MichaelKreil / twitter-analysis2
☆12 • Apr 12, 2023 • Updated 3 years ago