NICAR 2019 workshop on using Python and PDFplumber to extract text from PDFs
☆12 · Mar 9, 2019 · Updated 7 years ago
Alternatives and similar repositories for nicar-2019-pdfplumbing
Users who are interested in nicar-2019-pdfplumbing compare it to the repositories listed below.
- CNN Transcripts 2000-2025 ☆23 · May 1, 2025 · Updated 10 months ago
- Install guides for IRE/NICAR conferences. ☆16 · Mar 16, 2018 · Updated 7 years ago
- A work-in-progress guide showing how and why you should learn command-line tools (xsv, csvkit) to work with data ☆19 · Mar 16, 2019 · Updated 6 years ago
- Accessing the Facebook Marketing API using httr in R, for demographic researchers ☆21 · Nov 8, 2017 · Updated 8 years ago
- A simple Python wrapper for U.S. Census Geocoding Services API batch service ☆43 · Nov 22, 2024 · Updated last year
- A hands-on course for NICAR 2020 (and 2018) ☆11 · Mar 7, 2020 · Updated 6 years ago
- Code, data and slides for the UTokyo "text as data" course (June 3-4, 2017) ☆11 · Jun 5, 2017 · Updated 8 years ago
- Tools for Statistical Content Analysis ☆17 · Apr 22, 2025 · Updated 10 months ago
- Workbook to teach the concept of risk ratios for data journalism applications ☆33 · Apr 15, 2022 · Updated 3 years ago
- Tools for downloading from the LexisNexis API ☆17 · Apr 12, 2016 · Updated 9 years ago
- Materials for the lab component of DS-GA 1015 Text-as-Data (Spring 2019). ☆18 · Jan 15, 2020 · Updated 6 years ago
- R library for accessing data from everypolitician.org ☆20 · Apr 24, 2018 · Updated 7 years ago
- carebot-tracker.js: Carebot's tracking component for Google Analytics events ☆17 · Apr 19, 2016 · Updated 9 years ago
- Parses Google Documents formatted for annotated transcripts, with JavaScript ☆18 · Feb 14, 2022 · Updated 4 years ago
- I/O, Transformation, and Analytical Routines for Twitter Data ☆22 · Dec 22, 2020 · Updated 5 years ago
- Nicar ML/NLP workshop by J Kao ☆19 · Mar 7, 2019 · Updated 7 years ago
- Modules for teaching about chatbots, voice interfaces and ai ☆24 · Sep 10, 2025 · Updated 6 months ago
- Paper and related materials for Rodriguez & Spirling (JOP, 2022) word embeddings overview and assessment ☆49 · Feb 14, 2022 · Updated 4 years ago
- A glossary of terms used in and around data science. ☆23 · Apr 3, 2020 · Updated 5 years ago
- Walk through making basic charts, and a choropleth map, with this Altair tutorial. ☆22 · Aug 23, 2022 · Updated 3 years ago
- An R package for trend analysis of time-series data ☆22 · Jul 7, 2021 · Updated 4 years ago
- pneumatic is a bulk-upload library for DocumentCloud. ☆22 · Sep 6, 2020 · Updated 5 years ago
- Text as Data 2019 ☆61 · Jun 5, 2019 · Updated 6 years ago
- nytimes: Interacting with New York Times APIs ☆27 · Aug 4, 2018 · Updated 7 years ago
- Extract all the fields from the NY Times Corpus to a csv ☆27 · Jul 6, 2022 · Updated 3 years ago
- A step-by-step guide to publishing a standalone story from a dataset. ☆30 · Jan 8, 2026 · Updated 2 months ago
- Serve AML documents pulled for Google Docs ☆25 · Nov 18, 2021 · Updated 4 years ago
- ☆26 · Feb 10, 2025 · Updated last year
- ☆36 · Sep 26, 2022 · Updated 3 years ago
- Course materials: POIR 613 - Computational Social Science - USC Fall 2021 ☆31 · Nov 29, 2021 · Updated 4 years ago
- A quick repo with basic command line commands, plus a very brief CSVKit run through. ☆33 · Mar 7, 2020 · Updated 6 years ago
- Material for a 3 day workshop on computational text analysis for humanists and social scientists ☆34 · May 25, 2017 · Updated 8 years ago
- ME314 Introduction to Data Science and Big Data Analytics 2018 ☆10 · Jul 29, 2018 · Updated 7 years ago
- Reproducible code for our BMJ Open paper about county-level characteristics and equitable COVID-19 response. ☆11 · Mar 16, 2021 · Updated 4 years ago
- Arabic News Stance Corpus ☆11 · Feb 5, 2021 · Updated 5 years ago
- Tidyverse extensions for quanteda ☆32 · Dec 16, 2025 · Updated 2 months ago
- Slides and code used in the lectures ☆12 · Aug 12, 2019 · Updated 6 years ago
- R package for estimating speaker style distinctiveness in texts. Install it from CRAN! ☆34 · Mar 4, 2021 · Updated 5 years ago
- Course materials: POIR 613 - Computational Social Science - USC Fall 2019 ☆31 · Dec 16, 2019 · Updated 6 years ago