ETL of newspaper article keywords using Apache Airflow, Newspaper3k, Quilt T4 and AWS S3
☆16May 20, 2026Updated last week
Alternatives and similar repositories for etl-airflow-s3
Users that are interested in etl-airflow-s3 are comparing it to the libraries listed below.
- Pre-built template for using newspaper3k on aws lambda☆17Dec 9, 2022Updated 3 years ago
- Format a number with commas☆12Dec 22, 2016Updated 9 years ago
- ☆27Mar 27, 2016Updated 10 years ago
- A structured record metaclass for Python.☆12Jan 21, 2012Updated 14 years ago
- MOTHBALLED: See README note.☆10Jul 11, 2022Updated 3 years ago
- Find the unique columns in a tabular dataset.☆13Jan 13, 2016Updated 10 years ago
- Tabs on Tallahassee☆11Dec 5, 2016Updated 9 years ago
- A dashboard for issue and pull request management in whatwg/html☆14Jun 22, 2016Updated 9 years ago
- Data on Digital Media and Technology Expenditures in the United States Congress☆10Jul 17, 2017Updated 8 years ago
- presentation for nicar 2011 (an exploration into the concepts behind backbone.js)☆12Feb 24, 2011Updated 15 years ago
- Format and Complete Few-Shot LLM Prompts☆21Apr 22, 2026Updated last month
- The simple, fast, visual testing framework for web applications.☆13Nov 3, 2015Updated 10 years ago
- Code to package FiveThirtyEight data using Datasette☆16Mar 5, 2026Updated 2 months ago
- Transcribe audio using the Groq.com Whisper API☆21Nov 1, 2024Updated last year
- Exploration of FEC contributions data with dplyr☆13Dec 5, 2013Updated 12 years ago
- A demo project and template repository showing how I use SpatiaLite with Datasette for quick spatial analysis.☆17Jul 7, 2024Updated last year
- A handy template for building a django prep sports site.☆14Jul 5, 2011Updated 14 years ago
- Interactive R tutorials for SPMC350 at the University of Nebraska-Lincoln's College of Journalism and Mass Communications☆15Apr 18, 2026Updated last month
- Investigative tool for extracting relevant areas from many documents☆14Nov 17, 2015Updated 10 years ago
- KubeCon + CloudNativeCon Seattle☆15Dec 30, 2018Updated 7 years ago
- Use LLMs to extract structured data from news articles. From the Star Tribune AI Lab.☆13Sep 15, 2025Updated 8 months ago
- Mapping the ATT&CK matrix in a Cowrie honeypot☆16Aug 31, 2018Updated 7 years ago
- ocr for historical data☆14Feb 23, 2025Updated last year
- Code for extracting data from a large number of PDFs, particularly FCC political ad documents☆15Oct 26, 2017Updated 8 years ago
- CLI that queries multiple language models in parallel using prompts from a CSV file☆28Sep 24, 2025Updated 8 months ago
- Extract Stats Q/A from Tables With Provenance☆26Dec 27, 2025Updated 5 months ago
- Clip2Story is a prototype web application that transcribes news video clips, summarizes transcripts using OpenAI, and feeds summaries as …☆12May 1, 2024Updated 2 years ago
- Scrapy based crawler which crawls newspaper.☆20Mar 21, 2026Updated 2 months ago
- A collection of lists of forms maintained by local, state and federal policing organizations. If you have a form name, you have a FOIA re…☆18May 19, 2026Updated last week
- Tests for Mozilla's Support website.☆30Feb 26, 2016Updated 10 years ago
- Example of using Next.JS with JS-IPFS☆10May 1, 2025Updated last year
- ☆15Mar 11, 2024Updated 2 years ago
- A Python package for analyzing the performances of cricketers based on ESPN Cricinfo☆17May 7, 2020Updated 6 years ago
- A template project for using Flask, Semantic-UI, Flask-Assets (for an asset pipeline) and bower based dependency management☆12Jun 4, 2014Updated 11 years ago
- ☆16Jun 7, 2018Updated 7 years ago
- RSS news list contains a broad range of RSS news sources from around the globe formatted as JSON,XML,CSV or YML☆17Jul 29, 2019Updated 6 years ago
- AI agent for enhancing datasets with information from the internet☆21Nov 6, 2025Updated 6 months ago
- A Flask+Elasticsearch UI for exploring the DC Inbox dataset from http://web.stevens.edu/dcinbox/Home.html☆17Jan 21, 2022Updated 4 years ago
- Python scraper to get weekly CDC flu surveillance data☆25Dec 2, 2014Updated 11 years ago