airscholar / RedditDataEngineering

This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and services including Apache Airflow, Celery, PostgreSQL, Amazon S3, AWS Glue, Amazon Athena, and Amazon Redshift.
108 · Updated last year

Alternatives and similar repositories for RedditDataEngineering:

Users who are interested in RedditDataEngineering are comparing it to the repositories listed below.