kadnan / airflow-scraping
Using Apache Airflow to schedule web scrapers
☆42 Updated 6 years ago
Alternatives and similar repositories for airflow-scraping:
Users who are interested in airflow-scraping are comparing it to the repositories listed below
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- Basic tutorial of using Apache Airflow ☆36 Updated 6 years ago
- Code to build a simple analytics data pipeline with Python ☆102 Updated 7 years ago
- 🐍💨 Airflow tutorial for PyCon 2019 ☆85 Updated 2 years ago
- Schedule a data pipeline in Google Cloud using cloud function, BigQuery, cloud storage, cloud scheduler, stack trace, cloud build, and p… ☆26 Updated 5 years ago
- Data lake, data warehouse on GCP ☆55 Updated 3 years ago
- Just a boilerplate for PySpark and Flask ☆35 Updated 6 years ago
- Airflow training for the crunch conf ☆104 Updated 6 years ago
- Simple alert system implemented in Kafka and Python ☆95 Updated 6 years ago
- A tutorial on streaming data from a Flask REST API and streaming the response into PostgreSQL ☆39 Updated 5 years ago
- ☆110 Updated last month
- ☆25 Updated 6 years ago
- ETL with Python - Taught at DWH course 2017 (TAU) ☆102 Updated 7 years ago
- ☆46 Updated 2 years ago
- Big Data Demystified meetup and blog examples ☆31 Updated 5 months ago
- How to use Python to understand data and transform the data into a tidy format ready to be used for modelling and visualisation. ☆37 Updated 5 years ago
- Scaffold of Apache Airflow executing Docker containers ☆85 Updated 2 years ago
- Jupyter notebook for scraping and analyzing the most in-demand technology skills for data scientists. ☆49 Updated 5 years ago
- PyConDE & PyData Berlin 2019 Airflow Workshop: Airflow for machine learning pipelines. ☆46 Updated last year
- A code-based tutorial for production level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, & Apache Drill u… ☆26 Updated 5 years ago
- Python client for the DSS public API ☆42 Updated this week
- (Project & tutorial) DAG pipeline tests + CI/CD setup ☆86 Updated 3 years ago
- Using Luigi to create a Machine Learning Pipeline using the Rossman Sales data from Kaggle ☆33 Updated 8 years ago
- ☆16 Updated 7 years ago
- A modern ELT demo using Airbyte, dbt, Snowflake, and Dagster ☆26 Updated 2 years ago
- A curated list of awesome customer analytics content ☆95 Updated 7 years ago
- Airflow ETL for Meetup API ☆46 Updated 6 years ago
- Learn to build a data pipeline with Airflow to automate data wrangling - a Udacity Data Engineer Nanodegree project ☆8 Updated 5 years ago
- A quick and easy way to convert a Pandas DataFrame to a Tableau .hyper or .tde extract. ☆61 Updated 5 years ago
- 🚨 Simple, self-contained fraud detection system built with Apache Kafka and Python ☆84 Updated 5 years ago