JawaharRamis / reddit-streaming-kafka-spark-application

☆9

Alternatives and similar repositories for reddit-streaming-kafka-spark-application

Users who are interested in reddit-streaming-kafka-spark-application are comparing it to the repositories listed below.

josephmachado / simple_dbt_project
Code for dbt tutorial
☆156 Updated last month
dominikhei / Local-Data-LakeHouse
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin…
☆73 Updated last year
josephmachado / data_engineering_best_practices
Sample project to demonstrate data engineering best practices
☆194 Updated last year
josephmachado / online_store
End to end data engineering project
☆57 Updated 2 years ago
josephmachado / data_engineering_project_template
A template repository to create a data project with IAC, CI/CD, Data migrations, & testing
☆268 Updated last year
kaoutaar / end-to-end-etl-pipeline-jcdecaux-API
velib-v2: An ETL pipeline that employs batch and streaming jobs using Spark, Kafka, Airflow, and other tools, all orchestrated with Docke…
☆20 Updated 10 months ago
josephmachado / beginner_de_project_stream
Simple stream processing pipeline
☆103 Updated last year
josephmachado / adv_data_transformation_in_sql
Code for "Advanced data transformations in SQL" free live workshop
☆82 Updated 2 months ago
viirya / eventsim
Event data simulator. Generates a stream of pseudo-random events from a set of users, designed to simulate web traffic.
☆89 Updated last year
TJaniF / airflow-elt-blueprint
A self-contained, ready to run Airflow ELT project. Can be run locally or within codespaces.
☆74 Updated last year
Data-Engineer-Camp / dbt-dimensional-modelling
Step-by-step tutorial on building a Kimball dimensional model with dbt
☆143 Updated last year
josephmachado / socialetl
Project for "Data pipeline design patterns" blog.
☆45 Updated 11 months ago
bartosz25 / data-engineering-design-patterns-book
Code snippets for Data Engineering Design Patterns book
☆128 Updated 3 months ago
Zzdragon66 / university-reddit-data-dashboard
☆31 Updated last year
josephmachado / efficient_data_processing_spark
Code for "Efficient Data Processing in Spark" Course
☆323 Updated last month
josephmachado / bitcoinMonitor
Near real time ETL to populate a dashboard.
☆72 Updated last year
Amrit-Hub / Databricks-Certified-Data-Engineer-Professional-Questions
This repo contains "Databricks Certified Data Engineer Professional" Questions and related docs.
☆85 Updated 11 months ago
danielbeach / dataEngineeringTemplate
Template for Data Engineering and Data Pipeline projects
☆112 Updated 2 years ago
astronomer / airflow-dbt-demo
A repository of sample code to accompany our blog post on Airflow and dbt.
☆174 Updated last year
josephmachado / analytical_dp_with_sql
Code for my "Efficient Data Processing in SQL" book.
☆57 Updated 11 months ago
Armaan1Gohil / dataengineering-tech-stack
Local Environment to Practice Data Engineering
☆143 Updated 6 months ago
konosp / dbt-airflow-docker-compose
Execution of DBT models using Apache Airflow through Docker Compose
☆117 Updated 2 years ago
josephmachado / de_project
Step by step instructions to create a production-ready data pipeline
☆54 Updated 6 months ago
alanchn31 / Movalytics-Data-Warehouse
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow
☆147 Updated 5 years ago
developer-advocacy-dremio / definitive-guide-to-apache-iceberg
☆91 Updated 6 months ago
HamzaG737 / data-engineering-project
End to end data engineering project with kafka, airflow, spark, postgres and docker.
☆98 Updated 3 months ago
Snowflake-Labs / sfguide-data-engineering-with-snowpark-python
☆134 Updated 5 months ago
sidharth1805 / Spotify_etl
☆142 Updated 2 years ago
borjavb / dbt-iceberg-poc
☆80 Updated 9 months ago
jacob1421 / RustCheatersDataPipeline
Data pipeline that scrapes Rust cheater Steam profiles
☆52 Updated 3 years ago