ThoughtWorksInc / streaming-data-pipeline
Streaming pipeline repo for data engineering training program
☆9 Updated 5 years ago
Alternatives and similar repositories for streaming-data-pipeline:
Users who are interested in streaming-data-pipeline are comparing it to the repositories listed below.
- Sample Code for Thoughtful Data Science book ☆15 Updated 6 years ago
- AWS Big Data Certification ☆25 Updated last month
- A basic example of how to read and write streaming data using Apache Spark and Kafka on HDInsight ☆13 Updated last year
- Deploy an IMDB sentiment analysis model using Kubernetes ☆13 Updated last year
- Terraform module for a PostgreSQL-backed Apache Airflow instance ☆24 Updated 6 years ago
- A cookiecutter template for Apache Spark applications written in Scala ☆10 Updated 6 years ago
- Personal cheatsheets on various technologies ☆25 Updated 8 years ago
- Some class materials for a data processing course using PySpark ☆52 Updated 2 years ago
- A comparison of stream-processing frameworks with Kafka integration ☆10 Updated 6 years ago
- Onboarding to data science by ThoughtWorks ☆55 Updated 4 years ago
- Sandbox for Apache NiFi ☆24 Updated 3 years ago
- Docker Compose files for various Kafka stacks ☆32 Updated 6 years ago
- Python Streaming Pipelines with Beam on Flink - Demo ☆14 Updated 2 years ago
- An extension for Jupyter notebooks that allows running notebooks inside a Docker container and converting them to runnable Docker images. ☆28 Updated last year
- A curated list of awesome Apache Spark packages and resources. ☆40 Updated 7 years ago
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- A tutorial on Apache Spark Unit Testing ☆37 Updated 9 years ago
- Telecom scenarios implemented with streaming techniques ☆11 Updated last year
- A collection of datasets and databases ☆24 Updated 6 years ago
- Labs and data files for a full-day Spark workshop ☆24 Updated last year
- Examples and explanations of how RPC systems work. ☆25 Updated last year
- An example PySpark project with pytest ☆17 Updated 7 years ago
- AWS Lambda and Java version of Eventuate Todo list application ☆26 Updated 7 years ago
- Code that was used as an example during the Data+AI Summit 2020 ☆15 Updated 3 years ago
- Datasets for CS109 ☆28 Updated 11 years ago
- Real-world Spark pipelines examples ☆83 Updated 6 years ago
- Example Repository for Building Complex Data Pipeline with Luigi +TD ☆24 Updated 9 years ago
- Learning problem-solving, logic/set, math, physics, economics through functional programming using Haskell ☆19 Updated 9 years ago
- Study notes for "Big Data Analysis with Scala and Spark" on Coursera ☆11 Updated 7 years ago
- Example project which simulates an interesting analytics use case using MemSQL Pipelines. ☆14 Updated 7 years ago