ThoughtWorksInc / streaming-data-pipeline
Streaming pipeline repo for data engineering training program
☆9 · Updated 4 years ago
Related projects:
- AWS Big Data Certification ☆24 · Updated last year
- Simple machine learning in Python/Tensorflow with model saving ☆14 · Updated 7 years ago
- Pipelines Example Applications ☆15 · Updated 4 years ago
- Deploy an IMDB sentiment analysis model using kubernetes ☆13 · Updated last year
- Terraform module for a PostgreSQL-backed Apache Airflow instance ☆24 · Updated 6 years ago
- An example PySpark project with pytest ☆17 · Updated 6 years ago
- Deploy of Airflow 2.0 using ECS Fargate and AWS CDK. ☆14 · Updated 2 years ago
- Pachyderm/MLeap team up to provide versioned datasets + models ☆10 · Updated 7 years ago
- Spark package to "plug" holes in data using SQL based rules ⚡️ 🔌 ☆28 · Updated 4 years ago
- Datasets for CS109 ☆28 · Updated 10 years ago
- A cookiecutter template for Apache Spark applications written in Scala ☆10 · Updated 5 years ago
- Debussy is an opinionated Data Architecture and Engineering framework, enabling data analysts and engineers to build better platforms and… ☆28 · Updated last year
- Docker compose files for various kafka stacks ☆33 · Updated 6 years ago
- Python Streaming Pipelines with Beam on Flink - Demo ☆14 · Updated last year
- ☆23 · Updated 5 years ago
- Labs and data files for a full-day Spark workshop ☆24 · Updated 11 months ago
- Generalized project for running Airflow DAGs, with possibility of skipping tasks already done for some set of input parameters. ☆14 · Updated last year
- Docker-izing Data Science Applications CodeLab for QCon AI 2018 ☆13 · Updated 6 years ago
- An umbrella project for multiple implementations of model serving ☆46 · Updated 7 years ago
- A curated list of awesome Databricks resources, including Spark ☆14 · Updated 2 months ago
- Apache Airflow CI pipeline ☆18 · Updated 5 years ago
- A place where I build Dockerfiles used in other projects. Built in CircleCI. ☆19 · Updated 4 years ago
- A Soccer Dashboard created by scraping EPL website using Akka backend and ReactJS frontend and IBM Cloudant for object storage. IBM Cloud… ☆20 · Updated 2 years ago
- Examples of all Machine Learning Algorithm in Apache Spark ☆15 · Updated 6 years ago
- Sandbox for Apache nifi ☆24 · Updated 2 years ago
- A boilerplate project for Azure Big Data PaaS services ☆14 · Updated last year
- An ML project template with sensible defaults ☆37 · Updated 2 years ago
- This repository contains a recipe for bootstrapping a climate analysis application using Apache Pinot and Superset ☆20 · Updated 4 years ago
- Code repository for Fast Data Processing Systems with SMACK Stack by Packt ☆18 · Updated last year
- A tutorial on Apache Spark Unit Testing ☆37 · Updated 8 years ago