☆40Jan 24, 2023Updated 3 years ago
Alternatives and similar repositories for docker-spark-airflow
Users that are interested in docker-spark-airflow are comparing it to the libraries listed below.
- Docker with Airflow + Postgres + Spark cluster + JDK (spark-submit support) + Jupyter Notebooks☆24Apr 2, 2022Updated 4 years ago
- Docker with Airflow and Spark standalone cluster☆264Aug 5, 2023Updated 2 years ago
- Building a Modern Data Lake with Minio, Spark, Airflow via Docker.☆23May 11, 2024Updated 2 years ago
- Delta-Lake, ETL, Spark, Airflow☆49Oct 9, 2022Updated 3 years ago
- Project repository of Apache Airflow, deployed on Docker in Amazon EC2 via GitLab.☆15Sep 3, 2021Updated 4 years ago
- ☆12Mar 17, 2022Updated 4 years ago
- Demo of using Airflow☆11Jun 24, 2022Updated 3 years ago
- Dockerizing and Consuming an Apache Livy environment☆13Jun 29, 2022Updated 3 years ago
- Data Guy Story commandline☆11Dec 2, 2022Updated 3 years ago
- The practical use-cases of how to make your Machine Learning Pipelines robust and reliable using Apache Airflow.☆52Dec 8, 2022Updated 3 years ago
- ☆22Mar 21, 2023Updated 3 years ago
- An example of triples extraction with PoS-tags using ReVerb☆17May 23, 2017Updated 9 years ago
- ☆12Mar 12, 2021Updated 5 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment.☆38Mar 29, 2021Updated 5 years ago
- This is a boilerplate which has dependencies for pyspark(3.3.0) mongo(>4.x) connectivity☆10May 3, 2024Updated 2 years ago
- Handle linguistic corpus and convert it to use NLP tools☆21Jul 5, 2013Updated 12 years ago
- An attempt to use natural language processing techniques in order to aid stock price forecasts.☆15Oct 4, 2017Updated 8 years ago
- This project contains an end-to-end build of e-commerce data, from data source into data warehouse and visualization.☆13Sep 5, 2024Updated last year
- ☆10Feb 19, 2022Updated 4 years ago
- Code for blog at https://www.startdataengineering.com/post/python-for-de/☆106May 26, 2026Updated 2 weeks ago
- Detailed notes and homeworks from 2025 Data Engineering Zoomcamp by Datatalks.Club☆56Mar 10, 2025Updated last year
- Autocomplete / Autofill Text field with Dropdown menu to choose between suggested values from a given list.☆14Feb 23, 2024Updated 2 years ago
- Generate cloud-init ready vm images via packer and deploy these via terraform.☆16Jan 6, 2026Updated 5 months ago
- A Python package to help Databricks Unity Catalog users to read and query Delta Lake tables with Polars, DuckDb, or PyArrow.☆27Mar 25, 2024Updated 2 years ago
- Typings for Confluent Kafka Python Client☆27Apr 11, 2026Updated last month
- ☆19Feb 25, 2022Updated 4 years ago
- ☆17Apr 1, 2025Updated last year
- A boilerplate for authoring npm modules, with tests and linting.☆10Jun 8, 2017Updated 9 years ago
- ☆25Mar 15, 2024Updated 2 years ago
- Example of how to leverage Apache Spark distributed capabilities to call REST-API using a UDF☆51Oct 11, 2022Updated 3 years ago
- A Python package extending pandas with helper functions for simpler exploratory data analysis and data wrangling.☆10Feb 6, 2025Updated last year
- A framework of open-source technologies to design real-time machine learning systems☆30Mar 6, 2023Updated 3 years ago
- Integrating Apache Airflow, dbt, Great Expectations and Apache Superset to develop a modern open source data stack.☆18Jun 19, 2022Updated 3 years ago
- Data Analysis and Image Processing Python Course☆12Nov 4, 2014Updated 11 years ago
- ☆10Jun 2, 2025Updated last year
- Generate and Compare Debezium CDC (Change Data Capture) Avro Schema, directly from your Database.☆27Updated this week
- Dask on ECS Fargate☆14Sep 23, 2019Updated 6 years ago
- Selenium Grid in ECS using Fargate Spot Containers☆14Feb 1, 2023Updated 3 years ago
- Deploy a Machine Learning Model as an API on AWS☆15Jun 21, 2022Updated 3 years ago