This repo contains a Spark standalone cluster on Docker for anyone who wants to experiment with PySpark by submitting their own applications.
☆37Jun 9, 2023Updated 2 years ago
Alternatives and similar repositories for spark-standalone-cluster
Users that are interested in spark-standalone-cluster are comparing it to the libraries listed below.
- Kotlin extensions / interfaces that extend the Java/Scala implementation/implicits of Smile NLP. Basically a simplification for Kotlin (…☆14Mar 31, 2020Updated 6 years ago
- An end-to-end e-commerce data project, from data source to data warehouse and visualization.☆13Sep 5, 2024Updated last year
- dbtVault + Greenplum demo☆11Feb 19, 2024Updated 2 years ago
- Fully dockerized Data Warehouse (DWH) using Airflow, dbt, and PostgreSQL, with a dashboard using Redash☆25Nov 12, 2022Updated 3 years ago
- Smart pet feeder using object detection and ESP32-cam☆24May 19, 2024Updated last year
- How to build an ACP compliant agent that uses MCP as well!☆11May 6, 2025Updated 11 months ago
- Data Analysis Experiments☆12Nov 2, 2017Updated 8 years ago
- Predicting the Stock Market - Can we do it?☆10Jul 24, 2021Updated 4 years ago
- I wrote this roadmap to help close friends; it is open to suggestions!☆14Sep 9, 2025Updated 7 months ago
- ☆20Aug 27, 2024Updated last year
- ☆19Mar 9, 2026Updated last month
- ☆15Feb 15, 2023Updated 3 years ago
- Scripts that complement the "Optimizing a Data Vault data warehouse on the Snowflake Cloud Data Platform" webinar☆16Oct 8, 2020Updated 5 years ago
- A collection of practice projects on Big Data topics, including Hadoop, Spark, Spark Streaming and Kafka☆11Mar 7, 2019Updated 7 years ago
- A platform that helps developers to better understand CSS through declaration interpretation and may even improve them through suggestion…☆14Jul 3, 2021Updated 4 years ago
- Open episode of the data engineering practice course☆32Jul 2, 2024Updated last year
- ☆41Jul 4, 2022Updated 3 years ago
- 🎭 Natural language web automation with Puppeteer☆14Jun 16, 2024Updated last year
- Spark Notebook docker image☆10Dec 29, 2017Updated 8 years ago
- This repository contains an end-to-end data engineering project using Apache Flink, focused on performing sales analytics. The project de…☆11Nov 18, 2023Updated 2 years ago
- 🚀 A simple javascript template for rapid development of GitHub actions.☆17Feb 24, 2023Updated 3 years ago
- Create streaming data, transfer it to Kafka, modify it with PySpark, and load it into Elasticsearch and MinIO☆65Jul 21, 2023Updated 2 years ago
- ☆32Apr 8, 2026Updated last week
- Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file.☆171Feb 4, 2021Updated 5 years ago
- [abandoned] 🔥 DBF or Don't be Fired is a Visual Studio Code extension that sends message boxes to remind you to avoid common mistakes☆14Jun 30, 2021Updated 4 years ago
- Adaptation of the Postgres adapter for Greenplum☆36Mar 7, 2024Updated 2 years ago
- 🗃 Abre-te Código is a hackathon focused on expanding access to cultural heritage through the development of technologies based on …☆11Oct 24, 2020Updated 5 years ago
- Apache Airflow advanced functionalities examples☆21Mar 22, 2024Updated 2 years ago
- Data warehouse tech stack with PostgreSQL, DBT and Airflow☆20Dec 29, 2025Updated 3 months ago
- An Ansible Role that manages installation and configuration of ClickHouse.☆21Aug 2, 2023Updated 2 years ago
- Project from the CTU Big Data course whose purpose was to compute tf-idf values for the Czech Wikipedia☆10Jul 8, 2014Updated 11 years ago
- Withdraw all your LinkedIn connection invitations at once with no effort☆22May 13, 2020Updated 5 years ago
- Singer.io Target for Amazon Redshift - PipelineWise compatible☆12Sep 20, 2024Updated last year
- ☆11Jan 26, 2023Updated 3 years ago
- This repository has moved into https://github.com/dbt-labs/dbt-adapters☆51Feb 7, 2025Updated last year
- ☆12Oct 2, 2020Updated 5 years ago
- Simple, opinionated Kubernetes-style health checks for Ktor☆34Apr 19, 2022Updated 4 years ago
- A non-distributed reference implementation of Facebook's read-optimized graph data store, TAO☆12May 25, 2020Updated 5 years ago
- Creating Lambda Functions to Ingest Data from APIs with AWS CDK☆13Dec 1, 2021Updated 4 years ago