This repo contains a Spark standalone cluster on Docker for anyone who wants to experiment with PySpark by submitting their own applications.
☆37Jun 9, 2023Updated 2 years ago
Alternatives and similar repositories for spark-standalone-cluster
Users that are interested in spark-standalone-cluster are comparing it to the libraries listed below.
- Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker.☆507Nov 7, 2025Updated 6 months ago
- This project builds an end-to-end e-commerce data pipeline from data source to data warehouse and visualization.☆13Sep 5, 2024Updated last year
- ☆24Dec 21, 2020Updated 5 years ago
- Smart pet feeder using object detection and ESP32-cam☆26May 19, 2024Updated last year
- Code & Items for the RAG Series tutorials posted to Medium☆31Jan 26, 2025Updated last year
- ☆22Mar 9, 2026Updated 2 months ago
- Data Vault 2.0: Code generation, Vertica, Airflow☆13Nov 20, 2019Updated 6 years ago
- ☆20Mar 9, 2026Updated 2 months ago
- ☆21Oct 9, 2025Updated 6 months ago
- ☆15Feb 15, 2023Updated 3 years ago
- Scripts that complement the "Optimizing a Data Vault data warehouse on the Snowflake Cloud Data Platform" webinar☆16Oct 8, 2020Updated 5 years ago
- 100 Days of ML Coding Challenge☆15Mar 21, 2019Updated 7 years ago
- End-to-end data engineer project☆24Aug 17, 2023Updated 2 years ago
- A collection of practice projects on big data topics, including Hadoop, Spark, Spark Streaming, and Kafka☆11Mar 7, 2019Updated 7 years ago
- A platform that helps developers to better understand CSS through declaration interpretation and may even improve them through suggestion…☆14Jul 3, 2021Updated 4 years ago
- Open episode of the data engineering practice course☆32Jul 2, 2024Updated last year
- ☆41Jul 4, 2022Updated 3 years ago
- Spark Notebook docker image☆10Dec 29, 2017Updated 8 years ago
- Indonesian BERT Fine Tuning News Classification☆15Jun 8, 2020Updated 5 years ago
- Python CLI tool to reformat a text into a specified number of words per line/characters per line.☆25Jan 25, 2016Updated 10 years ago
- 🚀 A simple javascript template for rapid development of GitHub actions.☆17Feb 24, 2023Updated 3 years ago
- Create streaming data, transfer it to Kafka, transform it with PySpark, and load it into ElasticSearch and MinIO☆65Jul 21, 2023Updated 2 years ago
- Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file.☆170Feb 4, 2021Updated 5 years ago
- Hadoop-Hive-Spark cluster + Jupyter on Docker☆85Jan 2, 2025Updated last year
- Fully reproducible, Dockerized, step-by-step, tutorial on how to mock a "real-time" Kafka data stream from a timestamped csv file. Detai…☆40Nov 15, 2021Updated 4 years ago
- ☆11Dec 13, 2022Updated 3 years ago
- Trino dbt demo project to mix and load BigQuery data with and in a local PostgreSQL database☆76Sep 8, 2021Updated 4 years ago
- Data warehouse tech stack with PostgreSQL, DBT and Airflow☆20Dec 29, 2025Updated 4 months ago
- Project from the CTU Big Data course whose purpose was to compute tf-idf values for the Czech Wikipedia☆10Jul 8, 2014Updated 11 years ago
- A repository of blogs/videos that presents how Apache Iceberg is being used in Production by various orgs☆20Jul 31, 2023Updated 2 years ago
- ☆12Oct 2, 2020Updated 5 years ago
- A non-distributed reference implementation of Facebook's read-optimized graph data store, TAO☆12May 25, 2020Updated 5 years ago
- Simple and secure newline delimited JSON stream parser☆24Feb 1, 2025Updated last year
- Creating Lambda Functions to Ingest Data from APIs with AWS CDK☆13Dec 1, 2021Updated 4 years ago
- A variational auto-encoder (VAE) framework with a new type of prior "Variational Mixture of Posteriors" prior, or VampPrior for short.☆10Apr 7, 2021Updated 5 years ago
- The open-source Useful SDK. One python decorator in the Useful library allows for full observability of Python functions within an ETL.☆19Jan 11, 2024Updated 2 years ago
- Simple Chrome extension that serves as a quick launcher for all your social media profiles in one click.☆16Apr 30, 2020Updated 6 years ago
- ☆23Feb 5, 2024Updated 2 years ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro…☆29Jul 7, 2022Updated 3 years ago