Dockerizing an Apache Spark Standalone Cluster
☆42Jun 29, 2022Updated 3 years ago
Alternatives and similar repositories for apache-spark-docker
Users who are interested in apache-spark-docker are comparing it to the libraries listed below.
- Dockerizing and Consuming an Apache Livy environment☆13Jun 29, 2022Updated 3 years ago
- A parallel implementation of the bzip2 data compressor in python, this data compression pipeline is using algorithms like Burrows–Wheeler…☆14Jun 29, 2022Updated 3 years ago
- Term Frequency-Inverse Document Frequency from Scratch☆14Sep 19, 2021Updated 4 years ago
- Challenge Data Engineer☆25Jun 13, 2022Updated 4 years ago
- Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT)☆15Jun 13, 2022Updated 4 years ago
- Docker Big Data Tools: This docker-compose file is configured to run multiple nodes. This is a Hadoop Cluster that contains the necessary…☆31Jul 6, 2021Updated 4 years ago
- The goal of this project is to track the expenses of Uber Rides and Uber Eats through data Engineering processes using technologies such …☆123Jun 29, 2022Updated 3 years ago
- Big Data infrastructure with Hadoop, Spark, Hive and NiFi deployed using Docker Compose. https://doi.org/10.5281/zenodo.18968438☆21Mar 11, 2026Updated 3 months ago
- AWS Lambda and CloudFormation code for loading CDC data from Relational databases to Amazon Kinesis using Database Migration Service.☆17Oct 14, 2020Updated 5 years ago
- ☆11Jul 13, 2020Updated 5 years ago
- Lecture: Big Data☆14Oct 27, 2025Updated 7 months ago
- ☆10Jun 3, 2023Updated 3 years ago
- MongoDB movie data model, ETL loader, and queries.☆15Nov 6, 2020Updated 5 years ago
- Base hadoop/spark/bigdata image with advanced config loading scripts.☆11Nov 3, 2020Updated 5 years ago
- ☆21Jul 3, 2019Updated 6 years ago
- Morphometric taxonomy of Central Europe☆40Apr 7, 2026Updated 2 months ago
- Zeppelin docker☆16Nov 16, 2020Updated 5 years ago
- This repo is an approach to TDD in machine learning model operation. It covers project structure, testing essentials using pytest with Gi…☆15Dec 2, 2020Updated 5 years ago
- ☆11Sep 29, 2014Updated 11 years ago
- Just a boilerplate for PySpark and Flask☆36Aug 2, 2018Updated 7 years ago
- A Python library to simplify batch requests to AWS Services☆12Apr 25, 2020Updated 6 years ago
- Rainfall is an extensible java framework to implement custom DSL based stress and performance tests☆12Mar 31, 2026Updated 2 months ago
- Dockerizing a Python Script for Web Scraping and consume the scraped data using FastApi (www.metroscubicos.com)☆15Dec 16, 2021Updated 4 years ago
- Repository for Apache Spark course at Team Data Science☆17Oct 23, 2020Updated 5 years ago
- An example of integration between angular, ionic, and require, inspired by directory-angular-ionic of Christophe Coenraets☆25Jun 6, 2016Updated 10 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code☆14Nov 29, 2021Updated 4 years ago
- Solar Resource Assessment in Python☆12Feb 17, 2026Updated 3 months ago
- The solution can help reduce AWS operational costs for both development and production environments.☆11Oct 1, 2017Updated 8 years ago
- Metabase Impala Driver☆11May 28, 2024Updated 2 years ago
- A collection of data analysis projects done using PySpark via Jupyter notebooks.☆10Oct 8, 2022Updated 3 years ago
- Group project for the WorldQuant University module, risk management.☆13Feb 3, 2019Updated 7 years ago
- Rasa Chatbot using Django backend and Sockets for communication☆12Dec 8, 2022Updated 3 years ago
- ☆16Jun 3, 2026Updated last week
- ☆14Sep 14, 2021Updated 4 years ago
- Distributed stock price forecasting system to predict S&P 500 stock prices.☆11Nov 12, 2021Updated 4 years ago
- Power Plant ML Pipeline Application - Apache Spark☆12Dec 12, 2016Updated 9 years ago
- A docker image with a pre-configured Hive Metastore and a Spark ThriftServer☆19Jan 20, 2020Updated 6 years ago
- A simple CDK app written in Kotlin using Gradle DSL☆12Dec 28, 2018Updated 7 years ago
- Multi-container environment with Hadoop, Spark and Hive☆235May 5, 2025Updated last year