A ready-to-go Big Data cluster (Hadoop + Hadoop Streaming + Spark + PySpark) with Docker and Docker Swarm!
☆23 · May 20, 2025 · Updated 9 months ago
Alternatives and similar repositories for docker-big-data-cluster
Users who are interested in docker-big-data-cluster are comparing it to the repositories listed below
- Run Hadoop Cluster within Docker Containers. ☆16 · Mar 6, 2025 · Updated 11 months ago
- Docker multi-nodes Hadoop cluster with Spark 2.4.1 on Yarn ☆51 · Dec 7, 2020 · Updated 5 years ago
- Database practical training platform (front-end project) ☆15 · Feb 27, 2023 · Updated 3 years ago
- Mouse replacement software to use computers with your eyes with support of a compatible Tobii Eye Tracker. ☆10 · Jul 5, 2020 · Updated 5 years ago
- Simple PCB for Wemos D1 Mini ESP8266 and an A4988 stepper driver ☆11 · Sep 30, 2023 · Updated 2 years ago
- sql engine for csv files ☆16 · Nov 3, 2016 · Updated 9 years ago
- A work-in-progress book on Dask ☆12 · Jul 15, 2023 · Updated 2 years ago
- Udacity Data Engineering Nanodegree Project 3 ☆12 · Jul 14, 2019 · Updated 6 years ago
- Rust HAL repo ☆12 · Apr 25, 2022 · Updated 3 years ago
- Combination of Dockerized Hortonworks projects and other Hadoop ecosystem components ☆10 · Oct 11, 2019 · Updated 6 years ago
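- Converts animated-image compressed archives downloaded from Pixiv into GIF images. ☆10 · Nov 25, 2017 · Updated 8 years ago
- Wangdao postgraduate entrance exam (王道考研) operating systems lecture videos: https://www.bilibili.com/video/BV1YE411D7nH ☆11 · Mar 20, 2021 · Updated 4 years ago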
- Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file. ☆171 · Feb 4, 2021 · Updated 5 years ago
- Quickly scans open ports on large networks and brute-forces the login mechanisms of discovered services ☆12 · Aug 24, 2019 · Updated 6 years ago
- Lecture: Big Data ☆14 · Oct 27, 2025 · Updated 4 months ago
- Crawlyx is an open-source command-line interface (CLI) based web crawler built using Node.js. It is designed to crawl websites and extrac… ☆13 · Apr 12, 2025 · Updated 10 months ago
- PDF, CDF, and percent-point/quantile functions for the normal and Student’s t distributions ☆12 · Jan 7, 2026 · Updated last month
- ☆21 · Jul 15, 2015 · Updated 10 years ago
- Various text analytics tutorials ☆13 · May 16, 2017 · Updated 8 years ago
- Document parameters using comments ☆10 · Aug 6, 2021 · Updated 4 years ago
- Kubernetes stack deployment with Microk8s and ArgoCD ☆15 · Updated this week
- Assessing Disparate Impacts of Personalized Interventions: Identifiability and Bounds ☆11 · Oct 28, 2019 · Updated 6 years ago
- OCR as a service ☆15 · Dec 11, 2016 · Updated 9 years ago
- JSON Serde for Hive ☆21 · Oct 13, 2011 · Updated 14 years ago
- Provide functionality to build statistical models to repair dirty tabular data in Spark ☆12 · Apr 21, 2023 · Updated 2 years ago
- Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data ☆49 · Dec 2, 2023 · Updated 2 years ago
- plan, design and implement enterprise data infrastructure solutions and create the blueprints for an organization’s data management syste… ☆14 · Jun 25, 2023 · Updated 2 years ago
- ✨ Popular linux distributions configured with systemd, sshd and ttyd ✨ ☆13 · Oct 19, 2023 · Updated 2 years ago
- ☆10 · Feb 28, 2020 · Updated 6 years ago
- DEPRECATED repo for Manning book Deep Learning with Structured Data - please see https://github.com/ryanmark1867/deep_learning_for_struct… ☆12 · May 17, 2020 · Updated 5 years ago
- A low-code editor with the full power of Flutter. Created by @sanihaq for @flutter🌸 ☆11 · Aug 1, 2021 · Updated 4 years ago
- A group of examples based on the CSE pipeline. ☆10 · May 13, 2013 · Updated 12 years ago
- Repository for building Apache Ozone Docker images ☆20 · Feb 10, 2026 · Updated 3 weeks ago
- ☆12 · Apr 18, 2024 · Updated last year
- An NTLM, NTLM2SR, and NTLMv2 authenticating HTTP proxy ☆10 · Dec 5, 2015 · Updated 10 years ago
- Machine Learning Quick Reference, published by Packt ☆17 · Jan 30, 2023 · Updated 3 years ago
- streaming data pipeline platform ☆29 · Jan 4, 2026 · Updated last month
- Scala embedded universal probabilistic programming language ☆11 · Apr 15, 2021 · Updated 4 years ago
- OpenJur is an open-source administrative system for lawyers and law firms of any size. ☆15 · Jun 2, 2016 · Updated 9 years ago