Spark all the ETL Pipelines
☆37Aug 2, 2023Updated 2 years ago
Alternatives and similar repositories for SparkETL
Users who are interested in SparkETL are comparing it to the libraries listed below.
- ☆10Jun 3, 2023Updated 2 years ago
- ETL (Extract, Transform and Load) with the Spark Python API (PySpark) and Hadoop Distributed File System (HDFS)☆17Dec 18, 2018Updated 7 years ago
- My personal page, CV and blog☆15Updated this week
- Built a Data Pipeline for a Retail store using AWS services that collects data from its transactional database (OLTP) in Snowflake and tr…☆12May 25, 2023Updated 2 years ago
- Deploy A/B testing infrastructure in a containerized microservice architecture for Machine Learning applications.☆40Jan 10, 2025Updated last year
- Identify and tokenize sensitive data automatically using Cloud DLP and Dataflow☆45Oct 27, 2025Updated 5 months ago
- In-browser data analysis using SQL | Powered by duckdb-wasm☆26Dec 21, 2025Updated 3 months ago
- Final project completed after participating in the Data Engineering Zoomcamp☆11Apr 4, 2022Updated 4 years ago
- Apache Spark Guide☆35Feb 1, 2022Updated 4 years ago
- ☆16Mar 9, 2026Updated last month
- Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes.☆18May 30, 2024Updated last year
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆31Updated this week
- An operator for managing Alluxio system on Kubernetes cluster☆13Jan 9, 2024Updated 2 years ago
- Deploys a Lakehouse Architecture Solution☆62Feb 24, 2026Updated last month
- My *nix dotfiles☆12Jul 4, 2025Updated 9 months ago
- Distributed System in Docker with Apache Kafka and Spark for big data streaming and visualisation (NodeJS, TypeScript, React, NestJS, Jav…☆24Apr 28, 2019Updated 6 years ago
- Resources for the Udemy Course - Azure Databricks & Spark Core For Data Engineers(Python/SQL) by Ramesh Retnasamy☆33Aug 23, 2024Updated last year
- Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.☆16Jan 4, 2026Updated 3 months ago
- Copy My Writing is a command-line tool for generating content based on your personal writing style.☆11Oct 12, 2025Updated 6 months ago
- JavaScript Style expression parser☆10Oct 13, 2020Updated 5 years ago
- Calico API☆24Updated this week
- Get map value via dot-delimited path or nil.☆30Sep 9, 2014Updated 11 years ago
- ☆12Aug 26, 2024Updated last year
- Zabbix Template (>2.4) and resources useful to monitor zfs on linux (zpool)☆13Jan 26, 2017Updated 9 years ago
- CLI secret management☆16Mar 19, 2026Updated 3 weeks ago
- Apache Polaris Tools, additional tooling for Apache Polaris☆26Updated this week
- A repository to store recipes, custom sources, transformations and other things to make your DataHub experience magical☆13Sep 23, 2022Updated 3 years ago
- OpenKruise Helm Charts.☆16Apr 4, 2026Updated last week
- Automated TPC-DS and TPC-H benchmark for Apache Hive LLAP☆10Jul 18, 2022Updated 3 years ago
- Source code for TPCx-BB benchmark for Hive and SparkSQL on scale factor of 300 GB☆10Jun 26, 2018Updated 7 years ago
- The Data Product Specification☆11Jan 28, 2025Updated last year
- Here I will be exploring various tools and methods that are used in data engineering process with Python.☆21Jan 4, 2021Updated 5 years ago
- Stardog Visual Studio Code Extensions☆17Updated this week
- Bigdata on Kubernetes, Published by Packt☆36Oct 1, 2024Updated last year
- Generate k8s diagrams of your cluster using D2☆43Mar 15, 2026Updated 3 weeks ago
- ☆10Jan 28, 2025Updated last year
- Open Data Stack Projects: Examples of End to End Data Engineering Projects☆90Jun 25, 2023Updated 2 years ago
- Run ansible-lint with reviewdog 🐕☆16Updated this week
- A lightweight Snowflake emulator built with Go and DuckDB for local development and testing☆31Jan 19, 2026Updated 2 months ago