Spark all the ETL Pipelines
☆37 Aug 2, 2023 Updated 2 years ago
Alternatives and similar repositories for SparkETL
Users that are interested in SparkETL are comparing it to the libraries listed below
- ☆10 Jun 3, 2023 Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆30 Feb 18, 2026 Updated last week
- Deploy A/B testing infrastructure in a containerized microservice architecture for Machine Learning applications. ☆40 Jan 10, 2025 Updated last year
- ☆10 Jan 28, 2025 Updated last year
- Zabbix Template (>2.4) and resources useful to monitor zfs on linux (zpool) ☆13 Jan 26, 2017 Updated 9 years ago
- A lightweight library for pre-processing images for pre-trained keras models ☆14 Aug 17, 2025 Updated 6 months ago
- Presents a general overview of Spark and its application to big data manipulation using the Python and SQL languages. It will explain … ☆10 Mar 10, 2022 Updated 3 years ago
- The Data Product Specification ☆11 Jan 28, 2025 Updated last year
- This is the repo where the Turkish translation of Learn Go with test-driven development is in progress. ☆13 Jan 5, 2024 Updated 2 years ago
- Revisiting End-to-End Speech-to-Text Translation From Scratch ☆13 Feb 21, 2023 Updated 3 years ago
- https://www.packtpub.com/books/info/authors/tomasz-lelek ☆12 Oct 30, 2021 Updated 4 years ago
- Get map value via dot-delimited path or nil. ☆30 Sep 9, 2014 Updated 11 years ago
- For my midterm project of the Machine Learning Zoomcamp, I decided to work in the Open Bioinformatics Research Project proposed by Data P… ☆10 Nov 2, 2021 Updated 4 years ago
- A lightweight Snowflake emulator built with Go and DuckDB for local development and testing ☆24 Jan 19, 2026 Updated last month
- Klogd2 is a new version of klogd, which is nothing but a simple program to stream Syslog messages to a Kafka server ☆21 Sep 8, 2012 Updated 13 years ago
- DuckDB Copilot Extension ☆10 Jan 12, 2026 Updated last month
- Complete Python-3 programming tutorials from beginner to advanced level. ☆11 Jul 28, 2019 Updated 6 years ago
- A client for your Portus instance ☆11 Apr 17, 2019 Updated 6 years ago
- This repo demonstrates an Apache Arrow Flight server implementation in Kubernetes. ☆12 Oct 25, 2024 Updated last year
- Source code for http://allaboutscala.com/scala-cheatsheet/ ☆10 Jun 12, 2018 Updated 7 years ago
- A foreign data wrapper for PostgreSQL allowing easy accessing of Apache ORC formatted data files. ☆11 Sep 21, 2020 Updated 5 years ago
- Find duplicate files on your computer ☆21 Jun 6, 2020 Updated 5 years ago
- Built a Data Pipeline for a Retail store using AWS services that collects data from its transactional database (OLTP) in Snowflake and tr… ☆11 May 25, 2023 Updated 2 years ago
- A Demo for real-time voice conversion based on Mel-GAN ☆12 Sep 13, 2021 Updated 4 years ago
- Automated TPC-DS and TPC-H benchmark for Apache Hive LLAP ☆10 Jul 18, 2022 Updated 3 years ago
- Streaming analytics project with eventsim and Kafka ☆13 Dec 23, 2022 Updated 3 years ago
- Complete HomeServer Utilities ☆15 Mar 17, 2025 Updated 11 months ago
- ☆13 Updated this week
- ☆45 Updated this week
- A compendium of data projects and associated blog posts ☆10 Nov 4, 2019 Updated 6 years ago
- ☆14 May 17, 2023 Updated 2 years ago
- ☆12 Dec 6, 2021 Updated 4 years ago
- Clickstream Faker Provider for Python. ☆11 Apr 2, 2022 Updated 3 years ago
- AI: Autoencoder for HTTP Log Anomaly Detection ☆14 Jan 20, 2019 Updated 7 years ago
- Kubernetes operator for managing the lifecycle of Apache Spark applications on Kubernetes. ☆18 May 30, 2024 Updated last year
- ☆51 Updated this week
- ☆12 May 27, 2025 Updated 9 months ago
- COGNIZANCE - Machine Learning Course @ Coding Ninjas - January'18 ☆10 Oct 6, 2018 Updated 7 years ago
- Notebooks on Time series forecasting ☆13 Apr 19, 2019 Updated 6 years ago