The source code for the book Modern Data Engineering with Apache Spark
☆41Jul 26, 2022Updated 3 years ago
Alternatives and similar repositories for spark-moderndataengineering
Users who are interested in spark-moderndataengineering are comparing it to the libraries listed below.
- Visits sessionization pipeline used for the talk☆13May 28, 2024Updated 2 years ago
- A series of workshop modules introducing Feast feature store.☆18May 31, 2022Updated 4 years ago
- Don't Panic. This guide will help you when it feels like the end of the world.☆31Feb 7, 2026Updated 4 months ago
- A platform and cloud-based service for data sharing based on the Delta Sharing protocol.☆21Jun 12, 2024Updated 2 years ago
- Model Context Protocol (MCP) server to interact with gRPC services using the grpcurl tool☆17Mar 5, 2025Updated last year
- ☆18Nov 2, 2023Updated 2 years ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis☆10Feb 2, 2024Updated 2 years ago
- A Gentle introduction to Machine Learning with Apache Spark☆11Mar 2, 2026Updated 3 months ago
- BigData Course☆13Apr 7, 2026Updated 2 months ago
- ☆21Aug 31, 2025Updated 9 months ago
- Discover Bluemix, IBM Cloud Platform, through a set of hands-on labs.☆13Feb 13, 2024Updated 2 years ago
- Unity Catalog AI Model Context Protocol Server☆16Mar 28, 2025Updated last year
- Practical exercises (of varying difficulty) for the MySQL relational database management system.☆10Feb 7, 2022Updated 4 years ago
- Covid19 and Iowa Liquor Sales analysis at BigQuery using dbt, Airflow, Marquez, Google Cloud and other modern data stack tools☆14Jun 18, 2022Updated 3 years ago
- BSR's new public API. Currently in development.☆21Jun 10, 2026Updated last week
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in…☆96May 11, 2026Updated last month
- ☆16Oct 21, 2024Updated last year
- Code examples for functional programming☆21Mar 3, 2025Updated last year
- Big Data infrastructure with Hadoop, Spark, Hive and NiFi deployed using Docker Compose. https://doi.org/10.5281/zenodo.18968438☆21Mar 11, 2026Updated 3 months ago
- Profiling Spark Applications for Performance Comparison and Diagnosis☆16Nov 11, 2018Updated 7 years ago
- Course support material, Facultad de Minas, Universidad Nacional de Colombia☆20Jun 9, 2026Updated last week
- ☆116Jan 15, 2025Updated last year
- Plugin for Intake to read from SQL servers☆15May 29, 2023Updated 3 years ago
- Managing Data as a Product, published by Packt☆23Nov 30, 2024Updated last year
- A Flat Data GitHub Action demo repo☆15Jan 1, 2024Updated 2 years ago
- ISS Tracker for the Cardputer Adv☆47Jan 19, 2026Updated 4 months ago
- Write property based tests easily on spark dataframes☆21Jan 19, 2024Updated 2 years ago
- Generate Parquet Files☆14Apr 23, 2026Updated last month
- Trino (f.k.a PrestoSQL) dialect for SQLAlchemy.☆25May 5, 2022Updated 4 years ago
- Hands-on Labs (HOLs) and presentations for Microservices, Serverless and Containers readiness.☆13Dec 2, 2017Updated 8 years ago
- ☆13Feb 19, 2025Updated last year
- Examples of using Neo4j with R.☆22Jan 18, 2016Updated 10 years ago
- Official Dockerfile for Delta Lake☆63Feb 24, 2026Updated 3 months ago
- Statically analyze sources and extract information about called or exported library functions in Python applications☆21Apr 25, 2024Updated 2 years ago
- ☆14Dec 19, 2022Updated 3 years ago
- Hibernate and JPA course☆17May 25, 2017Updated 9 years ago
- The CoreS3 is a new generation AIoT development platform released by M5Stack.☆19Sep 15, 2023Updated 2 years ago
- ☆11Oct 6, 2023Updated 2 years ago
- Helm Chart for deploying Spark history server in Amazon EKS for S3 Spark Event Logs☆29Apr 4, 2026Updated 2 months ago