Examples for High Performance Spark
☆16 · Oct 25, 2025 · Updated 4 months ago
Alternatives and similar repositories for high-performance-spark-examples
Users who are interested in high-performance-spark-examples compare it to the repositories listed below.
- Spark and Delta Lake Workshop ☆22 · Jun 14, 2022 · Updated 3 years ago
- ☆13 · Jan 6, 2026 · Updated last month
- ☆10 · Jun 29, 2021 · Updated 4 years ago
- Magic to help Spark pipelines upgrade ☆34 · Sep 29, 2024 · Updated last year
- Data pipeline example written in Rust with Polars and DataFusion DataFrame package ☆41 · Mar 12, 2023 · Updated 2 years ago
- DBT and clickhouse test project with dagster ☆12 · Aug 29, 2023 · Updated 2 years ago
- ☆11 · Aug 14, 2014 · Updated 11 years ago
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆13 · May 24, 2024 · Updated last year
- This project sets up a real-time data pipeline utilizing Change Data Capture (CDC) to stream changes from a PostgreSQL database to a Clic… ☆12 · May 9, 2024 · Updated last year
- A stateful distributed balance service with persistent entities via sharding in persistence mode ☆13 · Oct 26, 2018 · Updated 7 years ago
- Local Development of AWS Glue with Docker and Visual Studio Code ☆14 · Nov 29, 2021 · Updated 4 years ago
- This construct builds some elements for you to quickly launch an EMR Serverless application. After submitting the Emr Serverless job, you… ☆11 · Nov 18, 2025 · Updated 3 months ago
- Fulfills a GitHub workflow_job webhooks into a Pub/Sub queue. ☆12 · Mar 13, 2025 · Updated 11 months ago
- A command-line tool that summarizes the size of a codebase by language, showing lines of code with and without comments and blank lines. ☆47 · Updated this week
- Building a poor man's data lake: Exploring the Power of Polars and Delta Lake ☆11 · Dec 6, 2025 · Updated 3 months ago
- Data Observability for Data Engineering, published by Packt Publishing ☆11 · Jan 24, 2025 · Updated last year
- A Kafka metric sink for Apache Spark ☆11 · Apr 13, 2017 · Updated 8 years ago
- Sample Unity game assets and scripts ☆13 · Feb 23, 2015 · Updated 11 years ago
- Rust And Delta Demo. Explanation and walkthrough on delta-rs ☆10 · Aug 21, 2023 · Updated 2 years ago
- CDK Demo implementing an S3 Object custom resource using AWSCustomResource ☆10 · Jun 1, 2020 · Updated 5 years ago
- dbt-databend adapter plugin ☆10 · May 30, 2024 · Updated last year
- KPI Indicator for Power BI ☆10 · Dec 30, 2019 · Updated 6 years ago
- End-to-End ELT data pipeline with Postgres, Airbyte, dbt, Dagster, Snowflake and Metabase ☆11 · Jul 13, 2023 · Updated 2 years ago
- This is the LinkedIn Learning repository for Level Up: Python Data Acquisitions, Prep, & EDA. ☆15 · Mar 4, 2025 · Updated last year
- Flowchart for debugging Spark applications ☆106 · Sep 25, 2024 · Updated last year
- Structured Streaming Machine Learning example with Spark 2.0 ☆94 · Apr 24, 2017 · Updated 8 years ago
- Scrapping made easy... ☆15 · Sep 3, 2016 · Updated 9 years ago
- Repository for Data Engineering Zoomcamp 2024 ☆14 · Mar 25, 2024 · Updated last year
- A PDM plugin to sync the exported files with the project file ☆15 · Sep 6, 2025 · Updated 5 months ago
- Code for Apache Hudi, Apache Iceberg and Delta Lake analysis ☆10 · Feb 2, 2024 · Updated 2 years ago
- Spark-cloud is a set of scripts for starting spark clusters on ec2 ☆12 · Dec 21, 2015 · Updated 10 years ago
- Secure shell command execution MCP server for Claude AI. Enables controlled shell access within specified directories. ☆17 · Aug 19, 2025 · Updated 6 months ago
- Contains example dags and terraform code to create a composer with a node pool to run pods