dttung2905 / flink-at-scale
Tech blogs & talks by companies that run Apache Flink in production
★ 156 · Updated last week
Related projects
Alternatives and complementary repositories for flink-at-scale
- Multi-hop declarative data pipelines ★ 91 · Updated 2 weeks ago
- Flowchart for debugging Spark applications ★ 101 · Updated last month
- ★ 252 · Updated 3 weeks ago
- ★ 43 · Updated 3 months ago
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A… ★ 111 · Updated this week
- The Internals of Delta Lake ★ 182 · Updated last month
- A Python library to support running data quality rules while the Spark job is running ⚡ ★ 163 · Updated last week
- A library that provides useful extensions to Apache Spark and PySpark. ★ 196 · Updated 2 weeks ago
- Repo for everything open table formats (Iceberg, Hudi, Delta Lake) and the overall Lakehouse architecture ★ 30 · Updated 3 weeks ago
- Spark style guide ★ 256 · Updated last month
- The Internals of Spark on Kubernetes ★ 70 · Updated 2 years ago
- Code snippets used in demos recorded for the blog. ★ 29 · Updated last month
- A simple Spark-powered ETL framework that just works ★ 178 · Updated 11 months ago
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ★ 148 · Updated 2 weeks ago
- Delta Lake examples ★ 207 · Updated last month
- Spark on Kubernetes using Helm ★ 34 · Updated 4 years ago
- ★ 78 · Updated last year
- ★ 163 · Updated this week
- Apache Flink Stateful Functions Playground ★ 129 · Updated last year
- A highly efficient daemon for streaming data from Kafka into Delta Lake ★ 369 · Updated last week
- Performance Observability for Apache Spark ★ 197 · Updated this week
- Adapter for dbt that executes dbt pipelines on Apache Flink ★ 84 · Updated 8 months ago
- Avro SerDe for Apache Spark structured APIs. ★ 231 · Updated 3 months ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ★ 96 · Updated last year