Basic framework utilities to quickly start writing production-ready Apache Spark applications
☆36 Dec 15, 2024 Updated last year
Alternatives and similar repositories for spark-utils
Users who are interested in spark-utils are comparing it to the libraries listed below.
- Apache Kafka Overview ☆12 Jun 9, 2023 Updated 2 years ago
- Spark job for compacting avro files together ☆12 Jan 26, 2018 Updated 8 years ago
- Zipkin tracing instrumentation for Akka ☆10 Dec 26, 2020 Updated 5 years ago
- Transporter for integrating OpenLineage with OpenMetadata ☆17 Sep 10, 2025 Updated 5 months ago
- The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL ☆16 Oct 24, 2022 Updated 3 years ago
- low-level helpers for Apache Spark libraries and tests ☆16 Dec 29, 2018 Updated 7 years ago
- ☆24 Oct 3, 2023 Updated 2 years ago
- A tool to get better debug info on spark's memory usage ☆42 Aug 21, 2019 Updated 6 years ago
- Examples of Spark 3.0 ☆45 Nov 11, 2020 Updated 5 years ago
- Big Data Processing Framework - Unified Data API or SQL on Any Storage ☆251 Jul 10, 2025 Updated 7 months ago
- Apache Spark Interview Question and Answers ☆21 Oct 13, 2020 Updated 5 years ago
- HDF masterclass materials ☆29 Mar 28, 2016 Updated 9 years ago
- ☆43 Feb 20, 2016 Updated 10 years ago
- ☆10 Jun 29, 2021 Updated 4 years ago
- Spark-Radiant is Apache Spark Performance and Cost Optimizer ☆25 Dec 31, 2024 Updated last year
- This repository has the code from the text and the videos for "Introduction to Programming and Problem Solving using Scala". ☆30 Feb 11, 2018 Updated 8 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆92 Mar 5, 2024 Updated last year
- Instant search for and access to many datasets in Pyspark. ☆34 Oct 6, 2022 Updated 3 years ago
- Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:… ☆72 Jan 1, 2023 Updated 3 years ago
- ☆32 Mar 21, 2018 Updated 7 years ago
- Wingolfsplattform (Wingolf platform). AK Internet (Internet working group) of the Wingolfsbund. ☆14 Dec 31, 2022 Updated 3 years ago
- GitHub Key Verification Helper ☆14 Jul 23, 2014 Updated 11 years ago
- For the Korean localization of the Affinity series. ☆14 Feb 9, 2026 Updated 3 weeks ago
- A real-time data replication platform that "unbundles" the receiving, transforming, and transport of data streams. ☆82 Feb 16, 2024 Updated 2 years ago
- This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spa… ☆816 Updated this week
- Proxy that reads from Redis (or SSDB) and writes to both; used for Redis <=> SSDB migration in production ☆12 Aug 22, 2016 Updated 9 years ago
- A clean online résumé (CV) ☆13 Jun 6, 2024 Updated last year
- [Deprecated] Boxmeup is a web and mobile application to help users keep track of what they have in their containers and how to find items… ☆16 May 11, 2022 Updated 3 years ago
- Script Execution service ☆12 Nov 21, 2016 Updated 9 years ago
- Collect and aggregate on spark events for profitz ☆10 Apr 22, 2022 Updated 3 years ago
- Apache Spark ETL Utilities ☆39 Oct 23, 2024 Updated last year
- ☆12 Feb 23, 2026 Updated last week
- Codespace with Airflow and the Astro CLI ☆11 May 23, 2023 Updated 2 years ago
- This is a list of YAML file examples for Docker, Kubernetes, Ansible. Also includes a Python script. ☆10 Jan 12, 2021 Updated 5 years ago
- ☆10 Sep 9, 2019 Updated 6 years ago
- This repository will help you learn Databricks concepts through examples. It will include all the important topics which… ☆104 Sep 26, 2025 Updated 5 months ago
- POC for all the stack of big data (kafka, spark, cassandra, hdfs, docker, springboot) ☆12 Dec 16, 2022 Updated 3 years ago
- 🗣 A command line tool that can generate English verbal descriptions for Scala source files or snippets. ☆11 Mar 5, 2018 Updated 7 years ago
- Scala solutions for hackerrank ☆11 Nov 20, 2016 Updated 9 years ago