File compaction tool that runs on top of the Spark framework.
☆59Apr 17, 2019Updated 6 years ago
Alternatives and similar repositories for spark-compaction
Users who are interested in spark-compaction are comparing it to the libraries listed below.
- Remedy small files by combining them into larger ones.☆23Oct 31, 2018Updated 7 years ago
- Integration of Iceberg table management into Spark SQL☆11Jan 21, 2020Updated 6 years ago
- Build configuration-driven ETL pipelines on Apache Spark☆161Oct 4, 2022Updated 3 years ago
- Atomix Jepsen tests☆14Feb 7, 2017Updated 9 years ago
- The Toolchain agnostic Runtime AI platform☆24Feb 7, 2023Updated 3 years ago
- Kafka Examples repository.☆44Feb 5, 2019Updated 7 years ago
- ☆26Dec 18, 2019Updated 6 years ago
- Enables automatic refactoring and linting of Maven projects written in Scala using Scalafix.☆26Mar 14, 2026Updated last week
- ☆15Jul 28, 2017Updated 8 years ago
- Liquibase extension to add Impala Database support☆24Mar 8, 2022Updated 4 years ago
- Scripts to demonstrate VPC Service Controls between tenant and shared projects☆12Jun 11, 2019Updated 6 years ago
- Hadoop InputFormat for http://druid.io/☆10Oct 26, 2016Updated 9 years ago
- Mirror of Apache Beam☆10Jan 27, 2021Updated 5 years ago
- The Schema Repo is a RESTful web service for storing and serving mappings between schema identifiers and schema definitions.☆154Jul 7, 2022Updated 3 years ago
- Instructions for setting up Kerberos, Zookeeper, and Kafka with SASL☆16Jan 22, 2018Updated 8 years ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark☆29Nov 4, 2024Updated last year
- A small crawler/SDK for retrieving real-time information about transport in Paris☆18Oct 24, 2015Updated 10 years ago
- ☆12Apr 7, 2025Updated 11 months ago
- [student project] UI to run SQL on Delta Lake tables and visualize the variations of the result among tables versions☆12Apr 21, 2020Updated 5 years ago
- Multi-stage, config-driven, SQL-based ETL framework using PySpark☆26Sep 16, 2019Updated 6 years ago
- Demo for making use of RATP's real-time API☆13May 3, 2017Updated 8 years ago
- Simple Spark example of generating table stats for use of data quality checks☆28Apr 28, 2017Updated 8 years ago
- Deploying a simple, customized Flask API in Python via Google App Engine☆13Aug 20, 2017Updated 8 years ago
- ☆11Aug 14, 2014Updated 11 years ago
- Camus Compressor merges files created by Camus and saves them in a compressed format.☆13Mar 20, 2023Updated 3 years ago
- An app built on Cloudera Enterprise for tracking metrics of jobs that run in the YARN framework☆13Feb 5, 2016Updated 10 years ago
- Opinionated CNCF-based, Docker Compose setup for everything needed to develop a 12factor app☆18Feb 23, 2022Updated 4 years ago
- Examples for ETL Integrations with Adobe Experience Platform☆14Aug 16, 2024Updated last year
- A lightweight mapping framework that maps data objects to a number of nodes, subject to constraints☆93Mar 16, 2017Updated 9 years ago
- Delta Lake Examples☆11Apr 24, 2020Updated 5 years ago
- General utility code used across BDG products. Apache 2 licensed.☆18May 6, 2025Updated 10 months ago
- This is an example of real time stream processing using Spark Streaming, Kafka & Elasticsearch.☆40Aug 31, 2016Updated 9 years ago
- Kafka to Avro Writer based on Apache Beam. It's a generic solution that reads data from multiple Kafka topics and stores it in cloud s…☆25Apr 7, 2021Updated 4 years ago
- Java implementation of the SCRAM SASL for both server and client plus examples☆17Apr 18, 2021Updated 4 years ago
- APN Designations template folder structure and presentation, including APN Competency Program and APN Service Delivery Program☆22Feb 11, 2025Updated last year
- This library enables the use of ZooKeeper as the cluster coordinator in a ConstructR-based cluster☆12Dec 2, 2017Updated 8 years ago
- ☆243Jun 14, 2018Updated 7 years ago
- Remedy small files by combining them into larger ones.☆195Jul 1, 2022Updated 3 years ago
- An Apache Spark app for moving data between Apache Hive and Apache Phoenix/HBase☆14Mar 23, 2016Updated 9 years ago