Hadoop utility to compact small files
☆18Feb 16, 2026Updated 3 months ago
Alternatives and similar repositories for datasqueeze
Users who are interested in datasqueeze are comparing it to the libraries listed below.
- A service which allows Hive Metastore Listeners to be deployed outside of the Hive Metastore Service☆13Mar 26, 2026Updated last month
- Remedy small files by combining them into larger ones.☆195Jul 1, 2022Updated 3 years ago
- Insights Explorer is a tool to catalogue and present analytical & research work.☆13Nov 26, 2024Updated last year
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.☆93Mar 5, 2024Updated 2 years ago
- HDFS file management tool: batch file decompression, compression, and small-file merging☆25Feb 2, 2024Updated 2 years ago
- Demonstration of a Hive Input Format for Iceberg☆26Mar 12, 2021Updated 5 years ago
- Service for automatically managing and cleaning up unreferenced data☆50Apr 24, 2026Updated 3 weeks ago
- Mutation testing framework and code coverage for Hive SQL☆24May 11, 2021Updated 5 years ago
- Apiary provides modules which can be combined to create a federated cloud data lake☆37Apr 3, 2024Updated 2 years ago
- Oxia Java client SDK☆19Updated this week
- ☆14Oct 17, 2022Updated 3 years ago
- Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.☆286Feb 24, 2026Updated 2 months ago
- File compaction tool that runs on top of the Spark framework.☆59Apr 17, 2019Updated 7 years ago
- Pulsar consumer clients offering priority consumption☆12Mar 17, 2023Updated 3 years ago
- Chinese translation of the Flume User Guide☆12Dec 4, 2023Updated 2 years ago
- Go template implementation in Java☆17Updated this week
- DBImport ingestion tool. Handle import, export and standard ETL flows in Hadoop/Hive☆19Feb 17, 2026Updated 3 months ago
- A lightweight message queue for Java that requires no dedicated queue server. Just a Redis server.☆37Sep 9, 2021Updated 4 years ago
- Arduino library for the Microchip MCP4261☆19Jan 6, 2022Updated 4 years ago
- ☆17May 5, 2018Updated 8 years ago
- QTag: Turbocharge Your SQL Comments☆12Jan 30, 2025Updated last year
- This project keeps the patches that were submitted to open-source communities☆16Oct 22, 2017Updated 8 years ago
- A unit testing framework for the Cascading data processing platform.☆25Aug 25, 2021Updated 4 years ago
- Fast, reliable, and scalable channels implementation based on Redis streams.☆11Jun 25, 2024Updated last year
- A library to support building a coherent set of flink jobs☆17Oct 5, 2024Updated last year
- jdbc2 datasource with support for DUPLICATE KEY increment☆19Nov 25, 2020Updated 5 years ago
- An ORC File Scheme for the Cascading data processing platform.☆14Aug 26, 2021Updated 4 years ago
- A library for strong, schema based conversion between 'natural' JSON documents and Avro☆18Mar 5, 2024Updated 2 years ago
- ☆25Oct 18, 2021Updated 4 years ago
- ☆15Mar 31, 2026Updated last month
- Exposes Redis stream through the command line☆12Jun 28, 2022Updated 3 years ago
- Utility functions for dbt projects running on Trino☆22Dec 13, 2023Updated 2 years ago
- An exploration of Flink and change-data-capture via flink-cdc-connectors☆11Jul 7, 2021Updated 4 years ago
- Remedy small files by combining them into larger ones.☆23Oct 31, 2018Updated 7 years ago
- Code for the book - Practical Redis☆18Jan 5, 2019Updated 7 years ago
- NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.☆122Nov 25, 2025Updated 5 months ago
- Scala HTTP/SOCKS proxy library, based on akka-streams☆10Nov 3, 2018Updated 7 years ago
- phoenix☆12Oct 4, 2022Updated 3 years ago
- Hortonworks Data Platform Data Generation Tool☆13Nov 30, 2017Updated 8 years ago