edwardcapriolo / filecrush
Remedy small files by combining them into larger ones.
☆194Jul 1, 2022Updated 3 years ago
Alternatives and similar repositories for filecrush
Users who are interested in filecrush are comparing it to the repositories listed below.
- Hadoop utility to compact small files☆18Mar 5, 2024Updated last year
- Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http:…☆72Jan 1, 2023Updated 3 years ago
- Remedy small files by combining them into larger ones.☆23Oct 31, 2018Updated 7 years ago
- Sample Python code for working with the HBase REST interface☆24Jul 25, 2013Updated 12 years ago
- SQL Windowing Functions for Hadoop☆65Jun 20, 2022Updated 3 years ago
- functionstest☆33Oct 25, 2016Updated 9 years ago
- NEW: see http://www.hops.io/. OLD: This work aims to re-engineer the Hadoop Distributed File System (HDFS) so that it can be 1) highly av…☆26Jan 2, 2012Updated 14 years ago
- An Ansible collection of utilities and other resources for Cloudera Platform deployments☆13Nov 13, 2025Updated 3 months ago
- A service which allows Hive Metastore Listeners to be deployed outside of the Hive Metastore Service☆13Jul 23, 2025Updated 6 months ago
- Cosine Similarity Search in ElasticSearch + FAISS GPU☆12Mar 24, 2022Updated 3 years ago
- Few things we've met during our etl project based on spark☆24Mar 22, 2018Updated 7 years ago
- This project keeps the patches that were submitted to the open source community☆16Oct 22, 2017Updated 8 years ago
- Showing the relationship between ImageNet ID and labels and pytorch pre-trained model output ID and labels☆10Oct 11, 2020Updated 5 years ago
- A wrapper for Hadoop in Scala☆42Jul 18, 2010Updated 15 years ago
- A bunch of utility classes for Java, Hadoop, HBase, Pig, etc.☆76Mar 31, 2014Updated 11 years ago
- Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.☆20Jan 11, 2018Updated 8 years ago
- A set of examples and utilities for using Pig with Cassandra. For the latest jar release, check the Downloads link.☆84Aug 21, 2014Updated 11 years ago
- Pig on Apache Spark☆82Mar 23, 2015Updated 10 years ago
- Hadoop library for large-scale data processing, now an Apache Incubator project☆582Jul 8, 2014Updated 11 years ago
- an impala client for ruby☆34Jan 25, 2017Updated 9 years ago
- ☆16Nov 8, 2015Updated 10 years ago
- Tool for gathering blocks and replicas meta data from HDFS. It also builds a heat map showing how replicas are distributed along disks an…☆55May 9, 2017Updated 8 years ago
- Single view demo☆14Feb 13, 2016Updated 10 years ago
- Kafka, Spark Streaming, Kudu integration examples☆17Dec 22, 2017Updated 8 years ago
- An app built on Cloudera Enterprise for tracking metrics of jobs that run in YARN framework☆13Feb 5, 2016Updated 10 years ago
- Tools for building, packaging, and OAP public cloud integrations such as AWS EMR, Google Dataproc and K8S.☆18Mar 27, 2024Updated last year
- Ansible playbooks for deploying Hortonworks Data Platform☆128Dec 15, 2020Updated 5 years ago
- File compaction tool that runs on top of the Spark framework.☆59Apr 17, 2019Updated 6 years ago
- A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, orga…☆2,260Jan 15, 2026Updated 3 weeks ago
- Tools for Spark which we use on a daily basis☆65Jul 2, 2020Updated 5 years ago
- https://github.com/apache/incubator-myriad is our new home. See☆253Dec 2, 2015Updated 10 years ago
- KDC for Cloudbreak provisioned Hadoop clusters☆15Aug 15, 2021Updated 4 years ago
- Collection of HDP Tuning Tricks & Tips (unofficial guide)☆17Sep 26, 2017Updated 8 years ago
- ☆16Oct 17, 2024Updated last year
- Luigi Workflow Engine integration for Treasure Data☆16May 14, 2018Updated 7 years ago
- ☆35Nov 18, 2020Updated 5 years ago
- Verify Hive SQL without running the sql exactly. Just check the syntax before run.☆24Oct 19, 2012Updated 13 years ago
- Apache Zeppelin Service for Apache Ambari Service. Installation and management of Zeppelin via Ambari.☆14Jan 23, 2016Updated 10 years ago
- ☆18Jan 17, 2025Updated last year