jackghm / VerticaLinks
All things Vertica
☆62 · Updated 10 years ago
Alternatives and similar repositories for Vertica
Users who are interested in Vertica compare it to the repositories listed below.
- Vertica Kit ☆69 · Updated 10 years ago
- ☆24 · Updated 9 years ago
- User Defined Extensions (UDX) to the Vertica Analytic Database ☆119 · Updated 2 years ago
- File compaction tool that runs on top of the Spark framework. ☆59 · Updated 6 years ago
- Fork of Cloudera Impala separated from Hadoop ☆42 · Updated 8 years ago
- kafka-connect-s3 : Ingest data from Kafka to Object Stores (s3) ☆94 · Updated 6 years ago
- Spark SQL index for Parquet tables ☆134 · Updated 4 years ago
- Ephemeral Hadoop clusters using Google Compute Platform ☆136 · Updated 3 years ago
- Large scale query engine benchmark ☆99 · Updated 9 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆72 · Updated 4 years ago
- Spark package for checking data quality ☆221 · Updated 5 years ago
- Complete Pipeline Training at Big Data Scala By the Bay ☆71 · Updated 9 years ago
- Google Dataflow Runner for Apache Flink™ (deprecated; please use the up-to-date Beam Runner) ☆88 · Updated 8 years ago
- Scripts for generating Grafana dashboards for monitoring Spark jobs ☆242 · Updated 10 years ago
- An open-source, vendor-neutral data context service. ☆159 · Updated 7 years ago
- A super simple utility for testing Apache Hive scripts locally for non-Java developers. ☆72 · Updated 8 years ago
- An example of using Avro and Parquet in Spark SQL ☆60 · Updated 9 years ago
- A slightly moist lipstick-on-pig clone for Apache Hive ☆23 · Updated last year
- Hadoop output committers for S3 ☆109 · Updated 4 years ago
- Enabling Spark Optimization through Cross-stack Monitoring and Visualization ☆47 · Updated 7 years ago
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what… ☆96 · Updated 5 years ago
- Kafka sink connector for streaming messages to PostgreSQL ☆91 · Updated 4 years ago
- Quark is a data virtualization engine over analytic databases. ☆98 · Updated 7 years ago
- A Tez dev-setup for HDP2 sandbox ☆21 · Updated 2 years ago
- Interactive Audience Analytics with Spark and HyperLogLog ☆55 · Updated 9 years ago
- 4mc - splittable lz4 and zstd in hadoop/spark/flink ☆109 · Updated 2 years ago
- Cantor provides utilities for estimating the cardinality of large sets. ☆83 · Updated 3 years ago
- Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds. ☆88 · Updated last year
- Scripts for parsing / making sense of yarn logs ☆52 · Updated 8 years ago
- The Schema Repo is a RESTful web service for storing and serving mappings between schema identifiers and schema definitions. ☆156 · Updated 2 years ago