spotify / crunch-lib
Useful reusable pipeline components for Crunch jobs
☆27 Updated 10 years ago
Alternatives and similar repositories for crunch-lib
Users who are interested in crunch-lib are comparing it to the libraries listed below.
- An Apache Storm IMetricsConsumer that forwards Storm's built-in metrics to a Graphite server for real-time graphing, visualization, and o… ☆76 Updated 2 years ago
- The Schema Repo is a RESTful web service for storing and serving mappings between schema identifiers and schema definitions. ☆154 Updated 3 years ago
- Delimited file loader for Cassandra ☆198 Updated 6 years ago
- Simple JVM Profiler Using StatsD and Other Metrics Backends ☆333 Updated 2 years ago
- Hadoop mapreduce job to bulk load data into Cassandra ☆76 Updated 3 years ago
- hRaven collects run time data and statistics from MapReduce jobs in an easily queryable format ☆127 Updated 3 years ago
- Hadoop output committers for S3 ☆111 Updated 5 years ago
- kafka-connect-s3 : Ingest data from Kafka to Object Stores(s3) ☆95 Updated 6 years ago
- [PROJECT IS NO LONGER MAINTAINED] Wirbelsturm is a Vagrant and Puppet based tool to perform 1-click local and remote deployments, with a … ☆329 Updated 3 years ago
- Tools for parsing, creating and doing other fun stuff with sstables ☆163 Updated 8 years ago
- Low level integration of Spark and Kafka ☆130 Updated 7 years ago
- Metrics produced to Kafka and consumers for monitoring them ☆101 Updated 10 years ago
- Google Dataflow Runner for Apache Flink™ (deprecated; please use the up-to-date Beam Runner) ☆88 Updated 9 years ago
- metrics-datadog ☆187 Updated last year
- Cassandra schema migration tool for java ☆99 Updated 3 years ago
- Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ. ☆445 Updated last month
- Remedy small files by combining them into larger ones. ☆194 Updated 3 years ago
- production heap profiling for the JVM. compatible with google-perftools. ☆396 Updated 9 years ago
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what… ☆96 Updated 5 years ago
- Documentation tool for Avro schemas ☆150 Updated 5 years ago
- Unit test framework for hive and hive-service ☆64 Updated 3 years ago
- Storm Cassandra Integration ☆181 Updated last year
- Cassandra Dataset Manager ☆32 Updated 9 years ago
- Software to run automated repairs of cassandra ☆235 Updated 7 years ago
- ☆76 Updated 10 years ago
- Generates more or less realistic log data for testing simple aggregation queries. ☆261 Updated last year
- Multidimensional data storage with rollups for numerical data ☆267 Updated last week
- Call Me Maybe: simulating network partitions in DBs ☆20 Updated 9 years ago
- reactive kafka client ☆160 Updated 5 years ago
- A reasonably complete implementation of the Universal Scalability Law model. ☆202 Updated 6 years ago