ExpediaGroup / corc
An ORC File Scheme for the Cascading data processing platform.
☆14 Updated 4 years ago
Alternatives and similar repositories for corc
Users who are interested in corc are comparing it to the libraries listed below.
- Collection of utilities to allow writing java code that operates across a wide range of avro versions. ☆86 Updated 2 months ago
- Measure behavior of Java applications ☆42 Updated 4 years ago
- Fast Apache Avro serialization/deserialization library ☆46 Updated 3 weeks ago
- Hadoop output committers for S3 ☆113 Updated 5 years ago
- A library for strong, schema based conversion between 'natural' JSON documents and Avro ☆18 Updated last year
- Profiler for large-scale distributed java applications (Spark, Scalding, MapReduce, Hive,...) on YARN. ☆128 Updated 7 years ago
- High performance native memory access for Java. ☆128 Updated 8 months ago
- DDSketch: A Fast and Fully-Mergeable Quantile Sketch with Relative-Error Guarantees. ☆127 Updated 4 months ago
- A unit testing framework for the Cascading data processing platform. ☆25 Updated 4 years ago
- ☆13 Updated 7 years ago
- The Schema Repo is a RESTful web service for storing and serving mappings between schema identifiers and schema definitions. ☆154 Updated 3 years ago
- Cache File System optimized for columnar formats and object stores ☆187 Updated 3 years ago
- ☆175 Updated 4 years ago
- Druid indexing plugin for using Spark in batch jobs ☆101 Updated 4 years ago
- Port of TPC-H dbgen to Java ☆52 Updated last year
- Quark is a data virtualization engine over analytic databases. ☆100 Updated 8 years ago
- A Scalable Concurrent Key-Value Map for Big Data Analytics ☆274 Updated 2 years ago
- Java library for efficiently working with heap and off-heap memory ☆514 Updated this week
- A JVMTI agent that attaches to your JVM and kills it when things go sideways ☆177 Updated last year
- HyperLogLog (original and hyperloglog++) algorithm implementation in java. ☆81 Updated 4 years ago
- Large off-heap arrays and mmap files for Scala and Java ☆405 Updated 3 years ago
- Big Data Toolkit for the JVM ☆146 Updated 5 years ago
- ☆241 Updated 4 years ago
- An In-Memory Cache Backed by Apache Kafka ☆255 Updated last month
- Java library for the HyperLogLog algorithm ☆316 Updated 8 years ago
- Fast Approximate Membership Filters (Java) ☆259 Updated 2 weeks ago
- Tools to work with off-heap memory using sun.misc.Unsafe ☆136 Updated 8 years ago
- Euphoria is an open source Java API for creating unified big-data processing flows. It provides an engine independent programming model w… ☆82 Updated 3 years ago
- Tools for parsing, creating and doing other fun stuff with sstables ☆163 Updated 8 years ago
- Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or what… ☆96 Updated 6 years ago