nathanmarz / cascading-batch-query
Optimized joins using bloom filters on Hadoop via Cascading.
☆23Updated 15 years ago
Alternatives and similar repositories for cascading-batch-query
Users who are interested in cascading-batch-query are comparing it to the libraries listed below.
- Apache Calcite Tutorial☆33Updated 8 years ago
- SQL Windowing Functions for Hadoop☆65Updated 2 years ago
- Fast JVM collection☆60Updated 10 years ago
- Idempotent query executor☆51Updated last month
- A library for strong, schema based conversion between 'natural' JSON documents and Avro☆18Updated last year
- Bloofi: A java implementation of multidimensional Bloom filters☆79Updated 9 years ago
- Cascading on Apache Flink®☆54Updated last year
- Mirror of Apache HCatalog☆59Updated 2 years ago
- UberScriptQuery, a SQL-like DSL to make writing Spark jobs super easy☆61Updated last year
- Helpful user-defined functions / table-generating functions for Hive☆101Updated 9 years ago
- Stratosphere is now Apache Flink.☆197Updated last year
- Mirror of Apache DirectMemory☆52Updated last year
- Explorations relative to cloning FlumeJava☆93Updated 4 years ago
- JDBC driver that converts any INSERT, UPDATE and DELETE statements into append-only INSERTs. Instead of updating rows in-place it inserts…☆80Updated 8 years ago
- Flink performance tests☆20Updated 9 years ago
- Quark is a data virtualization engine over analytic databases.☆98Updated 7 years ago
- Distributed Java Collections for ZooKeeper☆110Updated 8 years ago
- Secondary sort and streaming reduce for Apache Spark☆78Updated last year
- Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster.☆350Updated last month
- Benchmark suite for data compression library on the JVM☆217Updated 11 months ago
- Simple Samza Job Using Confluent Platform☆14Updated 9 years ago
- SamzaSQL: Streaming SQL implementation on top of Apache Samza and Apache Kafka☆29Updated 8 years ago
- ☆13Updated 7 years ago
- Provides a SQL interface to your TinkerPop enabled graph db☆74Updated last year
- A bunch of utility classes for Java, Hadoop, HBase, Pig, etc.☆76Updated 11 years ago
- Schema and type system for creating sortable byte[]☆46Updated 12 years ago
- Realtime Analytics☆41Updated 13 years ago
- A sink to save Spark Structured Streaming DataFrame into Hive table☆23Updated 7 years ago
- Port of TPC-H dbgen to Java☆50Updated 7 months ago
- XPath likeness for Avro☆35Updated 2 years ago