lintool / my-data-is-bigger-than-your-data
My data is bigger than your data!
☆39 · Updated 6 years ago
Alternatives and similar repositories for my-data-is-bigger-than-your-data
Users who are interested in my-data-is-bigger-than-your-data are comparing it to the repositories listed below.
- A Cascading Workflow Visualizer ☆83 · Updated 2 years ago
- ☆49 · Updated 8 years ago
- @MissAmyTobey Writes ☆49 · Updated 2 years ago
- Cantor provides utilities for estimating the cardinality of large sets. ☆83 · Updated 3 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 · Updated 2 years ago
- A distributed queue built on Cassandra ☆51 · Updated 9 years ago
- Cascading on Apache Flink® ☆54 · Updated last year
- Supporting material (code, schemas etc) for Unified Log Processing (Manning Publications) ☆98 · Updated 3 years ago
- Myria is a scalable Analytics-as-a-Service platform based on relational algebra. ☆117 · Updated 3 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams. ☆63 · Updated 9 years ago
- NuCypher for Kafka. Start building from this module (it fetches the appropriate branch from Kafka repository) ☆18 · Updated 7 years ago
- Timberlake is a Job Tracker for Hadoop. ☆177 · Updated 5 years ago
- A/B experiments service ☆34 · Updated 4 months ago
- Compare eventual consistency of object stores ☆174 · Updated last year
- Apache YARN cluster Docker image ☆35 · Updated 7 years ago
- This project allows running Samza jobs on a Mesos cluster ☆43 · Updated 4 years ago
- A nozzle to spray a kafka topic at an HTTP endpoint. This project is deprecated and not maintained. ☆49 · Updated 5 years ago
- Keynote for QCon SF 2015! ☆38 · Updated 9 years ago
- A tutorial that explains how to build a simple distributed fault-tolerant framework on top of Mesos ☆47 · Updated 2 years ago
- Muppet ☆127 · Updated 4 years ago
- Query testing framework ☆71 · Updated 2 months ago
- ☆74 · Updated 6 years ago
- Integration of Samza and Luwak ☆100 · Updated 10 years ago
- An integration framework that allows you to run and manage CrateDB via Apache Mesos. ☆23 · Updated 6 years ago
- Probabilistic data structures for Guava. ☆54 · Updated 4 years ago
- All development now happens over here: https://github.com/cwensel/cascading. Cascading is a feature rich API for defining and executing c… ☆332 · Updated 6 years ago
- A gRPC service which proxies requests to an HTTP server. ☆25 · Updated 7 years ago
- Explorations relative to cloning FlumeJava ☆93 · Updated 4 years ago
- Fabric-based framework for deploying and managing SolrCloud clusters in the cloud. ☆90 · Updated 6 years ago
- A simple tool that finds serious bugs in Java exception handlers ☆125 · Updated 8 years ago