lintool / my-data-is-bigger-than-your-data
My data is bigger than your data!
☆39Updated last month
Alternatives and similar repositories for my-data-is-bigger-than-your-data
Users who are interested in my-data-is-bigger-than-your-data are comparing it to the libraries listed below.
- A Cascading Workflow Visualizer☆83Updated 2 years ago
- DEPRECATED A/B experiments service☆34Updated last month
- NuCypher for Kafka. Start building from this module (it fetches the appropriate branch from Kafka repository)☆18Updated 8 years ago
- Supporting material (code, schemas etc) for Unified Log Processing (Manning Publications)☆98Updated 3 years ago
- Integration of Samza and Luwak☆100Updated 11 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor…☆41Updated 3 years ago
- Cantor provides utilities for estimating the cardinality of large sets.☆84Updated 3 years ago
- A distributed queue built off cassandra☆51Updated 9 years ago
- ☆49Updated 8 years ago
- Cascading on Apache Flink®☆54Updated last year
- Keynote for QCon SF 2015!☆38Updated 9 years ago
- @MissAmyTobey Writes☆49Updated 3 weeks ago
- A template-based cluster provisioning system☆61Updated 2 years ago
- Muppet☆128Updated 4 years ago
- ☆74Updated 7 years ago
- Pig on Apache Spark☆82Updated 10 years ago
- Compare eventual consistency of object stores☆178Updated last year
- A nozzle to spray a kafka topic at an HTTP endpoint. This project is deprecated and not maintained.☆49Updated 6 years ago
- On demand presto cluster with mesos, marathon and docker.☆29Updated 7 years ago
- A tutorial that explains how to build a simple distributed fault-tolerant framework on top of Mesos☆47Updated 3 years ago
- Query testing framework☆72Updated last month
- Serving system for batch generated data sets☆178Updated 8 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams.☆63Updated 9 years ago
- s3mper - Consistent Listing for S3☆232Updated 2 years ago
- recordbus: mysql binlog to apache kafka☆80Updated 10 years ago
- All development now happens over here: https://github.com/cwensel/cascading. Cascading is a feature rich API for defining and executing c…☆332Updated 7 years ago
- A gRPC service which proxies requests to an HTTP server.☆25Updated 8 years ago
- A tool for running Spark on Google Compute Engine☆16Updated 9 years ago
- Explorations relative to cloning FlumeJava☆94Updated 5 years ago
- This project allows running Samza jobs on a Mesos cluster☆43Updated 4 years ago