lintool / my-data-is-bigger-than-your-data
My data is bigger than your data!
☆39Updated 5 years ago
Alternatives and similar repositories for my-data-is-bigger-than-your-data
Users who are interested in my-data-is-bigger-than-your-data are comparing it to the libraries listed below.
- A java library for stored queries☆16Updated last year
- ☆49Updated 7 years ago
- Embedded Kafka for testing and quick prototyping.☆14Updated 9 years ago
- Spooker is a dynamic framework for processing high volume data streams via processing pipelines☆29Updated 9 years ago
- A Cascading Workflow Visualizer☆83Updated 2 years ago
- A template-based cluster provisioning system☆61Updated 2 years ago
- Keynote for QCon SF 2015!☆38Updated 8 years ago
- Java and Scala client libraries for Concord☆13Updated 8 years ago
- Atomix Jepsen tests☆14Updated 8 years ago
- ☆43Updated 3 years ago
- HBase as a JSON Document Database☆25Updated last year
- Muppet☆126Updated 4 years ago
- NuCypher for Kafka. Start building from this module (it fetches the appropriate branch from Kafka repository)☆18Updated 7 years ago
- Cascading on Apache Flink®☆54Updated last year
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible.☆51Updated 7 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor…☆41Updated 2 years ago
- A framework to benchmark different graph databases, based on generated data from customizable schema, distribution, and size.☆25Updated 6 years ago
- Generates 27-character, time-ordered, k-sortable, URL-safe, globally unique identifiers.☆26Updated 6 years ago
- Spash☆24Updated 9 years ago
- A/B experiments service☆33Updated 3 weeks ago
- Time series analysis with Apache Spark based on Chronix☆38Updated 8 years ago
- HDFS-compatible distributed filesystem backed by Cassandra☆25Updated 9 years ago
- Port of Twitter's Scala JVM-profiler to Java☆15Updated 2 years ago
- A prototype of Hive UDFs/UDTFs that execute nested SQL queries within rows.☆54Updated 9 years ago
- Github mirror of "analytics/kafkatee" - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access…☆21Updated last year
- A tool that records instantaneous Linux load (runnable thread count) in 1 msec intervals and logs it in jHiccup-like format☆20Updated 8 years ago
- Cantor provides utilities for estimating the cardinality of large sets.☆83Updated 3 years ago
- Examples of user defined functions for Apache Drill☆18Updated 8 years ago
- A scalable, distributed Time Series Database.☆28Updated 10 years ago
- Embedded PostgreSQL server for use in tests☆9Updated 4 years ago