sailthru / stolos
A Directed Acyclic Graph task dependency scheduler designed to simplify complex distributed pipelines
☆130 · Updated 6 years ago
Related projects
Alternatives and complementary repositories for stolos
- Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics. ☆109 · Updated 2 years ago
- Streaming estimation of percentiles, especially high percentiles. ☆63 · Updated 12 years ago
- Tools for working with parquet, impala, and hive ☆134 · Updated 3 years ago
- C network daemon for HyperLogLogs ☆449 · Updated 3 years ago
- High Throughput Real-time Stream Processing Framework ☆284 · Updated 7 years ago
- Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python ☆247 · Updated last year
- A general-purpose data analysis engine radically changing the way batch and stream data is processed ☆7 · Updated 6 years ago
- C++ native client for Impala and Hive, with Python / pandas bindings ☆73 · Updated 6 years ago
- ☆110 · Updated 7 years ago
- Sparrow scheduling platform (U.C. Berkeley). ☆319 · Updated 4 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams. ☆62 · Updated 8 years ago
- A Real-Time Analytical Processing (RTAP) example using Spark/Shark ☆51 · Updated 10 years ago
- Compilation and rule-based optimization framework for relational algebra. Raco is the language, optimization, and query translation layer… ☆72 · Updated 6 years ago
- A platform for real-time streaming search ☆103 · Updated 8 years ago
- Simulating the performance of various streaming algorithms. #experimentalmathematics ☆59 · Updated 6 years ago
- Luigi Plugin for Hubot ☆35 · Updated 8 years ago
- A column oriented, embarrassingly distributed relational event database. ☆240 · Updated 6 years ago
- Myria is a scalable Analytics-as-a-Service platform based on relational algebra. ☆113 · Updated 3 years ago
- pesos is a pure python implementation of the mesos framework api ☆47 · Updated 9 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆51 · Updated 7 years ago
- ☆92 · Updated 9 years ago
- Quark is a data virtualization engine over analytic databases. ☆99 · Updated 7 years ago
- ☆146 · Updated 8 years ago
- Partitioned storage system based on blosc. **No longer actively maintained.** ☆153 · Updated 8 years ago