asavinov / bistro
A general-purpose data analysis engine radically changing the way batch and stream data is processed
☆7 Updated 6 years ago
Alternatives and similar repositories for bistro:
Users who are interested in bistro are comparing it to the libraries listed below.
- Time series analysis with Apache Spark based on Chronix ☆38 Updated 8 years ago
- A blazing fast ACID compliant NoSQL DataLake with support for storing 17 formats of data. Full SQL and DML capabilities along with Java s… ☆175 Updated last year
- Feature engineering and machine learning: together at last! ☆24 Updated 4 years ago
- The Chronix Server implementation that is based on Apache Solr. ☆265 Updated 5 years ago
- Arc is an opinionated framework for defining data pipelines which are predictable, repeatable and manageable. ☆169 Updated last year
- UI for interactive data analysis | https://snorkel.logv.org ☆163 Updated last year
- Wayeb is a Complex Event Processing and Forecasting (CEP/F) engine written in Scala. ☆149 Updated last year
- ☆27 Updated 2 years ago
- Myria is a scalable Analytics-as-a-Service platform based on relational algebra. ☆113 Updated 3 years ago
- Go implementation of MIDAS: Microcluster-Based Detector of Anomalies in Edge Streams ☆188 Updated 4 years ago
- Quickly detect already witnessed data. ☆158 Updated 8 months ago
- A collection of datasets and databases ☆24 Updated 6 years ago
- opensource distributed database with base JPA implementation and event processing support ☆74 Updated 11 months ago
- A simple data consistency checker ☆30 Updated 8 years ago
- invesdwin-context modules that provide persistence features ☆43 Updated this week
- A column oriented, embarrassingly distributed relational event database. ☆240 Updated 6 years ago
- A totally proof-of-concept FoundationDB based network block device backend ☆115 Updated 6 years ago
- Probabilistic data structures server. The data model is key-value, where values are: Bloomfilters, LinearCounters, HyperLogLogs, CountMin… ☆25 Updated 9 years ago
- A platform for real-time streaming search ☆103 Updated 9 years ago
- Consus is a geo-replicated transactional key-value store. ☆225 Updated 6 years ago
- A Directed Acyclic Graph task dependency scheduler designed to simplify complex distributed pipelines ☆131 Updated 6 years ago
- Tools for working with parquet, impala, and hive ☆134 Updated 4 years ago
- Query engine for TrailDB ☆51 Updated 6 years ago
- Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics. ☆110 Updated 2 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 Updated 2 years ago
- A decisioning and response platform ☆70 Updated 3 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams. ☆63 Updated 8 years ago
- Doradus is a REST service that extends a Cassandra NoSQL database with a graph-based data model, advanced indexing and search features, a… ☆204 Updated 9 years ago
- Use SQL to transform your avro schema/records ☆28 Updated 7 years ago
- A highly configurable Google Cloud Dataflow pipeline that writes data into Google Big Query table from Pub/Sub ☆67 Updated 6 years ago