datahuborg / datahub
An experimental hosted platform (GitHub-like) for organizing, managing, sharing, collaborating, and making sense of data.
☆211 Updated 6 years ago
Alternatives and similar repositories for datahub:
Users who are interested in datahub are comparing it to the libraries listed below:
- An open-source, vendor-neutral data context service. ☆159 Updated 6 years ago
- Myria is a scalable Analytics-as-a-Service platform based on relational algebra. ☆113 Updated 3 years ago
- ☆92 Updated 9 years ago
- MacroBase: A Search Engine for Fast Data ☆664 Updated 2 years ago
- BlinkDB: Sub-Second Approximate Queries on Very Large Data. ☆660 Updated 10 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆51 Updated 7 years ago
- Simplifying robust end-to-end machine learning on Apache Spark. ☆470 Updated 7 years ago
- Lightweight Tableau-style interface for visual analysis, built on Vega-lite. ☆371 Updated 7 years ago
- Large scale query engine benchmark ☆99 Updated 8 years ago
- ☆110 Updated 7 years ago
- ☆146 Updated 8 years ago
- Web based interactive computing environment for H2O ☆134 Updated 3 months ago
- OrpheusDB ☆85 Updated 4 years ago
- A platform for visualization and real-time monitoring of data workflows ☆1,173 Updated 5 years ago
- ☆33 Updated 10 years ago
- SDK for Turi's GraphLab Create. ☆149 Updated 7 years ago
- Enabling queries on compressed data. ☆278 Updated last year
- Quark is a data virtualization engine over analytic databases. ☆98 Updated 7 years ago
- Collecting thoughts about data versioning ☆108 Updated 5 years ago
- Apache Fluo ☆188 Updated 2 months ago
- BayesDB on SQLite. A Bayesian database table for querying the probable implications of data as easily as SQL databases query the data its… ☆927 Updated last year
- An efficient updatable key-value store for Apache Spark ☆250 Updated 7 years ago
- Scalable Machine Learning in Scalding ☆360 Updated 6 years ago
- A prototype of Hive UDFs/UDTFs that execute nested SQL queries within rows. ☆54 Updated 9 years ago
- A Python library for creating fast, repeatable and self-documenting data analysis pipelines. ☆237 Updated 11 months ago
- Warcbase is an open-source platform for managing and analyzing web archives ☆162 Updated 7 years ago
- PredictionIO Python SDK ☆196 Updated 6 years ago
- Spark implementation of the Google Correlate algorithm to quickly find highly correlated vectors in huge datasets ☆92 Updated 9 years ago
- A platform for real-time streaming search ☆103 Updated 8 years ago
- Use Pentaho's open source data integration tool (Kettle) to create Extract-Transform-Load (ETL) processes to update a Socrata open data p… ☆96 Updated 8 years ago