PeachstoneIO / peachbox
Python-based data warehouse solution for the Lambda Architecture.
☆14Updated 9 years ago
Alternatives and similar repositories for peachbox
Users that are interested in peachbox are comparing it to the libraries listed below
- Task Orchestration Tool Based on SWF and boto3☆38Updated 6 years ago
- Utilities and examples to assist in working with PySpark and Cassandra.☆36Updated 10 years ago
- Data science repo to help others☆12Updated 9 years ago
- S3 backed ContentsManager for jupyter notebooks☆14Updated 9 years ago
- A Python library for dealing with splittable files☆42Updated 5 years ago
- Docker container for the latest prediction.io version with most recent dependencies☆11Updated 8 years ago
- Open source analytics platform powered by Apache Cassandra, Spark, and Kafka☆34Updated 10 years ago
- This project provides sequential pattern mining for Apache Spark. The algorithms are based on the work of Philippe Fournier-Viger and co…☆30Updated 10 years ago
- Dockerfiles for dockerhub automated build.☆7Updated 6 years ago
- workflow support for reproducible deduplication and merging☆16Updated last year
- Apache Zeppelin on Kubernetes.☆28Updated 6 years ago
- SQLAlchemy models and DDL and ERD generation from chop-dbhi/data-models style JSON endpoints.☆11Updated 2 years ago
- Luigi Plugin for Hubot☆36Updated 8 years ago
- This project contains the code to translate between Apache Spark and SFrame.☆20Updated 8 years ago
- JSON -> Relational DB Column Types☆63Updated 2 years ago
- PMML evaluator library for the Apache Hive data warehouse software (legacy codebase)☆13Updated 10 years ago
- Apache Nutch fork tuned for web services and data discovery.☆10Updated 10 years ago
- A DC/OS time series demo☆62Updated 9 years ago
- A place for all things Pivotal & R☆25Updated 3 years ago
- Example of processing Kafka messages via Storm with Python ShellBolts☆11Updated 10 years ago
- Real time and offline time series analysis with Spark, Spark Streaming and Storm☆21Updated 4 years ago
- Probabilistic Data Structures in Python (originally presented at PyData 2013)☆55Updated 3 years ago
- Sample repo for luigi tasks & config☆36Updated 9 years ago
- An implementation of the multi-armed bandit optimization pattern as a Flask extension☆81Updated last week
- Python bindings for Matroid API☆16Updated 5 months ago
- Proposals for new Jupyter subprojects to enter into incubation☆18Updated 4 years ago
- ☆12Updated 10 years ago
- Functional Airflow DAG definitions.☆38Updated 7 years ago
- A Python wrapper for MADlib (http://madlib.net) - an open source library for scalable in-database machine learning algorithms☆63Updated 4 years ago
- Latency numbers every data scientist should know (aka the pyramid of analytical tasks) - the order of magnitude of computational time for…☆20Updated 8 years ago