PeachstoneIO / peachbox
Python-based data warehouse solution for the Lambda Architecture.
☆14Updated 10 years ago
Alternatives and similar repositories for peachbox
Users that are interested in peachbox are comparing it to the libraries listed below
- Resize image on the fly using flask, zappa, pillow, opencv-python☆18Updated 7 years ago
- Utilities and examples to assist in working with PySpark and Cassandra.☆36Updated 10 years ago
- Apache Nutch fork tuned for web services and data discovery.☆10Updated 10 years ago
- Task Orchestration Tool Based on SWF and boto3☆38Updated 6 years ago
- Functional Airflow DAG definitions.☆38Updated 7 years ago
- Python SDK for working with Snowplow enriched events in Spark, AWS Lambda et al.☆21Updated 7 months ago
- Data science repo to help others☆12Updated 9 years ago
- Light python framework for AWS SWF☆8Updated 9 years ago
- Arbalest is a Python data pipeline orchestration library for Amazon S3 and Amazon Redshift. It automates data import into Redshift and ma…☆41Updated 9 years ago
- Scheduled task execution on top of AWS Data Pipeline☆43Updated 10 years ago
- Dockerfiles for dockerhub automated build.☆7Updated 6 years ago
- WaterButler is a Python web application for interacting with various file storage services via a single RESTful API, developed at Center …☆62Updated this week
- A Python library for dealing with splittable files☆42Updated 5 years ago
- Infrastructure setup.☆10Updated 5 years ago
- Real time and offline time series analysis with Spark, Spark Streaming and Storm☆21Updated 4 years ago
- This project contains the code to translate between Apache Spark and SFrame.☆20Updated 8 years ago
- Proposals for new Jupyter subprojects to enter into incubation☆18Updated 4 years ago
- Apache Spark AWS Lambda Executor (SAMBA)☆44Updated 6 years ago
- Hadoop jobs for WikiReverse project. Parses Common Crawl data for links to Wikipedia articles.☆38Updated 6 years ago
- Apache Zeppelin on Kubernetes.☆28Updated 6 years ago
- Utils around luigi.☆66Updated 4 years ago
- Python to Gremlin Graph Abstraction Layer☆55Updated 7 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead☆52Updated 6 years ago
- Luigi Plugin for Hubot☆36Updated 8 years ago
- Python binding for gumbo-parser using Cython☆14Updated 8 years ago
- Airflow plugin to transfer arbitrary files between operators☆78Updated 6 years ago
- A platform for real-time streaming search☆102Updated 9 years ago
- Example of processing Kafka messages via Storm with Python ShellBolts☆11Updated 10 years ago
- Library for building reproducible data pipelines to support experimentation☆20Updated 9 years ago
- Chef cookbook for the http://druid.io/☆10Updated 9 years ago