GoogleCloudPlatform / Data-Pipeline
Data Pipeline is a tool for running data-loading pipelines. It is an open-source App Engine app that users can extend to suit their own needs. Out of the box, it loads files from a source, transforms them, and then outputs them (for example, by writing to a file or loading them into a data analysis tool). It is designed to be modular and support var…
☆87 · Updated 11 years ago
Alternatives and similar repositories for Data-Pipeline
Users who are interested in Data-Pipeline are comparing it to the libraries listed below.
- Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. ☆164 · Updated 8 years ago
- Opinion Analysis of News, Threaded Conversations, and User Generated Content ☆106 · Updated last year
- Simplest way to get Tweets into BigQuery. Uses Google Cloud & App Engine, as well as Python and D3. ☆143 · Updated 9 years ago
- Example Kubernetes app that shows how to build a 'pipeline' to stream data into BigQuery. Uses Redis or Google Cloud PubSub ☆131 · Updated 5 years ago
- This service is meant to simplify running Google Cloud operations, especially BigQuery tasks. This means you do not have to worry about … ☆46 · Updated 6 years ago
- Repository with examples and smoke tests for the GCP Airflow operators and hooks ☆152 · Updated 8 years ago
- ☆84 · Updated 7 years ago
- ☆54 · Updated 8 years ago
- Google Datalab Library ☆192 · Updated 3 years ago
- makeViewerUrl ☆90 · Updated last year
- An example that shows how to periodically launch a Dataflow analytics pipeline from GAE Flex that reads from Datastore. ☆42 · Updated 8 years ago
- This is the support code and solutions for the NYC Taxi Tycoon Dataflow Codelab ☆63 · Updated 6 years ago
- *luigi-gcloud* is a luigi extension that enables full support for the Google Cloud Platform. Making it possible to do complex orchestrat… ☆43 · Updated 9 years ago
- Processing Logs at Scale using Cloud Dataflow ☆61 · Updated 6 years ago
- Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. This re… ☆167 · Updated 7 years ago
- ☆121 · Updated 10 years ago
- Simple Python client for interacting with Google BigQuery. ☆460 · Updated 4 years ago
- A curated list of all the awesome examples, articles, tutorials and videos for Apache Airflow. ☆96 · Updated 4 years ago
- Google Container Engine, JupyterHub, and Jupyter for classroom scenarios ☆59 · Updated 8 years ago
- Data and code for "Fast Data Applications with Spark and Python" ☆25 · Updated 9 years ago
- ☆144 · Updated 5 years ago
- Example that shows use of the Prediction API ☆10 · Updated 7 years ago
- An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parse… ☆90 · Updated 10 years ago
- Example stream processing job, written in Scala with Apache Beam, for Google Cloud Dataflow ☆30 · Updated 8 years ago
- Airflow workflow management platform Chef cookbook. ☆71 · Updated 6 years ago
- Replicates data between Google Cloud BigQuery projects ☆22 · Updated 9 years ago
- ☆239 · Updated 7 years ago
- Examples of how to use Cloud Bigtable both with GCE map/reduce as well as standalone applications. ☆232 · Updated 3 weeks ago
- ☆71 · Updated 10 years ago
- SQL Recipes for Web Analytics ☆34 · Updated 10 years ago