googlegenomics / dataflow-java
Google Cloud Dataflow pipelines such as Identity-By-State as well as useful utility classes.
☆36, updated last year
Related projects:
- Spark pipelines that correspond to a series of Dataflow examples. ☆27, updated 5 years ago
- Apache Spark jobs such as Principal Coordinate Analysis. ☆74, updated 7 years ago
- Muppet ☆126, updated 3 years ago
- Apache Beam Site ☆29, updated last week
- Dockerflow is a workflow runner that uses Dataflow to run a series of tasks in Docker with the Pipelines API ☆97, updated 6 years ago
- Efficient, distributed downloads of large files from S3 to HDFS using Spark. ☆17, updated 7 years ago
- Spooker is a dynamic framework for processing high-volume data streams via processing pipelines ☆29, updated 8 years ago
- Spark In MapReduce (SIMR): launching Spark applications on existing Hadoop MapReduce infrastructure ☆45, updated 2 years ago
- A package full of linear algebra operators for Apache Spark MLlib's linalg package ☆10, updated 9 years ago
- Cascading on Apache Flink® ☆54, updated 7 months ago
- Starter project for building MemSQL Streamliner Pipelines ☆32, updated 7 years ago
- Apache Incubator Proposal for Heron ☆22, updated 8 years ago
- Processing Logs at Scale using Cloud Dataflow ☆61, updated 5 years ago
- Scriptable scheduler for periodical Hadoop workflows ☆22, updated 6 years ago
- Miscellaneous functionality for manipulating Apache Spark RDDs. ☆22, updated 5 years ago
- Examples of user defined functions for Apache Drill ☆19, updated 7 years ago
- Visualize (.avdl and .proto format) schema files as a UML diagram using Graphviz ☆29, updated 6 years ago
- Dockerfile for Apache Zeppelin ☆17, updated 8 years ago
- Experiments with the GDELT dataset and Cassandra schemas. ☆25, updated 8 years ago
- Accompanying repository for FIS/SunGard's whitepaper on using the Dataflow SDK to transform options market data ☆22, updated 8 years ago
- CDAP Applications ☆43, updated 6 years ago
- Example code for building your own MemSQL Streamliner Pipelines ☆23, updated 7 years ago
- Secure Cloud Object REsource: file transfer microservice ☆18, updated last week
- Mirror of Apache MRQL (Incubating) ☆17, updated 7 years ago
- Cloud Pub/Sub sample applications with Java ☆50, updated 4 years ago