sciabarra / BigDataDevKit
Big Data Development Kit (Hadoop / Spark / Zeppelin / IntelliJ)
☆22 Updated 9 years ago
Alternatives and similar repositories for BigDataDevKit
Users who are interested in BigDataDevKit are comparing it to the repositories listed below.
- TensorFlow Processor for Spring Cloud Dataflow ☆24 Updated 8 years ago
- A template-based cluster provisioning system ☆61 Updated 2 years ago
- A DC/OS time series demo ☆62 Updated 9 years ago
- ☆9 Updated 9 years ago
- A Docker image for testing MemSQL + MemSQL Ops ☆68 Updated 2 years ago
- Example code for building your own MemSQL Streamliner Pipelines ☆23 Updated 8 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 Updated 2 years ago
- Kubernetes demos ☆16 Updated 7 years ago
- Very basic web app project that grabs a Twitter stream and runs it through Stanford CoreNLP ☆10 Updated 9 years ago
- Example application demonstrating how to integrate all of the components of Hortonworks DataFlow. ☆14 Updated 7 years ago
- On-demand Presto cluster with Mesos, Marathon and Docker. ☆30 Updated 7 years ago
- A few things we encountered during our Spark-based ETL project ☆24 Updated 7 years ago
- ☆24 Updated 9 years ago
- CDAP Applications ☆43 Updated 7 years ago
- AWS Lambda and Java version of Eventuate Todo list application ☆26 Updated 8 years ago
- Code and setup information for Introduction to Machine Learning with Spark ☆12 Updated 9 years ago
- Docker Image and Kubernetes Configurations for Spark 2.x ☆41 Updated 5 years ago
- Tutorials for Cascading, Lingual, Pattern and other projects ☆18 Updated 8 years ago
- Java and Scala client libraries for Concord ☆13 Updated 8 years ago
- A big data cluster management tool that creates and manages clusters of different technologies. ☆21 Updated 10 years ago
- An Operator for scheduling and executing NiFi Flows as Jobs on Kubernetes ☆53 Updated 4 years ago
- Platform documentation ☆16 Updated 9 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 Updated 8 years ago
- Real time and offline time series analysis with Spark, Spark Streaming and Storm ☆21 Updated 4 years ago
- A collection of datasets and databases ☆24 Updated 7 years ago
- Telecom scenarios implemented with streaming techniques ☆11 Updated last year
- Apache YARN cluster Docker image ☆35 Updated 7 years ago
- Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information. ☆16 Updated 10 months ago
- Terraform samples to sync and use IAM users' SSH keys to connect to EC2 instances ☆13 Updated 6 years ago
- Code examples supporting the "Introduction to Apache Spark" video published by O'Reilly Media ☆37 Updated 2 years ago