sciabarra / BigDataDevKit
Big Data Development Kit (Hadoop / Spark / Zeppelin / IntelliJ)
☆22 · Updated 9 years ago
Alternatives and similar repositories for BigDataDevKit:
Users who are interested in BigDataDevKit are comparing it to the repositories listed below:
- TensorFlow Processor for Spring Cloud Dataflow ☆24 · Updated 7 years ago
- Geo-Located Data: Extracting Patterns from Mobile Data using Scikit-Learn and Cassandra ☆29 · Updated 6 years ago
- ☆9 · Updated 9 years ago
- A collection of datasets and databases ☆24 · Updated 6 years ago
- Starter project for building MemSQL Streamliner Pipelines ☆32 · Updated 8 years ago
- AWS Lambda and Java version of Eventuate Todo list application ☆26 · Updated 7 years ago
- Examples of Integrating Spark Streaming, Flume, and HBase to solve Streaming problems ☆19 · Updated 11 years ago
- swblocks-jbl library is a set of core Java utilities based on Java 8 which provides a set of core error handling tools and additional ut… ☆12 · Updated 4 years ago
- Kubernetes demos ☆16 · Updated 7 years ago
- A template-based cluster provisioning system ☆61 · Updated 2 years ago
- ☆21 · Updated 9 years ago
- Example code for building your own MemSQL Streamliner Pipelines ☆23 · Updated 8 years ago
- A short kickstart project for working with Open Distro for Elasticsearch in a practical way. Load in podcast data from The Dollop and ana… ☆12 · Updated 4 years ago
- Real time and offline time series analysis with Spark, Spark Streaming and Storm ☆21 · Updated 4 years ago
- A scalable, distributed Time Series Database. ☆28 · Updated 10 years ago
- Amazon Elastic MapReduce code samples ☆63 · Updated 9 years ago
- Use cases built on SnappyData. Use cases contained here: 1. Ad Analytics 2. Streaming data ingestion from RabbitMQ. ☆32 · Updated 2 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 · Updated 2 years ago
- AWS Big Data Certification ☆25 · Updated 3 months ago
- Cloudformation and SQL scripts used to replicate a POC environment from the "Data Lake to Data Warehouse: Enhancing Customer 360 with Ama… ☆31 · Updated 5 years ago
- machine learning playground ☆12 · Updated 8 years ago
- ☆7 · Updated 9 years ago
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark ☆13 · Updated 2 years ago
- ☆10 · Updated 8 years ago
- Examples of all Machine Learning Algorithms in Apache Spark ☆15 · Updated 7 years ago
- Mirror of Apache Beam ☆10 · Updated 4 years ago
- Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information. ☆16 · Updated 9 months ago
- Herd-UI is a search and discovery tool for business and technical users. Everyone in your organization can use Herd-UI to browse and unde… ☆16 · Updated 2 years ago
- A DC/OS time series demo ☆62 · Updated 9 years ago
- 4-day deep dive of docker + kubernetes ☆33 · Updated 8 years ago