mladkov / spark-kudu-up-and-running
Spark on Kudu up and running samples
☆10 Updated 8 years ago
Alternatives and similar repositories for spark-kudu-up-and-running:
Users who are interested in spark-kudu-up-and-running are comparing it to the repositories listed below:
- ☆8 Updated 6 years ago
- Flume module that writes incoming events directly to OpenTSDB. ☆10 Updated 8 years ago
- A light Kafka to HDFS/S3 ETL library based on Apache Spark ☆41 Updated 7 years ago
- Apache Flink as a Cloudera Manager Service ☆12 Updated 8 years ago
- A small project to show how to add lineage to Atlas when using Spark as an ETL tool ☆12 Updated 8 years ago
- Spark example using Phoenix to interact with HBase ☆16 Updated 8 years ago
- Reads an HBase table and writes the output as Text, Seq, Avro, or Parquet ☆28 Updated 10 years ago
- Foodmart data set in MySQL format ☆10 Updated last year
- ElasticSearch integration for Apache Spark ☆47 Updated 8 years ago
- High performance HBase / Spark SQL engine ☆28 Updated 2 years ago
- Ambari service for Apache Drill ☆17 Updated 8 years ago
- A sink to save Spark Structured Streaming DataFrame into Hive table ☆23 Updated 6 years ago
- Ambari stack service for easily installing and managing Solr on HDP cluster ☆19 Updated 6 years ago
- Schema Registry integration for Apache Spark ☆40 Updated 2 years ago
- Kafka, Spark Streaming, Kudu integration examples ☆17 Updated 7 years ago
- Prescriptive Applications over Kite and Hadoop ☆12 Updated 9 years ago
- MySQL to NoSQL real time dataflow ☆18 Updated 7 years ago
- Sample Spark Streaming application for secure consumption from Kafka ☆33 Updated 7 years ago
- Using Spark SQLContext, HiveContext & Spark DataFrames API with ElasticSearch, Cassandra & MongoDB ☆22 Updated 8 years ago
- Real-time analytics in Apache Flume ☆52 Updated 9 years ago
- Cascading on Apache Flink® ☆54 Updated last year
- Collection of generic Apache Flink operators ☆17 Updated 7 years ago
- Ansible playbook for automated HDP 2.x deployment install with Kerberos ☆19 Updated 8 years ago
- Flink Examples ☆39 Updated 8 years ago
- Open-source distributed workflow scheduling tool that also supports streaming tasks. ☆38 Updated 7 years ago
- Quickly deploy Hadoop with the help of Ansible and Apache Ambari ☆37 Updated 9 years ago
- The code for the in-memory data pipeline that was presented at Berlin Buzzwords 2015. ☆10 Updated 9 years ago
- A library for financial and time series calculations on Apache Spark ☆28 Updated 9 years ago
- ☆11 Updated 9 years ago
- Scalable CDC Pattern Implemented using PySpark ☆18 Updated 5 years ago