spider-123-eng / Hive-Pig-Hbase
Hive,Pig,Hbase,Sqoop examples
☆16Updated 7 years ago
Related projects
Alternatives and complementary repositories for Hive-Pig-Hbase
- A few scripts to automate daily data loads from an RDBMS to a partitioned Avro Hive table☆29Updated 10 years ago
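- User-profiling code that uses algorithms to estimate users' gender and age ratios☆11Updated 6 years ago
- DW ETL tool: incremental and full extraction from MySQL to Hive, merging of Hive tables, and other data-platform cleansing tools☆9Updated 7 years ago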
- ☆8Updated 6 years ago
- Ansible playbooks to help deploy Apache Hadoop, Spark, Storm, Zookeeper, Elasticsearch, Azkaban, Flume, Hbase, Kafka, Kibana, Logstash☆10Updated 7 years ago
- [易车] Spark, Flink, HBase, Hive, and Flume demos that integrate some native Hadoop APIs (currently only HDFS and MapReduce); also tests some exception-handling behavior☆16Updated 5 years ago
- Streaming using Flink to connect Kafka and Elasticsearch☆29Updated 8 years ago
- Java Client of the Spark Job Server implementing the arranged Rest APIs☆51Updated 3 years ago
- Custom Spark Kafka consumer based on Kafka SimpleConsumer API.☆22Updated 10 years ago
- IoT Trucking App with Flink (with Table API & SQL)☆15Updated 6 years ago
- Ambari stack for easily installing and managing Redis on HDP cluster☆15Updated 9 years ago
- Demo showcasing Spark Streaming, Kafka, Kudu - all in Python☆27Updated 7 years ago
- POC for all the stack of big data (kafka, spark, cassandra, hdfs, docker, springboot)☆12Updated last year
- Apache Hudi Demo☆21Updated 4 months ago
- A light Kafka to HDFS/S3 ETL library based on Apache Spark☆41Updated 7 years ago
- Wiki describing the use of Kafka Eagle☆11Updated 4 years ago
- Kafka, Spark Streaming, Kudu integration examples☆17Updated 6 years ago
- A web application for submitting spark application☆8Updated 3 years ago
- Fast Data Cluster (Apache Cassandra, Kafka, Spark, Flink, YARN and HDFS with Vagrant and VirtualBox)☆22Updated last year
- DataFibers Data Service☆31Updated 2 years ago
- ServiceFramework example project☆10Updated 8 years ago
- Combine Flume, Spark Streaming, and Redis for real-time computing☆22Updated 10 years ago
- flink-docker-compose-demo☆10Updated 6 years ago
- Examples for Spark 1.6 and Spark 2.2, covering Kafka, Flume, Structured Streaming, Jedis, Elasticsearch, MySQL, and DataFrame☆15Updated 6 years ago
- MySQL to NoSQL real time dataflow☆18Updated 7 years ago
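- Pinot is a real-time distributed OLAP data store and analytics system. LinkedIn uses it for low-latency, scalable real-time analytics. Pinot ingests data for analysis from offline sources (including Hadoop and various files) and online sources (such as Kafka). Pinot is designed to scale horizontally☆15Updated 9 years ago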
- Code for processing AVRO data in Spark Streaming + Kafka (DirectKafka approach with custom offset management in zookeeper)☆29Updated 8 years ago
- Optimizations for Flink multi-stream operations (such as joins), covering data loss, performance, and other issues☆11Updated 5 years ago