Using MySQL to store Kafka offsets in Spark Streaming to guarantee zero data loss
☆44 · Aug 2, 2017 · Updated 8 years ago
Alternatives and similar repositories for spark_streaming_kafka_offset
Users that are interested in spark_streaming_kafka_offset are comparing it to the libraries listed below
- Kafka delivery semantics in the case of failure depend on how and when offsets are stored. Spark output operations are at-least-once. So … ☆37 · Apr 19, 2017 · Updated 8 years ago
- Code for processing AVRO data in Spark Streaming + Kafka (DirectKafka approach with custom offset management in zookeeper) ☆29 · Sep 9, 2016 · Updated 9 years ago
- Queries the Spark REST API for applications, jobs, stages, executors, RDDs, streaming, environment, and other information to provide monitoring and alerting services ☆11 · Nov 22, 2018 · Updated 7 years ago
- ☆12 · May 11, 2016 · Updated 9 years ago
- Manually manages the Kafka offsets of the Spark Streaming Kafka integration in ZooKeeper ☆21 · Jul 6, 2018 · Updated 7 years ago
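- Spark loads HDFS data into Kafka at high throughput, then Spark Streaming/Structured Streaming consumes it at high speed; performance-focused, and suggestions for performance or code optimization are welcome ☆32 · Mar 24, 2019 · Updated 6 years ago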
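- Built on the open-source flinkStreamSQL project provided by 袋鼠云 (DTStack), adding a visual interface for its real-time SQL. Communicating over TCP/IP, the front-end page lets the user select the database connection details and write SQL statements; on submit, the back end automatically starts the cluster, submits the JobGraph, and returns the results to the front-end page. This means that even users who are not familiar with Kafka, fl… ☆11 · Jun 23, 2019 · Updated 6 years ago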
- Scalable CDC Pattern Implemented using PySpark ☆18 · Oct 8, 2025 · Updated 4 months ago
- DirectKafka examples for Spark Streaming: 1. with checkpointing 2. custom offset management ☆60 · Sep 9, 2016 · Updated 9 years ago
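- Reads and writes Hive, HBase, and ES with Spark; a single configuration enables import and export across different databases, with wrappers for ES and HBase ☆33 · May 6, 2017 · Updated 8 years ago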
- My branch of Apache Flume with a generic JDBC sink (not yet licensed to Apache) ☆11 · Feb 12, 2022 · Updated 4 years ago
- Spark Streaming project demonstrating flume->Kafka->Spark->hbase (a real-time data processing solution), implemented in Scala ☆36 · Feb 19, 2018 · Updated 8 years ago
- Wraps the APIs for integrating Spark with other components for ease of use, e.g. ES, HBase, Kudu, Kafka, MQ ☆35 · Dec 18, 2019 · Updated 6 years ago
- A distributed task scheduling framework built on TBSchedule; it resolves dependencies between tasks and executes them (running Shell and bat scripts) ☆12 · Aug 5, 2016 · Updated 9 years ago
- hive sql parser ☆11 · Aug 27, 2014 · Updated 11 years ago
- Implementation of a Big Data (batch and stream) distributed processing engine in Java using Akka actors. ☆12 · Feb 20, 2023 · Updated 3 years ago
- 1. Spark offline batch processing and real-time user click statistics; 2. SparkSQL log content analysis; 3. audience movie analysis => (Kafka + SparkStreaming + Redis) and (Kafka + SparkStreaming + Mysql) ☆29 · Jun 21, 2022 · Updated 3 years ago
- Offline scheduling, Hive, task dependencies, task scheduling, big data development platform ☆14 · May 10, 2018 · Updated 7 years ago
- Showing the relationship between ImageNet ID and labels and pytorch pre-trained model output ID and labels ☆10 · Oct 11, 2020 · Updated 5 years ago
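- Real-time analysis of nginx logs, computing metrics such as per-endpoint request counts, UV (unique visitors), latency, and abnormal IPs ☆29 · Apr 14, 2017 · Updated 8 years ago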
- Spark stream processing that can receive data from flume-ng and Kafka ☆11 · Sep 16, 2015 · Updated 10 years ago
- better performance for kylin query ☆15 · Jun 14, 2019 · Updated 6 years ago
- A skeleton for writing Spark programs in Scala with spring-boot ☆28 · Oct 22, 2019 · Updated 6 years ago
- Apache flink ☆18 · Feb 8, 2023 · Updated 3 years ago
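- I took on a task at work: build a data synchronization module for a project. The requirement was not to touch the project's database, for fear that a mistake would lose data. The approach I came up with was to use log4jdbc to record the data source's SQL statements to a log file, then read the log file line by line, recording the read Point so that the next read can resume from it. The data read goes into a bigqueue queue,… ☆12 · Aug 10, 2017 · Updated 8 years ago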
- Ansible playbooks to help to deploy Apache Hadoop, Spark, Storm, Zookeeper, Elasticsearch, Azkaban, Flume, Hbase, Kafka, Kibana, Logstash ☆10 · Mar 21, 2017 · Updated 8 years ago
- A data quality monitoring system based on PowerCenter ☆13 · Dec 27, 2017 · Updated 8 years ago
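- Pinot is a real-time distributed OLAP data store and analytics system. LinkedIn uses it for low-latency, scalable real-time analytics. Pinot ingests data for analysis from offline sources (including Hadoop and various files) and online sources (such as Kafka). Pinot is designed to scale horizontally ☆16 · Nov 8, 2015 · Updated 10 years ago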
- Gets the China stock market's DDE data and stores it in MySQL, using nodes, in preparation for stock analysis. ☆14 · Oct 25, 2015 · Updated 10 years ago
- Spark structured-streaming consumes Kafka data and writes it to HBase ☆33 · Jan 22, 2019 · Updated 7 years ago
- Streaming Analytics platform, built with Apache Flink and Kafka ☆36 · Oct 6, 2023 · Updated 2 years ago
- Kafka stream for Spark with storage of the offsets in ZooKeeper ☆60 · Apr 18, 2017 · Updated 8 years ago
- A Spark-based hybrid query platform supporting federated queries across databases from different sources: mysql hive presto ... ☆14 · Aug 3, 2017 · Updated 8 years ago
- Clickhouse sink for akka-streams ☆16 · Jan 3, 2022 · Updated 4 years ago
- ☆14 · Oct 5, 2022 · Updated 3 years ago
- Define and schedule workflow, support Flink Jar/SQL, ClickHouse/Hive/Mysql SQL, Shell, etc. ☆20 · Updated this week
- High Performance Kafka Connector for Spark Streaming. Supports Multi Topic Fetch, Kafka Security. Reliable offset management in Zookeeper.… ☆636 · Feb 26, 2022 · Updated 4 years ago
- Plot live-stats as graph from ApacheSpark application using Lightning-viz ☆18 · Jul 3, 2017 · Updated 8 years ago
- Some Kudu learning materials, plus integration with Spark/Impala ☆33 · Sep 11, 2017 · Updated 8 years ago