A library for querying Binlog with Apache Spark Structured Streaming, for Spark SQL, DataFrames and [MLSQL](https://www.mlsql.tech).
☆152Apr 21, 2023Updated 2 years ago
Alternatives and similar repositories for spark-binlog
Users that are interested in spark-binlog are comparing it to the libraries listed below
- A library based on delta for Spark and MLSQL☆60Dec 24, 2020Updated 5 years ago
- ☆13Jun 17, 2022Updated 3 years ago
- Big data smart alarm by sql☆12May 11, 2021Updated 4 years ago
- Byzer (formerly MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.☆1,845May 29, 2024Updated last year
- My Blog☆76May 3, 2018Updated 7 years ago
- An ad hoc query service based on the Spark SQL engine.☆381Dec 16, 2023Updated 2 years ago
- ☆22Jun 21, 2022Updated 3 years ago
- fast spark local mode☆35Aug 20, 2018Updated 7 years ago
- A simple, easy-to-use ETL tool☆17Mar 28, 2019Updated 6 years ago
- Stream computing platform for bigdata☆408Apr 24, 2024Updated last year
- This is a library for SQL optimizing/rewriting including Materialized View rewrite☆69Jun 21, 2022Updated 3 years ago
- Wormhole is a SPaaS (Stream Processing as a Service) Platform☆977Nov 16, 2022Updated 3 years ago
- A library based on Hudi for Spark.☆10Nov 30, 2021Updated 4 years ago
- Unified SQL Analytics Engine Based on SparkSQL☆211Dec 5, 2022Updated 3 years ago
- ☆11Nov 16, 2022Updated 3 years ago
- spark structured streaming via HTTP communication☆18Jul 7, 2022Updated 3 years ago
- Moonbox is a DVtaaS (Data Virtualization as a Service) Platform☆506Apr 14, 2023Updated 2 years ago
- The Apache Spark - Apache HBase Connector is a library that lets Spark access HBase tables as an external data source or sink.☆550May 10, 2021Updated 4 years ago
- SparkCube is an open-source project for extremely fast OLAP data analysis. SparkCube is an extension of Apache Spark.☆136Mar 6, 2023Updated 2 years ago
- Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.☆2,304Feb 23, 2026Updated last week
- A data integration framework☆4,110Dec 2, 2025Updated 3 months ago
- ☆235Sep 15, 2022Updated 3 years ago
- Ansible playbooks to help deploy Apache Hadoop, Spark, Storm, Zookeeper, Elasticsearch, Azkaban, Flume, Hbase, Kafka, Kibana, Logstash☆10Mar 21, 2017Updated 8 years ago
- Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax☆2,059Feb 21, 2024Updated 2 years ago
- 酷玩 Spark (Coolplay Spark): Spark source code analysis, Spark libraries, and more☆3,482May 18, 2022Updated 3 years ago
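- Wraps Spark Streaming to adjust batch time dynamically (computation runs whenever data is available); supports adding and removing topics at runtime; wraps Spark Streaming 1.6 - kafka 010 to support SSL.☆182Apr 15, 2021Updated 4 years ago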
- A sink to save Spark Structured Streaming DataFrame into Hive table☆30Apr 16, 2018Updated 7 years ago
- Scriptis is for interactive data analysis with script development(SQL, Pyspark, HiveQL), task submission(Spark, Hive), UDF, function, res…☆814Dec 11, 2024Updated last year
- Alerting and monitoring tool for Apache Spark☆23May 20, 2022Updated 3 years ago
- An HBase datasource implementation for Spark and [MLSQL](http://www.mlsql.tech).☆15Sep 29, 2023Updated 2 years ago
- Adds remote multi-language invocation (ThriftServer, HttpServer) and distributed execution (DataX on YARN) to DataX (https://github.com/alibaba/DataX)☆144Jan 22, 2026Updated last month
- An RPC framework leveraging Spark RPC module☆209Mar 13, 2019Updated 6 years ago
- ☆19Jun 16, 2021Updated 4 years ago
- Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications…☆3,416Feb 12, 2026Updated 2 weeks ago
- Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark☆1,371Aug 22, 2023Updated 2 years ago
- hera: distributed task scheduling system, big data task scheduling system, task scheduling (for data department use)☆373Aug 14, 2023Updated 2 years ago
- A Flexible, Fast, Federated(3F) SQL Analysis Middleware for Multiple Data Sources☆2,055Oct 25, 2022Updated 3 years ago
- JVM related exercises☆11Jul 16, 2017Updated 8 years ago
- ServiceFramework example project☆10Apr 2, 2016Updated 9 years ago