Scalable CDC Pattern Implemented using PySpark
☆18Oct 8, 2025Updated 4 months ago
Alternatives and similar repositories for cdc-at-scale-using-spark
Users who are interested in cdc-at-scale-using-spark are comparing it to the libraries listed below.
- Multi-stage, config driven, SQL based ETL framework using PySpark☆26Sep 16, 2019Updated 6 years ago
- Implementation of a Big Data (batch and stream) distributed processing engine in Java using Akka actors.☆12Feb 20, 2023Updated 3 years ago
- Showing the relationship between ImageNet ID and labels and pytorch pre-trained model output ID and labels☆10Oct 11, 2020Updated 5 years ago
- Mirror of Apache MetaModel Membrane☆16Jun 4, 2019Updated 6 years ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark☆16Jan 22, 2024Updated 2 years ago
- Flink Hadoop Compatibility + Elasticsearch for Apache Hadoop = Flink Connector Elasticsearch Source Table. Combines Flink + Hadoop + ES to implement an ES table s…☆19Jun 28, 2021Updated 4 years ago
- ☆22Jul 2, 2025Updated 8 months ago
- Capture the logical plan from Spark (SQL)☆22Mar 6, 2021Updated 4 years ago
- An experiment to inject a customized parser using SparkSessionExtension☆16Jan 1, 2018Updated 8 years ago
- Query and Provision Cloud Infrastructure using an extensible SQL based grammar☆25Apr 5, 2022Updated 3 years ago
- Using MySQL to store Kafka offsets in SparkStreaming to guarantee zero data loss☆44Aug 2, 2017Updated 8 years ago
- A minimal seed template for an Akka gRPC with Scala build☆19Jan 22, 2026Updated last month
- How to manage Slowly Changing Dimensions with Apache Hive☆55Aug 27, 2019Updated 6 years ago
- Grafana's table plugin for ClickHouse☆26Jul 7, 2022Updated 3 years ago
- XML for Analysis (XMLA) server based upon an olap4j connection☆23Dec 8, 2016Updated 9 years ago
- Helps control infra costs by pointing potential unused zombie Google Cloud Platform projects.☆11Dec 28, 2023Updated 2 years ago
- Scala utility to send mail☆14May 4, 2020Updated 5 years ago
- A repository that includes examples from Spanish posts☆10Dec 19, 2025Updated 2 months ago
- Cloud-based SQL engine using SPARK where data is accessible as JDBC/ODBC data source via Spark ThriftServer.☆31Jul 12, 2017Updated 8 years ago
- ☆12Updated this week
- A web server to generate ER diagrams☆34Mar 11, 2019Updated 6 years ago
- This is a complete suite of spring boot couchbase and kafka☆12Dec 10, 2018Updated 7 years ago
- CODO is an ontology for the semantic representation and annotation of COVID-19 data in a machine-readable form for tracking history of th…☆10Apr 19, 2022Updated 3 years ago
- Visual tool for SPARQL queries on graphol graphs☆10Oct 3, 2018Updated 7 years ago
- Maintenance Information Extraction (MaintIE)☆16Jun 29, 2024Updated last year
- A big data cluster management tool that creates and manages clusters of different technologies.☆21Apr 20, 2015Updated 10 years ago
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™.☆43Updated this week
- A visual ETL development and debugging tool for big data☆156Dec 5, 2022Updated 3 years ago
- Apache Calcite Tutorial☆33Jun 24, 2016Updated 9 years ago
- ☆10Aug 13, 2021Updated 4 years ago
- Code samples, summaries, cheatsheets and other study material for Hadoop MapReduce and Apache Spark☆10Aug 17, 2018Updated 7 years ago
- My branch of Apache Flume with a generic JDBC sink (not yet licensed to Apache)☆11Feb 12, 2022Updated 4 years ago
- Bridge to MetaTrader4 over ODBC interface☆18Aug 29, 2011Updated 14 years ago
- Demo repository to lambda-fy your dbt runs☆11Sep 7, 2023Updated 2 years ago
- seckill flash-sale project [PRC]