DBeam exports SQL tables into Avro files using JDBC and Apache Beam
☆196 · May 13, 2026 · Updated last week
Alternatives and similar repositories for dbeam
Users who are interested in dbeam are comparing it to the libraries listed below.
- GCS support for avro-tools, parquet-tools and protobuf ☆79 · May 5, 2025 · Updated last year
- Ephemeral Hadoop clusters using Google Compute Platform ☆136 · Mar 31, 2022 · Updated 4 years ago
- "The path to execution", Styx is a service that schedules batch data processing jobs in Docker containers on Kubernetes. ☆271 · Jul 12, 2023 · Updated 2 years ago
- A unified way of launching Dataflow jobs ☆13 · Apr 17, 2023 · Updated 3 years ago
- Useful Cloud Dataflow custom templates. ☆16 · Dec 14, 2022 · Updated 3 years ago
- ☆54 · Aug 3, 2017 · Updated 8 years ago
- A Scala feature transformation library for data science and machine learning ☆474 · Feb 7, 2025 · Updated last year
- ☆67 · Aug 16, 2024 · Updated last year
- Scala Aggregators used for ML Model metrics monitoring ☆92 · Sep 13, 2023 · Updated 2 years ago
- protoc-gen-bq-schema helps you to send your Protocol Buffer messages to BigQuery. ☆264 · Oct 29, 2025 · Updated 6 months ago
- Repository with examples and smoke tests for the GCP Airflow operators and hooks ☆152 · Jan 15, 2017 · Updated 9 years ago
- A PyPI compatible server running on App Engine ☆12 · Nov 13, 2017 · Updated 8 years ago
- Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code ☆297 · Jan 31, 2025 · Updated last year
- Apache Beam is a unified programming model for Batch and Streaming data processing. ☆8,582 · May 14, 2026 · Updated last week
- ☆80 · Nov 10, 2023 · Updated 2 years ago
- A wrapper for Hadoop in Scala ☆42 · Jul 18, 2010 · Updated 15 years ago
- Building Scio from scratch step by step ☆20 · May 20, 2019 · Updated 7 years ago
- Flyte Flink k8s plugin. ☆20 · Apr 23, 2026 · Updated 3 weeks ago
- A tool for data sampling, data generation, and data diffing ☆350 · Mar 31, 2026 · Updated last month
- Yet-Another-Rules-Engine -- An easy-to-understand Business Readable DSL for defining production rules. ☆14 · Mar 24, 2021 · Updated 5 years ago
- Iceberg is a table format for large, slow-moving tabular data ☆493 · Apr 10, 2023 · Updated 3 years ago
- TensorFlow TFRecord reader CLI tool ☆62 · Dec 15, 2025 · Updated 5 months ago
- A lightweight workflow definition library ☆156 · Jul 15, 2022 · Updated 3 years ago
- Powerful framework providing many useful utilities and features on top of the Scala language. ☆15 · Feb 8, 2017 · Updated 9 years ago
- This repo contains the LookML for the model and dashboards used with the FHIR healthcare dataset to showcase how Looker can add value to … ☆14 · Jan 5, 2023 · Updated 3 years ago
- AngularJS directives for dojo widgets ☆33 · Apr 18, 2014 · Updated 12 years ago
- Mirror of Apache livy (Incubating) ☆13 · Updated this week
- Quark is a data virtualization engine over analytic databases. ☆101 · Jul 13, 2017 · Updated 8 years ago
- A collection of Magnolia add-on modules ☆182 · Feb 12, 2026 · Updated 3 months ago
- Micro-utilities for Java ☆14 · Jul 25, 2022 · Updated 3 years ago
- Experiments in Streaming ☆60 · Aug 27, 2016 · Updated 9 years ago
- HDFS inotify Example ☆22 · Feb 8, 2023 · Updated 3 years ago
- In-deprecation. For Lenses please check lensesio/lenses-helm-charts. Soon Stream Reactor will also get its own Helm repository. ☆70 · Aug 2, 2020 · Updated 5 years ago
- Golang kafka client based on libkafka ☆11 · Mar 4, 2026 · Updated 2 months ago
- DEPRECATED. PLEASE USE https://github.com/confluentinc/kafka-connect-bigquery. A Kafka Connect BigQuery sink connector ☆152 · Mar 4, 2024 · Updated 2 years ago
- Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. This re… ☆166 · Jul 25, 2018 · Updated 7 years ago
- syslog module for nginx ☆18 · Sep 19, 2010 · Updated 15 years ago
- Dataform is a framework for managing SQL based data operations in BigQuery ☆977 · May 13, 2026 · Updated last week
- R dplyr connector for ImpalaDB ☆15 · Mar 1, 2017 · Updated 9 years ago