snowplow / stream-collector
Collector for cloud-native web, mobile and event analytics, running on AWS and GCP
☆33 Updated 2 weeks ago
Alternatives and similar repositories for stream-collector
Users that are interested in stream-collector are comparing it to the libraries listed below
- Snowplow Enrichment jobs and library ☆25 Updated last week
- Adapter for dbt that executes dbt pipelines on Apache Flink ☆95 Updated last year
- Kafka Connector for Iceberg tables ☆16 Updated 2 years ago
- SparkSQL utils for ScalaPB ☆43 Updated 2 months ago
- A library that provides useful extensions to Apache Spark and PySpark. ☆229 Updated last month
- A curated list of awesome lakehouse frameworks, applications, etc. ☆34 Updated 6 months ago
- Stores Snowplow enriched events in Redshift, Snowflake and Databricks ☆30 Updated 4 months ago
- Snowflake Data Source for Apache Spark. ☆229 Updated this week
- ☆80 Updated 4 months ago
- ⚠️ MAINTENANCE-ONLY MODE: Snowplow maintained SQL data models for working with Snowplow web and mobile behavioral data. ☆42 Updated 7 months ago
- Multi-hop declarative data pipelines ☆118 Updated this week
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark ☆29 Updated 9 months ago
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆158 Updated 2 years ago
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source. ☆344 Updated last year
- Code and examples of how to write and deploy Apache Spark Plugins. Spark plugins allow running custom code on the executors as they are in… ☆91 Updated 3 months ago
- dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks ☆441 Updated last month
- Library to convert DBT manifest metadata to Airflow tasks ☆48 Updated last year
- Dashboard for operating Flink jobs and deployments. ☆39 Updated this week
- Smart Automation Tool for building modern Data Lakes and Data Pipelines ☆124 Updated last week
- Extensible streaming ingestion pipeline on top of Apache Spark ☆45 Updated last month
- A tool to validate data, built around Apache Spark. ☆100 Updated last week
- A library that brings useful functions from various modern database management systems to Apache Spark ☆60 Updated last year
- A dbt adapter for Decodable ☆12 Updated 6 months ago
- Avro SerDe for Apache Spark structured APIs. ☆234 Updated 2 months ago
- BigQuery connector for Apache Flink ☆32 Updated last month
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake ☆279 Updated this week
- Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow. ☆147 Updated last year
- Replicates any database (CDC events) to Bigquery in real time ☆22 Updated last week
- ☆22 Updated 6 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive ☆185 Updated 2 years ago