startreedata / thirdeye
ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis.
☆97Updated this week
Alternatives and similar repositories for thirdeye:
Users who are interested in thirdeye are comparing it to the repositories listed below
- ThirdEye is an integrated tool for realtime monitoring of time series and interactive root-cause analysis. It enables anyone inside an or…☆92Updated 2 years ago
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them☆135Updated last year
- ☆79Updated last year
- dbt-starrocks contains all of the code enabling dbt to work with StarRocks☆24Updated 3 months ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ…☆125Updated 2 weeks ago
- Smart Automation Tool for building modern Data Lakes and Data Pipelines☆114Updated this week
- Multi-hop declarative data pipelines☆107Updated this week
- Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases☆227Updated 2 years ago
- Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pi…☆95Updated 2 weeks ago
- Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful …☆144Updated 6 months ago
- Spark-Dashboard is a solution for monitoring Apache Spark jobs. This repository provides the tooling and configuration for deploying an A…☆118Updated last week
- A portable Pythonic Data Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to …☆175Updated this week
- Low Cost, Simple and Scalable Way of Data Replication to Apache Iceberg/Cloud/Data Lake☆219Updated this week
- This project provides a fully automated one-click experience to create a Cloud and Kubernetes environment to run Data Analytics workloads like…☆53Updated 2 years ago
- Visualize column-level data lineage in Spark SQL☆88Updated 2 years ago
- Generate authentic-looking mock data based on a SQL, JSON or Avro schema and produce it to Kafka in JSON or Avro format.☆155Updated 2 months ago
- A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…☆299Updated last year
- CLI tool to bulk migrate tables from one catalog to another without a data copy☆73Updated this week
- ☆65Updated 6 months ago
- Delta reader for the Ray open-source toolkit for building ML applications☆43Updated last year
- ☆47Updated 5 months ago
- DynoYARN is a framework to run simulated YARN clusters and workloads for YARN scale testing.☆58Updated last year
- ☆39Updated 5 years ago
- Multi-node Presto cluster on Docker containers☆124Updated 2 years ago
- Example for the article "Running Spark 3 with standalone Hive Metastore 3.0"☆97Updated 2 years ago
- The Internals of Spark on Kubernetes☆70Updated 2 years ago
- Use SQL to build ELT pipelines on a data lakehouse.☆284Updated 2 years ago
- Data Tools Subjective List☆82Updated last year
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com)☆223Updated last month
- A table-format-agnostic data sharing framework☆38Updated 11 months ago