flyteorg / datacatalog
Data Catalog is a service for indexing parameterized, strongly-typed data artifacts across revisions. It also powers Flyte's memoization system.
☆54 Updated last year
Alternatives and similar repositories for datacatalog
Users who are interested in datacatalog are comparing it to the libraries listed below:
- Control Plane for Flyte. Flyteadmin is a gRPC + REST service written in Golang that uses an RDBMS to store meta information and management … ☆39 Updated last year
- An Apache Commons-style library in Golang, used by the Flyte project. Contains utilities for metrics, pflags, config management, storage ab… ☆60 Updated last year
- FlytePropeller is a Kubernetes-native operator that executes Flyte Workflows and Tasks. It has its own kubectl-flyte CLI to interact and… ☆47 Updated last year
- The Flyte data-sidecar that helps move the input and output data intelligently between containers ☆10 Updated last year
- Flyte Backend Plugins contributed by the Flyte community. ☆28 Updated last year
- Specification of the IR for Flyte workflows and tasks. Also Interfaces for all backend services. https://docs.flyte.org/projects/flyteidl… ☆28 Updated last year
- Opinionated serverless event analytics pipeline ☆43 Updated last year
- Airbyte is the go-sdk/cdk to help build connectors quickly in Go. This package abstracts much of the "protocol" away from the user a… ☆38 Updated 11 months ago
- Data Catalog for Databases and Data Warehouses ☆32 Updated last year
- Export Airflow metrics (from mysql) in prometheus format ☆29 Updated 2 weeks ago
- Go Client for Hive Metastore ☆14 Updated 2 years ago
- A temporary home for LinkedIn's changes to Apache Iceberg (incubating) ☆62 Updated 2 months ago
- Beneath is a serverless real-time data platform ⚡️ ☆84 Updated 3 years ago
- Bullet is a streaming query engine that can be plugged into any singular data stream using a Stream Processing framework like Apache Stor… ☆41 Updated 2 years ago
- A tool for describing pure data pipelines that enables avoiding repeating work (incrementality) and keeping old data around (provenance) ☆71 Updated 4 years ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- Dom's Data Build Tool ☆69 Updated last year
- Trino (f.k.a. PrestoSQL) dialect for SQLAlchemy. ☆25 Updated 2 years ago
- Apache Pinot Golang Client managed by StarTree ☆28 Updated 10 months ago
- Kubernetes operator providing Ray|Spark|Dask|MPI clusters on-demand ☆14 Updated last year
- Airflow on Kubernetes Operator ☆89 Updated 2 years ago
- Altinity Dashboard helps you manage ClickHouse installations controlled by clickhouse-operator. ☆66 Updated this week
- Connectors for capturing data from external data sources ☆57 Updated this week
- Highly configurable Helm Presto Chart ☆24 Updated 5 years ago
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 Updated last year
- Ephemeral Hadoop clusters using Google Compute Platform ☆135 Updated 2 years ago
- Presto & Alluxio Dockers for blazing fast analytics ☆13 Updated 5 years ago
- Data ingestion library for Amundsen to build graph and search index ☆205 Updated 11 months ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 Updated 4 years ago
- Apiary provides modules which can be combined to create a federated cloud data lake ☆36 Updated 10 months ago