treeverse / lakeview
lakeview is a visibility tool for S3-based data lakes
☆29 · Updated last month
Alternatives and similar repositories for lakeview
Users who are interested in lakeview are comparing it to the libraries listed below.
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆52 · Updated 2 months ago
- Bender - Serverless ETL Framework ☆188 · Updated last year
- Benchmark data warehouses under Fivetran-like conditions ☆170 · Updated 2 years ago
- ETLy is an add-on dashboard service on top of Apache Airflow. ☆68 · Updated 2 years ago
- Multi-hop declarative data pipelines ☆118 · Updated last week
- A CLI to manage and monitor permissions in AWS Lake Formation ☆26 · Updated 2 years ago
- Herd-UI is a search and discovery tool for business and technical users. Everyone in your organization can use Herd-UI to browse and unde… ☆16 · Updated 2 years ago
- Export Redshift data and convert to Parquet for use with Redshift Spectrum or other data warehouses. ☆117 · Updated 2 years ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆257 · Updated last year
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark… ☆137 · Updated 3 weeks ago
- Amundsen Gremlin ☆21 · Updated 3 years ago
- Presto-like CLI tool for AWS Athena ☆84 · Updated 2 years ago
- Sample code to collect Apache Iceberg metrics for table monitoring ☆28 · Updated last year
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 · Updated 2 years ago
- Faker for Snowflake! ☆33 · Updated 2 years ago
- Spark runtime on AWS Lambda ☆109 · Updated last week
- This repository contains the dbt-glue adapter ☆131 · Updated this week
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆158 · Updated 2 years ago
- The elegance of Airflow + the power of AWS ☆51 · Updated last year
- Python package for querying iceberg data through duckdb. ☆70 · Updated last year
- Continuously synchronize directories from remote object store to local filesystem ☆106 · Updated 6 months ago
- Apiary provides modules which can be combined to create a federated cloud data lake ☆36 · Updated last year
- Curated list of resources about Apache Airflow ☆19 · Updated 4 years ago
- The sane way of building a data layer in Airflow ☆24 · Updated 5 years ago
- Automated data quality suggestions and analysis with Deequ on AWS Glue ☆87 · Updated 2 years ago
- Herd is a managed data lake for the cloud. The Herd unified data catalog helps separate storage from compute in the cloud. Manage petabyt… ☆138 · Updated 2 years ago
- ☆148 · Updated 7 months ago
- Metadata service library for Amundsen ☆83 · Updated last month
- Packaging DuckDB for Node.js Lambda functions. Example application: https://github.com/tobilg/serverless-duckdb ☆142 · Updated last month
- Security Analytics Using The Snowflake Data Warehouse ☆184 · Updated 2 months ago