treeverse / lakeview
lakeview is a visibility tool for S3-based data lakes.
☆29 · Updated 6 months ago
Alternatives and similar repositories for lakeview
Users who are interested in lakeview are comparing it to the libraries listed below.
- Python package for querying Iceberg data through DuckDB. ☆72 · Updated last year
- Packaging DuckDB for Node.js Lambda functions. Example application: https://github.com/tobilg/serverless-duckdb ☆150 · Updated this week
- Generate authentic-looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆169 · Updated 4 months ago
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark… ☆147 · Updated last week
- Schema modelling framework for decentralised domain-driven ownership of data. ☆261 · Updated 2 years ago
- ☆58 · Updated last month
- Amundsen Gremlin ☆22 · Updated 3 years ago
- Multi-hop declarative data pipelines ☆124 · Updated 2 weeks ago
- A DuckDB-powered command line interface for Snowflake security, governance, operations, and cost optimization. ☆41 · Updated last year
- Pylint plugin for static code analysis on Airflow code ☆97 · Updated 5 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark DataFrames ☆63 · Updated 3 years ago
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆32 · Updated 2 years ago
- Security Analytics Using The Snowflake Data Warehouse ☆184 · Updated 2 months ago
- Pushdown compute from Snowflake to DuckDB running on your infrastructure ☆203 · Updated 3 months ago
- Serverless multi-protocol + multi-destination event collection system. ☆210 · Updated last year
- ☆30 · Updated last year
- A table-format-agnostic data sharing framework ☆42 · Updated 2 years ago
- Boto S3 Router provides a Boto3-like client that routes requests between S3 clients according to the bucket and the key in the request. ☆18 · Updated 3 years ago
- 🚀 GizmoSQL: High-Performance SQL Server ☆282 · Updated this week
- Superglue is a lineage-tracking tool built to help visualize the propagation of data through complex pipelines composed of tables, jobs … ☆160 · Updated 3 years ago
- Serverless HTAP cloud data platform powered by Arrow × DuckDB × Iceberg ☆332 · Updated 2 years ago
- A playground for running DuckDB as a stateless query engine over a data lake ☆218 · Updated 2 years ago
- BigQuery Google Storage Based Data Loader ☆57 · Updated 9 months ago
- Work with your web service, database, and streaming schemas in a single format. ☆350 · Updated last month
- Sample code to collect Apache Iceberg metrics for table monitoring ☆29 · Updated last year
- Export Redshift data and convert to Parquet for use with Redshift Spectrum or other data warehouses. ☆117 · Updated 3 years ago
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆52 · Updated 7 months ago
- Faker for Snowflake! ☆33 · Updated 3 years ago
- Continuously synchronize directories from remote object store to local filesystem ☆109 · Updated last week
- Fast iterative local development and testing of Apache Airflow workflows ☆204 · Updated last month