sqlparser / python_data_lineage
Data lineage tools in Python
☆18 Updated 7 months ago
Related projects:
- Data Catalog for Databases and Data Warehouses ☆31 Updated 8 months ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆102 Updated this week
- ☆39 Updated last month
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 Updated 3 years ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 2 years ago
- A library that brings useful functions from various modern database management systems to Apache Spark ☆53 Updated last year
- Lightweight configuration and access to multiple databases in a single project ☆38 Updated 9 months ago
- Parallel Streaming Transformation Loader ☆9 Updated 5 years ago
- IPython magics to work with DBT ☆14 Updated 2 years ago
- ☆22 Updated 2 years ago
- Inspect Your Servers with DuckDB ☆28 Updated last year
- A platform to manage the data product life cycle ☆11 Updated last month
- Test data management tool for any data source, batch or real-time ☆35 Updated last week
- A curated list of awesome PrestoDB / Trino software, libraries, tools and resources ☆16 Updated 3 years ago
- A serverless DuckDB deployment on GCP ☆34 Updated 2 years ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆60 Updated last year
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆23 Updated 2 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake. ☆28 Updated 2 weeks ago
- A cloud native data mesh implementation ☆12 Updated 3 years ago
- A Data Mesh demo repository ☆12 Updated 9 months ago
- This repository is no longer maintained. ☆15 Updated 2 years ago
- DuckDB Docker image ☆18 Updated this week
- MinIO as local storage and DynamoDB as catalog ☆10 Updated 4 months ago
- An implementation of the DatasourceV2 interface of Apache Spark™ for writing Spark Datasets to Apache Druid™. ☆41 Updated last week
- ☆20 Updated last month
- Provide functionality to build statistical models to repair dirty tabular data in Spark ☆12 Updated last year
- Amundsen Gremlin ☆20 Updated 2 years ago
- ☆26 Updated 4 months ago
- The sane way of building a data layer in Airflow ☆24 Updated 4 years ago
- Declarative text based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines. ☆45 Updated last week