tokern / piicatcher
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and DataHub.
☆272 Updated 8 months ago
Related projects:
- Generate and Visualize Data Lineage from query history ☆309 Updated last year
- Data Tools Subjective List ☆80 Updated last year
- Open source data observability platform ☆320 Updated last year
- Auto-generated Diagrams from Airflow DAGs. 🔮 🪄 ☆315 Updated this week
- Schema modelling framework for decentralised domain-driven ownership of data. ☆244 Updated 9 months ago
- The metrics layer for your data. Join us at https://metriql.com/slack ☆295 Updated last year
- PyAirbyte brings the power of Airbyte to every Python developer. ☆205 Updated this week
- Sample configuration to deploy a modern data platform. ☆84 Updated 2 years ago
- Open Control Plane for Tables in Data Lakehouse ☆289 Updated this week
- Generate authentic looking mock data based on a SQL, JSON or Avro schema and produce to Kafka in JSON or Avro format. ☆141 Updated 2 weeks ago
- A curated list of awesome blogs, videos, tools and resources about Data Contracts ☆158 Updated last month
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data. ☆76 Updated this week
- A portable Datamart and Business Intelligence suite built with Docker, Dagster, dbt, DuckDB, PostgreSQL and Superset ☆163 Updated last week
- A DuckDB-powered command line interface for Snowflake security, governance, operations, and cost optimization. ☆35 Updated last month
- Making DAG construction easier ☆237 Updated last week
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆60 Updated last year
- Make dbt docs and Apache Superset talk to one another ☆132 Updated 3 weeks ago
- Tool to automate data quality checks on data pipelines ☆246 Updated 2 years ago
- re_data - fix data issues before your users & CEO would discover them 😊 ☆98 Updated 4 months ago
- A web API for dbt. ☆110 Updated 7 months ago
- The Trino (https://trino.io/) adapter plugin for dbt (https://getdbt.com) ☆207 Updated this week
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆104 Updated last week
- A playground for running duckdb as a stateless query engine over a data lake ☆156 Updated 8 months ago
- Work with your web service, database, and streaming schemas in a single format. ☆323 Updated 5 months ago
- A CLI tool to streamline getting started with Apache Airflow™ and managing multiple Airflow projects ☆187 Updated this week
- 📙 Awesome Data Catalogs and Observability Platforms. ☆677 Updated last month
- SIEM-to-Spark Transpiler ☆42 Updated 6 months ago
- New generation opensource data stack ☆60 Updated 2 years ago
- Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you to programmatically create and apply a data policy to a … ☆33 Updated this week