apicrafter / metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.
☆44 Updated 9 months ago
Alternatives and similar repositories for metacrafter:
Users who are interested in metacrafter are comparing it to the repositories listed below:
- quadipy is a Python package to help transform structured data into RDF graph format ☆19 Updated 2 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆57 Updated last week
- Ibis analytics, with Ibis (and more!) ☆21 Updated 7 months ago
- List of entity resolution software and resources. ☆63 Updated 2 months ago
- Provides an easy way, with Python, to protect your data sources by searching their metadata. 🛡️ ☆16 Updated 2 weeks ago
- Batteries-included toolkit for data engineering. ☆34 Updated 3 months ago
- Dagster scikit-learn pipeline example. ☆44 Updated 2 years ago
- Data Tools Subjective List ☆83 Updated last year
- A collection of Python utility functions ☆11 Updated 9 months ago
- A fully-featured multi-source data pipeline for continuously extracting knowledge from COVID-19 data. ☆21 Updated 3 years ago
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ☆73 Updated last year
- Python+VueJS application to load, explore, combine, transform and deliver data ☆91 Updated 2 months ago
- dotML is a light-weight semantic layer written in Python. ☆34 Updated last year
- Data Catalog for Databases and Data Warehouses ☆34 Updated last year
- Python package for deduplication/entity resolution using active learning ☆78 Updated 8 months ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆141 Updated 3 weeks ago
- A tool to automatically infer column data types in .csv files ☆35 Updated 2 years ago
- Portable Python ML-powered data bot ☆23 Updated 6 months ago
- CLI to create an ER Diagram from DuckDB database files ☆119 Updated last month
- ☆69 Updated 2 months ago
- ☆21 Updated 8 months ago
- Graph Engine for Exploration and Search ☆40 Updated last year
- The Modern Data Stack in a Python package ☆49 Updated last year
- A simple, easy-to-use Data Quality (DQ) tool built with Python. ☆50 Updated last year
- SQL query executor for a remote DuckDB instance, using Apache Arrow Flight RPC through a Streamlit web interface. ☆12 Updated 5 months ago
- 🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives. ☆104 Updated this week
- Playground for using large language models in the Modern Data Stack for entity matching ☆107 Updated 2 years ago
- Registry of metadata identifier entities like UUID, GUID, person full name, address and so on. Linked with other sources. ☆17 Updated last year
- A monorepo of many Rill example projects ☆36 Updated this week
- A small Python module containing quick utility functions for standard ETL processes. ☆35 Updated last week