apicrafter / metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.
☆45 Updated 2 months ago
Alternatives and similar repositories for metacrafter
Users who are interested in metacrafter are comparing it to the libraries listed below.
- Provide an easy way with Python to protect your data sources by searching its metadata. 🛡️ ☆18 Updated last week
- Scripts to make specific datasets cleaner and more convenient ☆42 Updated 2 years ago
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ☆74 Updated last year
- dagster scikit-learn pipeline example. ☆45 Updated 2 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆62 Updated this week
- DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit… ☆63 Updated last week
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆46 Updated 4 years ago
- ☆23 Updated last year
- A SQL parser ☆62 Updated 5 months ago
- A monorepo of many Rill example projects ☆42 Updated last week
- PyPi module for Graphlet AI Knowledge Graph Factory ☆29 Updated 2 years ago
- CLI to create an ER Diagram from DuckDB database files ☆135 Updated 6 months ago
- Ibis analytics, with Ibis (and more!) ☆22 Updated 11 months ago
- Toolkit for graph-relational data across space and time ☆117 Updated last year
- ODD Specification is a universal open standard for collecting metadata. ☆144 Updated 10 months ago
- Python Data Anonymization & Masking Library For Data Science Tasks ☆274 Updated 2 years ago
- dotML is a light-weight semantic layer written in Python. ☆38 Updated last year
- Data pipelines from re-usable components ☆107 Updated 2 years ago
- Python+VueJS application to load, explore, combine, transform and deliver data ☆97 Updated 7 months ago
- Playground for using large language models into the Modern Data Stack for entity matching ☆108 Updated 2 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆166 Updated 2 weeks ago
- List of entity resolution software and resources. ☆85 Updated 7 months ago
- undatum: a command-line tool for data processing. Brings CSV simplicity to NDJSON, BSON, XML and other data files ☆48 Updated last month
- Find Python Packages on PyPI with the help of vector embeddings ☆47 Updated 3 months ago
- Pipeline definitions for managing data flows to power analytics at MIT Open Learning ☆43 Updated this week
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆120 Updated last week
- Swiple enables you to easily observe, understand, validate and improve the quality of your data ☆84 Updated this week
- 🦆 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives. ☆108 Updated this week
- This repo contains information about DuckDB extensions found on GitHub. Refreshed daily ☆101 Updated this week
- Data Tools Subjective List ☆87 Updated 2 years ago