apicrafter / metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers, with fully customizable and flexible rules.
☆45Updated 3 months ago
Alternatives and similar repositories for metacrafter
Users who are interested in metacrafter are comparing it to the libraries listed below.
- List of entity resolution software and resources.☆94Updated 8 months ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.☆62Updated this week
- ODD Specification is a universal open standard for collecting metadata.☆144Updated last year
- Provides an easy way to protect your data sources with Python by searching their metadata. 🛡️☆18Updated last week
- Swiple enables you to easily observe, understand, validate and improve the quality of your data☆84Updated this week
- PyPi module for Graphlet AI Knowledge Graph Factory☆32Updated 2 years ago
- Data pipelines from re-usable components☆107Updated 2 years ago
- Anomstack - Painless open source anomaly detection for your metrics 📈📉🚀☆106Updated last month
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ…☆171Updated 3 weeks ago
- 🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)☆141Updated 2 years ago
- CLI to create an ER Diagram from DuckDB database files☆137Updated 7 months ago
- Playground for using large language models in the Modern Data Stack for entity matching☆108Updated 2 years ago
- Python+VueJS application to load, explore, combine, transform and deliver data☆99Updated 8 months ago
- Playing with Python Bluesky SDK☆15Updated 11 months ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆114Updated last week
- ☆52Updated last week
- Toolkit for graph-relational data across space and time☆117Updated last year
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).☆120Updated last month
- Ibis analytics, with Ibis (and more!)☆22Updated last year
- A small Python module containing quick utility functions for standard ETL processes.☆37Updated this week
- undatum: a command-line tool for data processing. Brings CSV simplicity to NDJSON, BSON, XML and other data files☆48Updated 3 months ago
- ☆116Updated 2 years ago
- dotML is a light-weight semantic layer written in Python.☆39Updated 2 years ago
- dagster scikit-learn pipeline example.☆46Updated 2 years ago
- ✨ Build dashboards with end-to-end version control. 🔋 CLI w/ batteries included, no infra required. Develop on your laptop for instant r…☆84Updated last week
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables☆74Updated last year
- Python package for deduplication/entity resolution using active learning☆82Updated last year
- A SQL parser☆62Updated last month
- A package to build an end-to-end pipeline for detecting personally identifiable information from text.☆48Updated 6 years ago
- DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data qualit…☆65Updated 3 weeks ago