apicrafter / metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.
☆44 Updated last year
Alternatives and similar repositories for metacrafter
Users who are interested in metacrafter are comparing it to the libraries listed below.
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆59 Updated last month
- Provide an easy way with Python to protect your data sources by searching its metadata. ☆17 Updated 2 months ago
- Anomstack - Painless open source anomaly detection for your metrics ☆103 Updated last week
- Batteries included toolkit for data engineering. ☆34 Updated 6 months ago
- ODD Specification is a universal open standard for collecting metadata. ☆142 Updated 8 months ago
- Scripts to make specific datasets cleaner and more convenient ☆41 Updated 2 years ago
- Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆46 Updated 3 years ago
- dotML is a light-weight semantic layer written in Python. ☆36 Updated last year
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ☆75 Updated last year
- Data Tools Subjective List ☆86 Updated last year
- Python+VueJS application to load, explore, combine, transform and deliver data ☆94 Updated 4 months ago
- Registry of metadata identifier entities like UUID, GUID, person fullname, address and so on. Linked with other sources ☆17 Updated last year
- Playground for using large language models in the Modern Data Stack for entity matching ☆108 Updated 2 years ago
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated last year
- ☆116 Updated 2 years ago
- ☆49 Updated last month
- Data pipelines from re-usable components ☆108 Updated 2 years ago
- List of entity resolution software and resources. ☆77 Updated 4 months ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆27 Updated 3 years ago
- quadipy is a Python package to help transform structured data into RDF graph format ☆19 Updated 2 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆154 Updated this week
- A SQL parser ☆62 Updated 3 months ago
- PyPi module for Graphlet AI Knowledge Graph Factory ☆29 Updated 2 years ago
- ☆75 Updated 4 months ago
- undatum: a command-line tool for data processing. Brings CSV simplicity to NDJSON, BSON, XML and other data files ☆48 Updated last week
- ☆22 Updated 10 months ago
- Python package for deduplication/entity resolution using active learning ☆81 Updated 10 months ago
- Find Python Packages on PyPI with the help of vector embeddings ☆47 Updated last month
- This repo contains information about DuckDB extensions found on GitHub. Refreshed daily ☆96 Updated last week
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 Updated last year