apicrafter / metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.
★44 · Updated 11 months ago
Alternatives and similar repositories for metacrafter
Users who are interested in metacrafter are comparing it to the libraries listed below.
- Provide an easy way with Python to protect your data sources by searching its metadata. 🛡️ ★17 · Updated last month
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ★59 · Updated 2 weeks ago
- Registry of metadata identifier entities like UUID, GUID, person fullname, address and so on. Linked with other sources ★17 · Updated last year
- List of entity resolution software and resources. ★75 · Updated 4 months ago
- ★48 · Updated last week
- Sord Data Fabric: A Vue 3 frontend with a Python WebSocket server, leveraging a distributed architecture with DeltaLake and DuckDB worker… ★18 · Updated last year
- A fully-featured multi-source data pipeline for continuously extracting knowledge from COVID-19 data. ★21 · Updated 4 years ago
- Ibis analytics, with Ibis (and more!) ★22 · Updated 9 months ago
- An experimental Athena extension for DuckDB ★54 · Updated 5 months ago
- Playground for using large language models into the Modern Data Stack for entity matching ★108 · Updated 2 years ago
- Application and python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ★46 · Updated 3 years ago
- ★22 · Updated 10 months ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ★45 · Updated 6 years ago
- A small Python module containing quick utility functions for standard ETL processes. ★35 · Updated this week
- Python package for deduplication/entity resolution using active learning ★80 · Updated 10 months ago
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ★75 · Updated last year
- Data Tools Subjective List ★83 · Updated last year
- Discover the simplicity and strength of Duckdb, dbt, and Iceberg in this project. Create an efficient, versatile data analytics solution … ★34 · Updated last year
- dotML is a light-weight semantic layer written in Python. ★36 · Updated last year
- Graph Engine for Exploration and Search ★42 · Updated last year
- Data pipelines from re-usable components ★108 · Updated 2 years ago
- 🦆 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives. ★105 · Updated this week
- ★42 · Updated this week
- Playing with Python Bluesky SDK ★15 · Updated 7 months ago
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ★29 · Updated 2 years ago
- portable Python ML-powered data bot ★23 · Updated 8 months ago
- A collection of python utility functions ★11 · Updated 11 months ago
- ★39 · Updated 4 months ago
- Anomstack - Painless open source anomaly detection for your metrics ★102 · Updated last week
- quadipy is a python package to help transform structured data into RDF graph format ★19 · Updated 2 years ago