apicrafter / metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Rules are fully customizable and flexible.
★44 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for metacrafter
- Provide an easy way with Python to protect your data sources by searching its metadata. ★17 · Updated 3 weeks ago
- An experimental Athena extension for DuckDB ★50 · Updated 9 months ago
- Ibis analytics, with Ibis (and more!) ★19 · Updated last month
- Registry of metadata identifier entities like UUID, GUID, person fullname, address and so on. Linked with other sources ★16 · Updated 11 months ago
- PyPi module for Graphlet AI Knowledge Graph Factory ★28 · Updated last year
- List of entity resolution software and resources. ★38 · Updated 8 months ago
- quadipy is a python package to help transform structured data into RDF graph format ★18 · Updated last year
- ★21 · Updated 3 months ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ★43 · Updated 5 years ago
- A collection of python utility functions ★12 · Updated 4 months ago
- dagster scikit-learn pipeline example. ★43 · Updated last year
- The Modern Data Stack in a Python package ★49 · Updated 11 months ago
- Sord Data Fabric: A Vue 3 frontend with a Python WebSocket server, leveraging a distributed architecture with DeltaLake and DuckDB worker… ★18 · Updated 11 months ago
- Playing with Python Bluesky SDK ★13 · Updated this week
- undatum: a command-line tool for data processing. Brings CSV simplicity to JSON lines and BSON ★48 · Updated 2 months ago
- Vinum is a SQL processor for Python, designed for data analysis workflows and in-memory analytics. ★65 · Updated 3 years ago
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ★71 · Updated last year
- Implementation of the Cypher language for searching NetworkX graphs ★83 · Updated this week
- This repo contains information about DuckDB extensions found on GitHub. Refreshed daily ★82 · Updated this week
- Data Tools Subjective List ★80 · Updated last year
- DuckDB Community Extension to prompt LLMs from SQL ★22 · Updated last week
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ★111 · Updated last week
- A SQL parser ★55 · Updated 2 months ago
- Scripts to make specific datasets cleaner and more convenient ★40 · Updated last year
- Linear regression in SQL using dbt ★66 · Updated last month
- Write your dbt models using Ibis ★53 · Updated last month
- FlockMTL: DuckDB extension to seamlessly combine analytics and semantic analysis using language models (LMs) ★67 · Updated this week
- ★63 · Updated 3 months ago
- Graph Engine for Exploration and Search ★40 · Updated 9 months ago
- A write-audit-publish implementation on a data lake without the JVM ★41 · Updated 3 months ago