apicrafter / metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.
☆43 Updated 2 months ago
Related projects:
- Provide an easy way with Python to protect your data sources by searching its metadata. 🛡️ ☆17 Updated this week
- Registry of metadata identifier entities like UUID, GUID, person fullname, address and so on. Linked with other sources ☆16 Updated 9 months ago
- quadipy is a python package to help transform structured data into RDF graph format ☆18 Updated last year
- List of entity resolution software and resources. ☆31 Updated 6 months ago
- PyPi module for Graphlet AI Knowledge Graph Factory ☆28 Updated last year
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆102 Updated this week
- Playground for using large language models into the Modern Data Stack for entity matching ☆105 Updated last year
- dagster scikit-learn pipeline example. ☆43 Updated last year
- Pipeline definitions for managing data flows to power analytics at MIT Open Learning ☆36 Updated this week
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆51 Updated 3 weeks ago
- A tool to automatically infer columns data types in .csv files ☆33 Updated last year
- ☆28 Updated 9 months ago
- A collection of python utility functions ☆12 Updated 2 months ago
- Implementation of the Cypher language for searching NetworkX graphs ☆77 Updated 3 months ago
- Data Tools Subjective List ☆80 Updated last year
- Python package for deduplication/entity resolution using active learning ☆77 Updated 3 weeks ago
- Record matching and entity resolution at scale in Spark ☆31 Updated 10 months ago
- Anomstack - Painless open source anomaly detection for your metrics ☆86 Updated 5 months ago
- Ibis analytics, with Ibis (and more!) ☆19 Updated this week
- Graph Engine for Exploration and Search ☆39 Updated 7 months ago
- S3 vector database for LLM Agents and RAG. ☆28 Updated last year
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ☆70 Updated 10 months ago
- A curated list of dagster code snippets for data engineers ☆48 Updated 6 months ago
- ☆60 Updated last month
- Cymple - a productivity tool for creating Cypher queries in Python ☆44 Updated 2 months ago
- Write your dbt models using Ibis ☆47 Updated 4 months ago
- A Dagster plugin that allows you to run Meltano in Dagster ☆42 Updated 5 months ago
- A package to build an end-to-end pipeline for detecting personally identifiable information from text. ☆43 Updated 5 years ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆26 Updated 2 years ago
- portable Python ML-powered data bot ☆23 Updated 5 months ago