apicrafter / metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, language-specific identifiers. Fully customizable and flexible rules.
☆45Updated 2 months ago
Alternatives and similar repositories for metacrafter
Users who are interested in metacrafter are comparing it to the libraries listed below.
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution.☆63Updated last week
- Python+VueJS application to load, explore, combine, transform and deliver data☆97Updated 7 months ago
- Python Data Anonymization & Masking Library For Data Science Tasks☆274Updated 2 years ago
- PyPI module for Graphlet AI Knowledge Graph Factory☆30Updated 2 years ago
- List of entity resolution software and resources.☆88Updated 7 months ago
- Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets…☆46Updated 2 weeks ago
- CLI to create an ER Diagram from DuckDB database files☆135Updated 7 months ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆114Updated last year
- Swiple enables you to easily observe, understand, validate and improve the quality of your data☆84Updated this week
- Data pipelines from re-usable components☆107Updated 2 years ago
- Anomstack - Painless open source anomaly detection for your metrics 📈📉🚀☆106Updated 2 weeks ago
- Embedded MonetDB with a Python frontend and fast Numpy/Pandas support☆63Updated last year
- Provides an easy way with Python to protect your data sources by searching their metadata. 🛡️☆18Updated last week
- dotML is a light-weight semantic layer written in Python.☆39Updated last year
- Ibis analytics, with Ibis (and more!)☆22Updated last year
- Playground for using large language models in the Modern Data Stack for entity matching☆108Updated 2 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ…☆167Updated last month
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).☆120Updated 3 weeks ago
- 🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives.☆108Updated last week
- This repo contains information about DuckDB extensions found on GitHub. Refreshed daily☆102Updated this week
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them☆26Updated last year
- Linear regression in SQL using dbt☆75Updated 9 months ago
- Playing with Python Bluesky SDK☆15Updated 10 months ago
- Scripts to make specific datasets cleaner and more convenient☆42Updated 2 years ago
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables☆74Updated last year
- Data Tools Subjective List☆86Updated 2 years ago
- Opinionated JSON to CSV/XLSX/SQLITE/PARQUET converter. Flattens JSON fast.☆198Updated 3 months ago
- A playground for running DuckDB as a stateless query engine over a data lake☆211Updated last year
- Write Python locally, execute SQL in your data warehouse☆269Updated 3 years ago
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊☆79Updated last year