apicrafter / metacrafter
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Rules are fully customizable and flexible.
☆45 Updated last week
Alternatives and similar repositories for metacrafter
Users who are interested in metacrafter are comparing it to the repositories listed below.
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆65 Updated last week
- Scripts to make specific datasets cleaner and more convenient ☆42 Updated 3 years ago
- ODD Specification is a universal open standard for collecting metadata. ☆145 Updated last year
- ☆116 Updated 2 years ago
- A SQL parser ☆62 Updated this week
- Provide an easy way with Python to protect your data sources by searching its metadata. 🛡️ ☆18 Updated last month
- Opinionated JSON to CSV/XLSX/SQLITE/PARQUET converter. Flattens JSON fast. ☆204 Updated 6 months ago
- PyPi module for Graphlet AI Knowledge Graph Factory ☆33 Updated 2 years ago
- Data pipelines from re-usable components ☆107 Updated last month
- CLI to create an ER Diagram from DuckDB database files ☆144 Updated 9 months ago
- List of entity resolution software and resources. ☆103 Updated 10 months ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago
- Anomstack - Painless open source anomaly detection for your metrics 📈📉🚀 ☆107 Updated 2 months ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆178 Updated this week
- Ibis analytics, with Ibis (and more!) ☆23 Updated last year
- Python+VueJS application to load, explore, combine, transform and deliver data ☆102 Updated 10 months ago
- Mapping of DWH database tables to business entities, attributes & metrics in Python, with automatic creation of flattened tables ☆75 Updated 2 years ago
- ☆55 Updated last week
- dagster scikit-learn pipeline example. ☆46 Updated 2 years ago
- 🦖 A SQL-on-everything Query Engine you can execute over multiple databases and file formats. Query your data, where it lives. ☆110 Updated 2 weeks ago
- A small Python module containing quick utility functions for standard ETL processes. ☆37 Updated 3 weeks ago
- Resources for tackling record linkage / deduplication / data matching problems ☆126 Updated last year
- Swiple enables you to easily observe, understand, validate and improve the quality of your data ☆84 Updated this week
- A tool to automatically infer columns data types in .csv files ☆37 Updated 2 years ago
- Application and Python script to identify, remove, and/or recode personally identifiable information (PII) from field experiment datasets… ☆46 Updated this week
- ☆23 Updated last year
- An automation tool to refactor Jupyter Notebooks to Python modules, with code dependency analysis. ☆12 Updated 10 months ago
- A monorepo of many Rill example projects ☆47 Updated 3 weeks ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆114 Updated 2 months ago
- Toolkit for graph-relational data across space and time ☆118 Updated last year