darenasc / aeda
Build a data catalog by running a single line of code
☆16Updated 2 months ago
Related projects
Alternatives and complementary repositories for aeda
- ☆29Updated 11 months ago
- Comparing Polars to Pandas and a small introduction☆43Updated 3 years ago
- Blog post on ETL pipelines with Airflow☆23Updated 4 years ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro…☆27Updated 2 years ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to…☆26Updated this week
- A scikit-learn compatible estimator based on business-rules with interactive dashboard included☆28Updated 3 years ago
- Build your feature store with macros right within your dbt repository☆37Updated last year
- Evaluation Matrix for Change Data Capture☆23Updated 3 months ago
- 🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎☆16Updated 2 years ago
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them☆26Updated 8 months ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration☆34Updated 4 years ago
- Talk "Beyond pandas: The great Python dataframe showdown"☆37Updated 2 years ago
- Building an API with the FastAPI framework to serve a scikit-learn model.☆18Updated 5 years ago
- ☕⛵WIP PySpark dependency management☆22Updated 6 years ago
- Automated Jupyter notebook testing. 📙☆41Updated 9 months ago
- Record matching and entity resolution at scale in Spark☆31Updated last year
- A maximum-strength name parser for record linkage.☆34Updated 3 months ago
- A utility for labeling clusters of text data.☆28Updated 3 years ago
- A utility tool to automate certain tasks with Jupyter notebooks.☆9Updated 8 months ago
- ☆12Updated last year
- A small Python module containing quick utility functions for standard ETL processes.☆33Updated last week
- Building 3D Trusted Data Pipelines With Dagster, Dbt, and Duckdb☆18Updated last year
- How to do data science with Optimus, Spark and Python.☆18Updated 5 years ago
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph☆21Updated 3 years ago
- A markdown wiki and dashboarding system for Datasette☆21Updated 3 years ago
- This repo contains the draft, images, and code for the Medium blog post on Altair themes.☆12Updated 6 years ago
- A streamlit component to embed Disqus in your applications.☆11Updated 3 years ago
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀☆27Updated 2 years ago
- Dask integration for Snowflake☆30Updated last week