peerside / awesome-data-wrangling
A curated list of data wrangling resources
☆35 Updated 6 years ago
Alternatives and similar repositories for awesome-data-wrangling:
Users interested in awesome-data-wrangling are comparing it to the repositories listed below:
- Centralized whale instance using GitHub Actions, sourcing metadata from bigquery-public-data. ☆17 Updated 9 months ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆27 Updated 2 years ago
- Awesome Orchest projects, both official and submitted by the community. ☆25 Updated last year
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆32 Updated 2 years ago
- 🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ☆141 Updated last year
- CLI for creating databases for Data Quality Dashboards. ☆19 Updated 5 years ago
- A maximum-strength name parser for record linkage. ☆36 Updated last month
- Portable Python ML-powered data bot ☆23 Updated 6 months ago
- This repo contains code to get you started creating R and Python-based custom functions in Google Sheets. ☆19 Updated 4 years ago
- A tool to generate PySpark schema from JSON. ☆28 Updated last year
- A monorepo of many Rill example projects ☆35 Updated this week
- This repository contains example implementations for KNIME Analytics Platform. ☆17 Updated 3 months ago
- 📙 Notebooks Academy: Write Production-Ready Code From Jupyter. ☆13 Updated 2 years ago
- This is a compilation of Data Governance resources, examples, models and communities ☆12 Updated 5 years ago
- ☆47 Updated last year
- ☆37 Updated last month
- The classic desktop version of osDQ ☆10 Updated 2 years ago
- Runnable e-commerce mini data warehouse based on Python, PostgreSQL & Metabase, template for new projects ☆29 Updated 3 years ago
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆11 Updated 10 months ago
- Configuration and schema sync for Metabase from Python ☆19 Updated 2 years ago
- ☆21 Updated this week
- A browser user interface for manual labeling of record pairs. ☆45 Updated last year
- This repository is a production dbt pipeline example that models the profitability of an e-commerce business. Data is extracted and loaded… ☆21 Updated 9 months ago
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆57 Updated 3 years ago
- ☆69 Updated last month
- bamboolib - template for creating your own binder notebook ☆21 Updated 3 years ago
- Data models for Hubspot built using dbt. ☆35 Updated 3 weeks ago
- The Taxonomy for ETL Automation Metadata (TEAM) is a tool for design metadata management geared towards data warehouse automation. It is … ☆36 Updated last month
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆134 Updated 2 months ago