peerside / awesome-data-wrangling
A curated list of data wrangling resources
☆39 Updated 7 years ago
Alternatives and similar repositories for awesome-data-wrangling
Users who are interested in awesome-data-wrangling are comparing it to the libraries listed below
- A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ☆141 Updated 2 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆66 Updated 2 weeks ago
- A visual data pipeline builder with various backends ☆107 Updated this week
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data ☆39 Updated 3 years ago
- The Taxonomy for ETL Automation Metadata (TEAM) is a tool for design metadata management geared towards data warehouse automation. It is … ☆37 Updated 11 months ago
- CLI for creating databases for Data Quality Dashboards. ☆19 Updated 6 years ago
- Centralized whale instance using github actions, sourcing metadata from bigquery-public-data. ☆17 Updated last year
- A set of scripts to pull metadata and data profiling metrics from relational database systems ☆77 Updated last year
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆29 Updated 3 years ago
- The Data Explorer is nteract's automatic visualization tool. ☆107 Updated 3 years ago
- Data lineage tools in Python ☆47 Updated last year
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ☆22 Updated 4 years ago
- Hephaestus - ETL and ML tools for OHDSI - OMOP CDM ☆13 Updated 4 months ago
- Techniques for Scraping the Web in Python ☆27 Updated 7 years ago
- Python bindings for the Stardog Knowledge Graph platform ☆41 Updated 3 months ago
- A monorepo of many Rill example projects ☆47 Updated last week
- ☆44 Updated last month
- DataFlows is a simple, intuitive, lightweight framework for building data processing flows in Python. ☆222 Updated 8 months ago
- CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED) ☆28 Updated 3 years ago
- A lightweight, standardized library accessing files and datasets, especially tabular ones (CSV, Excel). ☆75 Updated 2 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆179 Updated 3 weeks ago
- UMLS Terminology Server ☆19 Updated this week
- A maximum-strength name parser for record linkage. ☆39 Updated 4 months ago
- TinyOlap is a light-weight, in-process, in-memory, multi-dimensional, model-first OLAP engine for planning, budgeting, reporting, analysi… ☆51 Updated 3 years ago
- ☆11 Updated 4 years ago
- Python package to visualise SQL queries as graphs ☆50 Updated 2 years ago
- Postgres utility package for dbt (getdbt.com) ☆19 Updated last month
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago
- Python based Wikidata framework for easy dataframe extraction ☆45 Updated 2 years ago
- A web application for data exploration, machine learning and statistical analysis, model construction and meta analysis tools, that integ… ☆25 Updated 4 years ago