peerside / awesome-data-wrangling
A curated list of data wrangling resources
☆39 Updated 6 years ago
Alternatives and similar repositories for awesome-data-wrangling
Users who are interested in awesome-data-wrangling are comparing it to the libraries listed below.
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data ☆33 Updated 3 years ago
- A visual data pipeline builder with various backends ☆104 Updated last week
- CLI for creating databases for Data Quality Dashboards. ☆19 Updated 5 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆61 Updated this week
- A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ☆141 Updated 2 years ago
- A lightweight, standardized library for accessing files and datasets, especially tabular ones (CSV, Excel). ☆73 Updated 2 years ago
- A Singer tap for extracting data from the GitHub API ☆74 Updated 2 weeks ago
- Python-based Wikidata framework for easy dataframe extraction ☆45 Updated last year
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆28 Updated 3 years ago
- The classic desktop version of osDQ ☆10 Updated 3 years ago
- Named-Entity Recognition extension for OpenRefine ☆29 Updated 2 years ago
- A set of scripts to pull metadata and data profiling metrics from relational database systems ☆77 Updated last year
- A maximum-strength name parser for record linkage. ☆38 Updated last month
- Techniques for Scraping the Web in Python ☆25 Updated 7 years ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 Updated 3 years ago
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali… ☆85 Updated 5 years ago
- Python package to visualise SQL queries as graphs ☆48 Updated last year
- TinyOlap is a light-weight, in-process, in-memory, multi-dimensional, model-first OLAP engine for planning, budgeting, reporting, analysi… ☆48 Updated 3 years ago
- The OpenRefine Python Client from Paul Makepeace provides a library for communicating with an OpenRefine server. This fork extends the co… ☆85 Updated 3 years ago
- KNOTS is an intuitive desktop application built to simplify the configuration of Singer pipelines ☆67 Updated 2 years ago
- Data Quality and Observability platform for the whole data lifecycle, from profiling new data sources to full automation with Data Observ… ☆160 Updated 2 weeks ago
- Centralized whale instance using GitHub Actions, sourcing metadata from bigquery-public-data. ☆17 Updated last year
- A modern relational spreadsheet ☆51 Updated 2 years ago
- A monorepo of many Rill example projects ☆42 Updated this week
- A web application for data exploration, machine learning and statistical analysis, model construction and meta analysis tools, that integ… ☆25 Updated 4 years ago
- Data models for HubSpot built using dbt. ☆35 Updated last week
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ☆22 Updated 4 years ago
- Ricgraph - Research in context graph ☆30 Updated last month
- Secure Enterprise Master Patient Index ☆30 Updated 3 weeks ago
- A Python client library for the Stitch Import API ☆42 Updated last year