peerside / awesome-data-wrangling
A curated list of data wrangling resources
☆39 · Updated 7 years ago
Alternatives and similar repositories for awesome-data-wrangling
Users who are interested in awesome-data-wrangling are comparing it to the libraries listed below
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data ☆39 · Updated 3 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆66 · Updated last week
- A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ☆141 · Updated 2 years ago
- Execute OpenRefine JSON scripts without OpenRefine (or Java) ☆31 · Updated 3 years ago
- CLI for creating databases for Data Quality Dashboards. ☆19 · Updated 6 years ago
- Python package to visualise SQL queries as graphs ☆51 · Updated 2 years ago
- DataHub.io awesome datasets - curated collections of high-quality datasets organized by topic ☆61 · Updated last year
- Techniques for Scraping the Web in Python ☆27 · Updated 7 years ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆29 · Updated 3 years ago
- CubETL - Framework and tool for data ETL (Extract, Transform and Load) in Python (PERSONAL PROJECT / SELDOM MAINTAINED) ☆28 · Updated 3 years ago
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali… ☆89 · Updated 6 years ago
- A set of scripts to pull metadata and data profiling metrics from relational database systems ☆77 · Updated last year
- A lightweight, standardized library accessing files and datasets, especially tabular ones (CSV, Excel). ☆75 · Updated 2 years ago
- The classic desktop version of osDQ ☆10 · Updated 3 years ago
- DataFlows is a simple, intuitive lightweight framework for building data processing flows in python. ☆222 · Updated 9 months ago
- The Open Data Editor (ODE) is a no-code application to explore and validate tabular data in a simple way. Forever free and open source pr… ☆299 · Updated 2 weeks ago
- Python based Wikidata framework for easy dataframe extraction ☆45 · Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆39 · Updated 5 months ago
- Open Supply Chains is the opensource codebase behind Sourcemap that allows anyone to visualize and analyze supply chains. It does this pr… ☆31 · Updated 5 years ago
- Data validation as a service. Project retired; go to the current one at frictionsless/repository ☆69 · Updated 3 years ago
- Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & N… ☆277 · Updated 3 years ago
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 · Updated 4 years ago
- The OpenRefine Python Client from Paul Makepeace provides a library for communicating with an OpenRefine server. This fork extends the co… ☆86 · Updated 4 years ago
- data wrangling simplicity, complete audit transparency, and at speed ☆35 · Updated 4 months ago
- Framework for processing data packages in pipelines of modular components. ☆123 · Updated 7 months ago
- KNOTS is an intuitive desktop application built to simplify the configuration of Singer pipelines ☆67 · Updated 3 years ago
- Data presentation framework for Python that generates static sites from extended Markdown with interactive charts, tables, scripts, and o… ☆99 · Updated last year
- TinyOlap is a light-weight, in-process, in-memory, multi-dimensional, model-first OLAP engine for planning, budgeting, reporting, analysi… ☆51 · Updated 3 years ago
- A web application for data exploration, machine learning and statistical analysis, model construction and meta analysis tools, that integ… ☆25 · Updated 4 years ago
- Awesome list of the software tools related to opendata: data catalogs, ingestion tools, data prep tools and so on ☆35 · Updated 3 months ago