peerside / awesome-data-wrangling
A curated list of data wrangling resources
☆38 · Updated 6 years ago
Alternatives and similar repositories for awesome-data-wrangling
Users who are interested in awesome-data-wrangling are comparing it to the libraries listed below.
- A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ☆141 · Updated 2 years ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆59 · Updated this week
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data ☆33 · Updated 3 years ago
- A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable fro… ☆28 · Updated 3 years ago
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆58 · Updated 3 years ago
- A monorepo of many Rill example projects ☆40 · Updated 2 weeks ago
- CLI for creating databases for Data Quality Dashboards. ☆19 · Updated 5 years ago
- Techniques for Scraping the Web in Python ☆25 · Updated 7 years ago
- Awesome Orchest projects, both official and submitted by the community. ☆25 · Updated last year
- dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases. ☆57 · Updated 3 years ago
- A browser-based Parquet file viewer ☆46 · Updated 3 weeks ago
- Execute OpenRefine JSON scripts without OpenRefine (or Java) ☆30 · Updated 2 years ago
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ☆21 · Updated 4 years ago
- ☆41 · Updated 5 months ago
- Python based Wikidata framework for easy dataframe extraction ☆45 · Updated last year
- Open Semantic Visual Linked Data Graph Explorer: Open Source tool (web app) and user interface (UI) for discovery, exploration and visuali… ☆84 · Updated 5 years ago
- A curated list of Dagster code snippets for data engineers ☆56 · Updated last year
- A Singer tap for extracting data from the GitHub API ☆74 · Updated this week
- New skills taxonomy using TextKernel data ☆33 · Updated 2 years ago
- This repository contains example implementations for KNIME Analytics Platform. ☆18 · Updated this week
- Scrape various open data directories to create an index of what's available out there ☆37 · Updated 5 months ago
- GraphiPy: Universal Social Data Extractor ☆84 · Updated 2 years ago
- A maximum-strength name parser for record linkage. ☆37 · Updated last month
- Entity resolution, also known as data matching or record linkage, is the task of finding records in a data set that refer to the same or similar real… ☆24 · Updated 3 months ago
- A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in … ☆22 · Updated 2 years ago
- API for OpenSanctions with support for entity search and bulk matching of data collections. Supports Reconciliation API spec. ☆87 · Updated this week
- The Data Explorer is nteract's automatic visualization tool. ☆108 · Updated 2 years ago
- The code for the Sales Dashboard demo ☆16 · Updated 2 months ago
- KNIME Python Integration ☆79 · Updated this week
- Building 3D Trusted Data Pipelines With Dagster, dbt, and DuckDB ☆21 · Updated last year