OpenRefine / OpenRefine
OpenRefine is a free, open source power tool for working with messy data and improving it
☆11,070 Updated this week
Alternatives and similar repositories for OpenRefine:
Users who are interested in OpenRefine are comparing it to the repositories listed below
- The simplest, fastest way to get business intelligence and analytics to everyone in your company ☆40,196 Updated this week
- A web interface to create custom vector-based visualizations on top of RAWGraphs core ☆8,707 Updated last week
- Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, vis… ☆18,064 Updated this week
- CKAN is an open-source DMS (data management system) for powering data hubs and data portals. CKAN makes it easy to publish, share and use… ☆4,553 Updated this week
- Apache Airflow - A platform to programmatically author, schedule, and monitor workflows ☆38,573 Updated this week
- Jupyter Interactive Notebook ☆11,992 Updated this week
- The leader in Next-Generation Customer Data Infrastructure ☆6,878 Updated 5 months ago
- A suite of utilities for converting to and working with CSV, the king of tabular file formats. ☆6,080 Updated 5 months ago
- Parallel computing with task scheduling ☆12,906 Updated this week
- A visualization grammar. ☆11,349 Updated this week
- A curated list of awesome ETL frameworks, libraries, and software. ☆3,332 Updated 6 months ago
- Data Apps & Dashboards for Python. No JavaScript Required. ☆21,881 Updated this week
- The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lak… ☆17,127 Updated this week
- Tabula is a tool for liberating data tables trapped inside PDF files ☆6,891 Updated 4 months ago
- Free, open-source SQL client for Windows and Mac 🦅 ☆5,131 Updated last year
- Beaker Extensions for Jupyter Notebook ☆2,806 Updated last year
- Data-Centric Pipelines and Data Versioning ☆6,203 Updated this week
- Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics ☆14,950 Updated this week
- Actively curated list of awesome BI tools. PRs welcome! ☆2,122 Updated 5 months ago
- A python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution. ☆4,216 Updated 2 months ago
- Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more. ☆6,454 Updated 2 weeks ago
- The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF). ☆2,700 Updated this week
- TypeDB: the power of programming, in your database ☆3,916 Updated this week
- Web UI for PrestoDB. ☆2,752 Updated 3 years ago
- OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. Ori… ☆4,766 Updated this week
- Kepler.gl is a powerful open source geospatial analysis tool for large-scale data sets. ☆10,554 Updated this week
- Visual scraping for Scrapy ☆9,342 Updated 7 months ago
- Apache Parquet Java ☆2,704 Updated this week
- An open-source graph database ☆14,874 Updated last month
- AI + Data, online. https://vespa.ai ☆5,958 Updated this week