tdoehmen / gitschemas
☆9 · Updated last year
Related projects:
- Code for extracting, parsing and annotating tables from GitTables (https://gittables.github.io). ☆40 · Updated 2 years ago
- Graph Engine for Exploration and Search ☆39 · Updated 7 months ago
- A Jupyter notebook extension to centralize and manage data ☆14 · Updated last year
- DuckDB is an in-process SQL OLAP Database Management System ☆38 · Updated last week
- Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method ☆13 · Updated 8 months ago
- ☆19 · Updated last year
- Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning ☆47 · Updated last year
- Benchmark study on KùzuDB, an embedded OLAP graph database, on an artificial social network dataset ☆24 · Updated last month
- Browser-based user interface for Kùzu graph database ☆29 · Updated 3 weeks ago
- DuckDB extension that adds support for SQL/PGQ ☆60 · Updated last week
- ☆15 · Updated 2 years ago
- Project overview and links to various resources ☆17 · Updated 2 years ago
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆15 · Updated last year
- Explaining Inference Queries with Bayesian Optimization ☆10 · Updated 3 years ago
- Code and Benchmarks for JOSIE (SIGMOD 2019) ☆18 · Updated last year
- A systematic benchmark of Spark-SQL performance for processing vast RDF datasets ☆14 · Updated 2 years ago
- ☆25 · Updated this week
- Sample code to accompany blog post showcasing Arrow Flight SQL running on DuckDB ☆28 · Updated last year
- Inspect ML Pipelines in Python in the form of a DAG ☆68 · Updated 6 months ago
- High-performance data retrieval from Neo4j with Apache Arrow 🏹 ☆31 · Updated 2 years ago
- FlexMatcher is a schema matching package in Python which handles the problem of matching multiple schemas to a single mediated schema. ☆31 · Updated this week
- Ibis Substrait Compiler ☆92 · Updated this week
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆36 · Updated 3 years ago
- Characterization of relational table embeddings (VLDB 2024). ☆22 · Updated 2 months ago
- Record matching and entity resolution at scale in Spark ☆31 · Updated 10 months ago
- quadipy is a Python package to help transform structured data into RDF graph format ☆18 · Updated last year
- Pattern-based table discovery in Open Data CSV files ☆19 · Updated last year
- An open-source library that leverages Python’s data science ecosystem to build powerful end-to-end Entity Resolution workflows. ☆69 · Updated 3 weeks ago
- Data-Centric What-If Analysis for Native Machine Learning Pipelines ☆15 · Updated last year
- ☆28 · Updated 2 years ago