ekzhu / josie
Code and Benchmarks for JOSIE (SIGMOD 2019)
☆18 Updated last year
Alternatives and similar repositories for josie:
Users who are interested in josie are comparing it to the libraries listed below.
- Benchmark Datasets for Set Similarity Search ☆12 Updated 6 years ago
- LSH index for approximate set containment search ☆57 Updated 2 years ago
- A Jupyter notebook extension to centralize and manage data ☆14 Updated 2 years ago
- Graph Engine for Exploration and Search ☆40 Updated last year
- Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method ☆13 Updated last year
- Explaining Inference Queries with Bayesian Optimization ☆10 Updated 3 years ago
- ☆75 Updated last year
- ☆25 Updated 6 years ago
- D3L dataset discovery framework - an implementation of the ICDE 2020 paper with the same name: https://arxiv.org/pdf/2011.10427.pdf ☆20 Updated 3 years ago
- ☆19 Updated 3 years ago
- A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching … ☆84 Updated 2 months ago
- Code for extracting, parsing and annotating tables from GitTables (https://gittables.github.io). ☆42 Updated 3 years ago
- Characterization of relational table embeddings (VLDB 2024). ☆25 Updated 6 months ago
- Repository with an overview of the tutorial on Models and Practice of Neural Table Representations and up to date material for the hands-… ☆20 Updated last year
- Data-Centric What-If Analysis for Native Machine Learning Pipelines ☆15 Updated last year
- Implementation of the G-CORE graph query language on Spark ☆15 Updated 3 years ago
- simd enabled column imprints ☆11 Updated 6 years ago
- Benchmark study on KùzuDB, an embedded OLAP graph database, on an artificial social network dataset ☆32 Updated last month
- Succinct C++ ☆25 Updated 4 years ago
- Master's thesis project involving label-constrained reachability (LCR) Updated 3 years ago
- ☆11 Updated last year
- ☆11 Updated 3 years ago
- DuckDB is an in-process SQL OLAP Database Management System ☆42 Updated 2 weeks ago
- ☆42 Updated last year
- Sempala is a SPARQL-over-SQL approach to provide interactive-time SPARQL query processing on Hadoop. It stores RDF data in a columnar lay… ☆12 Updated 7 years ago
- A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies ☆19 Updated 2 years ago
- Python bindings for the fast integer compression library FastPFor. ☆57 Updated last year
- A conda-smithy repository for python-duckdb. ☆13 Updated 2 months ago
- state-of-the-art search over vector embeddings and structured data (SIGMOD '24) ☆64 Updated 7 months ago
- Parameterless and Universal FInding of Nearest Neighbors ☆57 Updated 8 months ago