ekzhu / josie
Code and Benchmarks for JOSIE (SIGMOD 2019)
☆18Updated 2 years ago
Alternatives and similar repositories for josie:
Users who are interested in josie are comparing it to the libraries listed below.
- LSH index for approximate set containment search☆57Updated 2 years ago
- ☆77Updated 2 years ago
- Explaining Inference Queries with Bayesian Optimization☆10Updated 4 years ago
- Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method☆13Updated last year
- A Jupyter notebook extension to centralize and manage data☆14Updated 2 years ago
- Graph Engine for Exploration and Search☆40Updated last year
- ☆19Updated 3 years ago
- Data-Centric What-If Analysis for Native Machine Learning Pipelines☆16Updated last year
- A tool facilitating matching for any dataset discovery method. Also, an extensible experiment suite for state-of-the-art schema matching …☆88Updated last month
- A fast header-only graph-based index for approximate nearest neighbor search (ANNS). https://flatnav.net☆20Updated last week
- Project overview and links to various resources☆19Updated 3 years ago
- D3L dataset discovery framework - an implementation of the ICDE 2020 paper with the same name: https://arxiv.org/pdf/2011.10427.pdf☆20Updated 3 years ago
- ⚡ Faster vector search with PDX: A vertical data layout for vectors☆33Updated last week
- Inspect ML Pipelines in Python in the form of a DAG☆70Updated last year
- SIMD-enabled column imprints☆11Updated 7 years ago
- Parameterless and Universal FInding of Nearest Neighbors☆59Updated last month
- The BART Project: Benchmarking Algorithms for (data) Repairing and Translation☆41Updated last year
- DuckDB is an in-process SQL OLAP Database Management System☆43Updated last week
- Similarity join or search on Spark core directly☆27Updated 4 years ago
- Paper list about adopting machine learning techniques into data management tasks.☆37Updated 4 years ago
- The Llunatic Mapping and Cleaning Chase Engine☆36Updated last year
- Python bindings for the fast integer compression library FastPFor.☆58Updated last year
- Condor allows for the specification of synopsis-based streaming jobs on top of general dataflow systems. Condor provides a collection of …☆13Updated 10 months ago
- Master's thesis project involving label-constrained reachability (LCR) Updated 4 years ago
- FlexMatcher is a schema matching package in Python which handles the problem of matching multiple schemas to a single mediated schema.☆29Updated 4 months ago
- A System for (Optimized) Semantic Computation☆97Updated this week
- Code repo for "An Empirical Evaluation of Columnar Storage Formats" VLDB Vol 17☆54Updated 11 months ago
- ☆11Updated 4 years ago
- Labelled Subgraph Query Benchmark – A lightweight benchmark suite focusing on subgraph matching queries. Note: This is a microbenchmark f…☆31Updated last month
- ☆24Updated 3 years ago