LSHDB is a parallel and distributed data engine that relies on Locality-Sensitive Hashing and NoSQL systems to perform record linkage (including privacy-preserving record linkage) and similarity search tasks.
☆31Aug 30, 2022Updated 3 years ago
Alternatives and similar repositories for LSHDB
Users who are interested in LSHDB are comparing it to the libraries listed below
- Perform Bayesian record linkage with a one-to-one matching assumption.☆11Jul 9, 2020Updated 5 years ago
- Implementation of the paper "Deep Indexed Active Learning for Matching Heterogeneous Entity Representations"☆17Dec 20, 2021Updated 4 years ago
- ☆23Dec 4, 2023Updated 2 years ago
- Ubiflux Vigor ventilation system RS485 Modbus communications with Python☆11Feb 20, 2026Updated last week
- ☆11Dec 12, 2025Updated 2 months ago
- "Actionable Ethics for Data Scientists" Workshop Material @ ODSC☆10May 31, 2024Updated last year
- Collect and filter location information from social network services.☆11Jun 14, 2020Updated 5 years ago
- ClusterTech Parallel Filesystem☆12May 18, 2018Updated 7 years ago
- The source code for my PyCon 2017 talk "5 ways to deploy your Python web app in 2017"☆10May 19, 2017Updated 8 years ago
- Galaxy is a lightweight software deployment and management tool. We use it at Ning to manage the Java cores and Apache httpd instances th…☆21Sep 11, 2011Updated 14 years ago
- A powerful Unity ECS system to render massive numbers of animated sprites.☆11Sep 22, 2019Updated 6 years ago
- Open-source free TypeScript library to implement SMART Health Cards and Links☆24Dec 18, 2025Updated 2 months ago
- My dotfiles☆14Feb 5, 2021Updated 5 years ago
- A simple regular expression engine!☆10Apr 9, 2017Updated 8 years ago
- Hungarian tokenizer.☆14Mar 15, 2022Updated 3 years ago
- A JMM Cookbook for Java Developers (as opposed to a cookbook for Compiler Writers)☆12Jun 13, 2014Updated 11 years ago
- Real-time time series prediction library with standalone server☆40Jun 7, 2021Updated 4 years ago
- ACL Rolling Review website☆11Updated this week
- PhipsBoot is a relocatable x86_64 bootloader for legacy boot written in Rust and assembly.☆14Mar 2, 2025Updated 11 months ago
- Code repository for R Data Mining Blueprints, published by Packt☆10Jan 14, 2021Updated 5 years ago
- Latr: Lazy Translation Coherence - ASPLOS'18☆16Nov 15, 2021Updated 4 years ago
- This project is the implementation of Li-Roth paper "Learning Question Classifiers" on TREC dataset☆12Mar 7, 2017Updated 8 years ago
- xonsh readable traceback☆12Sep 6, 2022Updated 3 years ago
- A performance test comparing Scala versus Erlang with simple agents to determine messaging performance☆24Feb 15, 2013Updated 13 years ago
- The toolkit called magyarlanc aims at the basic linguistic processing of Hungarian texts. The toolkit consists of only JAVA modules (the…☆14Jun 21, 2016Updated 9 years ago
- Erlang VoltDB server interface☆40Apr 9, 2013Updated 12 years ago
- an informative progress bar for Python 2+3 command-line tools☆12Nov 18, 2019Updated 6 years ago
- A rangeset utility for python☆14Dec 12, 2023Updated 2 years ago
- ☆13Aug 6, 2019Updated 6 years ago
- Jupyter notebooks - A tool to write and share executable notebooks and data visualization☆10Feb 5, 2026Updated 3 weeks ago
- Home Assistant custom component for Pollen Information in Hungary☆15Jul 17, 2024Updated last year
- Install python dependencies automatically at runtime☆13Feb 16, 2016Updated 10 years ago
- Stream processing engine☆13Apr 7, 2021Updated 4 years ago
- ☆11Dec 22, 2022Updated 3 years ago
- atyimo: probabilistic record linkage for massive administrative datasets☆10Jan 23, 2019Updated 7 years ago
- A JavaFX implementation of a JSON data driven GUI.☆11May 9, 2017Updated 8 years ago
- ☆32Jan 25, 2026Updated last month
- Docker Shell CLI plugin☆10Feb 6, 2026Updated 3 weeks ago
- Combining encoder-based language models☆11Nov 11, 2021Updated 4 years ago