SuperMinHash: A New Minwise Hashing Algorithm for Jaccard Similarity Estimation, Simhash and SimhashIndex
☆19 · Nov 18, 2022 · Updated 3 years ago
Alternatives and similar repositories for superminhash
Users that are interested in superminhash are comparing it to the libraries listed below.
- SuperMinHash: A New Minwise Hashing Algorithm for Jaccard Similarity Estimation ☆25 · Jan 1, 2018 · Updated 8 years ago
- Rust implementation of probminhash, superminhash and hyperloglog sketching algorithms ☆30 · Jan 22, 2026 · Updated 5 months ago
- Repository for GazeVisual performance evaluation software tools ☆10 · Jul 30, 2019 · Updated 6 years ago
- ☆13 · Aug 13, 2021 · Updated 4 years ago
- A Python Implementation of Simhash Algorithm ☆1,037 · Mar 24, 2022 · Updated 4 years ago
- A light, pure and convenient command-line dictionary that helps you focus on memorizing words. ☆15 · Aug 18, 2021 · Updated 4 years ago
- ☆13 · Dec 8, 2022 · Updated 3 years ago
- Hopfield Networks for unsupervised learning in Haskell ☆16 · Apr 13, 2014 · Updated 12 years ago
- Common Voice Generator using Speech Synthesizer ☆14 · Jul 28, 2021 · Updated 4 years ago
- Poincaré Embeddings for Learning Hierarchical Representations (https://arxiv.org/abs/1705.08039) in PyTorch ☆15 · Dec 20, 2017 · Updated 8 years ago
- Repository for lecture "Data-Driven Demand Learning and Dynamic Pricing Strategies in Competitive Markets" ☆12 · May 8, 2018 · Updated 8 years ago
- Random Data Generator for arbitrary data types ☆29 · Mar 23, 2023 · Updated 3 years ago
- Open Thai Wikipedia QA Dataset made by iApp Technology ☆14 · Feb 17, 2021 · Updated 5 years ago
- Generic modeling of object relations in OOP ☆14 · Jan 20, 2024 · Updated 2 years ago
- Visual Hash for matching copies of visually similar images. ☆16 · Mar 17, 2025 · Updated last year
- Naive Bayes classifier for detection of language and spelling correction ☆10 · Mar 2, 2020 · Updated 6 years ago
- A Benchmark Data Set for Community Question-Answering Research ☆41 · Jul 24, 2017 · Updated 8 years ago
- Highly specialized crate to parse and use `google/sentencepiece`'s precompiled_charsmap in `tokenizers` ☆23 · Jun 9, 2026 · Updated 3 weeks ago
- High Performance Java NoSQL Database & ORM ☆12 · May 22, 2026 · Updated last month
- ☆10 · Mar 15, 2021 · Updated 5 years ago
- MiniLM (BERT) embeddings from scratch ☆20 · Aug 14, 2025 · Updated 10 months ago
- Simple SVG box plots in React ☆10 · Sep 4, 2020 · Updated 5 years ago
- Parallel Universal Dependencies. ☆15 · May 6, 2026 · Updated last month
- ☆10 · Jun 22, 2020 · Updated 6 years ago
- Elevator is an open source, on-disk key-value store. Provides high-performance bulk read-write operations over very large datasets while … ☆70 · May 14, 2014 · Updated 12 years ago
- MIDict (Multi-Index Dict) can be indexed by any "keys" or "values", suitable as a bidirectional/inverse dict or a multi-key/multi-value d… ☆14 · May 19, 2016 · Updated 10 years ago
- A cloud native data mesh implementation ☆12 · Jan 15, 2021 · Updated 5 years ago
- RAG-Fusion implementation using Langchain, Weaviate and OpenAI ☆13 · Oct 31, 2023 · Updated 2 years ago
- Fast text chunking algorithms for Python ☆12 · Oct 7, 2020 · Updated 5 years ago
- Video library for Java on Linux ☆12 · Nov 18, 2019 · Updated 6 years ago
- Thai smart home corpus with "Gowajee" hotword ☆19 · Jul 30, 2023 · Updated 2 years ago
- ☆23 · Apr 29, 2025 · Updated last year
- An efficient simhash implementation for python ☆127 · Oct 25, 2019 · Updated 6 years ago
- Python API for Various DB-Backed Simhash Clusters ☆64 · Mar 16, 2017 · Updated 9 years ago
- BM25F demo with lucene using BlendedTermQuery and a custom similarity ☆14 · Oct 11, 2016 · Updated 9 years ago
- Poincare Embeddings for Word Vector Representations ☆18 · Oct 29, 2017 · Updated 8 years ago
- A real-time, node-based video effects compositor for the web built with HTML5, Javascript and WebGL ☆13 · Jan 2, 2021 · Updated 5 years ago
- ☆14 · Apr 13, 2026 · Updated 2 months ago
- Gradle plugin that wraps your JVM application to a new Docker image. ☆11 · Jun 17, 2026 · Updated 2 weeks ago