mendesk / image-ndd-lsh
Near-duplicate image detection using Locality Sensitive Hashing
☆76 Updated 4 years ago
Alternatives and similar repositories for image-ndd-lsh
Users who are interested in image-ndd-lsh are comparing it to the libraries listed below.
- Fast Near-Duplicate Image Search and Delete using pHash, t-SNE and KDTree. ☆161 Updated 2 years ago
- locality sensitive hashing (LSHASH) for Python3 ☆70 Updated 3 months ago
- Input text or image, get back matching image fashion results, using Jina, DocArray, and CLIP ☆50 Updated 2 years ago
- hnsw implemented by python ☆69 Updated 6 years ago
- PyTorch-IE: State-of-the-art Information Extraction in PyTorch ☆78 Updated 3 weeks ago
- MultiOCR, an interface that connects multiple open-source OCR and various Cloud OCR. ☆31 Updated 2 years ago
- N-gram keyword extraction using spaCy and pretrained language models ☆62 Updated 3 years ago
- Fast edit distance Python extension written in Cython/C++. Supports Levenshtein distance and Damerau Optimal String Alignment (OSA) dista… ☆24 Updated 2 months ago
- ☆28 Updated 2 years ago
- Source code and data for Like a Good Nearest Neighbor ☆30 Updated 7 months ago
- Simply, faster, sentence-transformers ☆143 Updated last year
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents ☆291 Updated 2 years ago
- Product Quantization k-Nearest Neighbors ☆20 Updated 4 years ago
- string embed for fast edit distance computation, codes for [Convolutional Embedding for Edit Distance (SIGIR 20)]. ☆61 Updated 2 years ago
- Framework to build your own reverse image search engine ☆83 Updated 5 years ago
- python library to perform Locality-Sensitive Hashing for faster nearest neighbors search in high dimensional data ☆19 Updated last year
- Efficient Trie-based regex unions for blacklist/whitelist filtering and one-pass mapping-based string replacing ☆74 Updated 2 months ago
- 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡ ☆84 Updated 9 months ago
- Python package to generate image embeddings with CLIP without PyTorch/TensorFlow ☆152 Updated 3 years ago
- This repository contains the code for the paper 'PARM: Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval' pu… ☆41 Updated 3 years ago
- The largest multilingual image-text classification dataset. It contains fashion products. ☆73 Updated 2 years ago
- Bi-encoder entity linking architecture ☆50 Updated 11 months ago
- A fast python implementation of the SimHash algorithm. ☆27 Updated 3 years ago
- A Streamlit component for annotating text by text selecting. ☆40 Updated last year
- The dataset contains 3 million attribute-value annotations across 1257 unique categories on 2.2 million cleaned Amazon product profiles. … ☆147 Updated 2 years ago
- We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datas… ☆80 Updated 2 years ago
- RaKUn 2.0 - A fast keyword detection algorithm ☆68 Updated 3 weeks ago
- Load What You Need: Smaller Multilingual Transformers for Pytorch and TensorFlow 2.0. ☆105 Updated 3 years ago
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆41 Updated 4 years ago
- ☆43 Updated 2 years ago