erdogant / undouble
undouble is a Python package for detecting (near-)identical images.
☆50 Updated 2 months ago
Alternatives and similar repositories for undouble
Users who are interested in undouble are comparing it to the libraries listed below.
- Python package for deduplication/entity resolution using active learning ☆81 Updated 10 months ago
- Clean, filter and sample URLs to optimize data collection – Python & command-line – Deduplication, spam, content and language filters ☆142 Updated 6 months ago
- Python package to generate image embeddings with CLIP without PyTorch/TensorFlow ☆152 Updated 3 years ago
- Input text or image, get back matching image fashion results, using Jina, DocArray, and CLIP ☆50 Updated 2 years ago
- It's a cooler way to store simple linear models. ☆27 Updated last year
- Hybrid architecture media server, media service and Streamlit client app using FastAPI and Python ☆13 Updated 3 years ago
- MultiOCR, an interface that connects multiple open-source OCR and various Cloud OCR. ☆31 Updated last year
- Fast Near-Duplicate Image Search and Delete using pHash, t-SNE and KDTree. ☆159 Updated 2 years ago
- 🤝 Trade any tensors over the network ☆30 Updated last year
- A Jupyter notebook for visualizing a user's top artist / track data in Spotify. ☆10 Updated 3 years ago
- 🔤 Measure edit distance based on keyboard layout ☆60 Updated last year
- EDS-PDF is a generic, pure-Python framework for text extraction from PDF documents. It provides the machinery to use rule- or machine-lea… ☆51 Updated 5 months ago
- 🖍️ Highlight text in documents ☆109 Updated 2 months ago
- Deidentify people's names and gender specific pronouns ☆37 Updated 2 months ago
- Playing with Python Bluesky SDK ☆15 Updated 8 months ago
- A Jupyter widget for annotating images with bounding boxes ☆135 Updated 10 months ago
- OCR, Archive, Index and Search: Implementation agnostic OCR framework. ☆222 Updated last year
- 🚂 Fine-tune OpenAI models for text classification, question answering, and more ☆16 Updated 2 years ago
- Complex data extraction and orchestration framework designed for processing unstructured documents. It integrates AI-powered document pip… ☆70 Updated this week
- Blazing fast fuzzy text search for Python. ☆45 Updated 2 months ago
- An open-source package for python to clean raw text data ☆70 Updated last year
- ☆50 Updated 7 months ago
- An AI extension for IPython that makes it work like Cursor ☆67 Updated 6 months ago
- The largest multilingual image-text classification dataset. It contains fashion products. ☆72 Updated 2 years ago
- Extract knowledge from raw text ☆13 Updated 3 years ago
- A Python Perceptual Image Hashing Module ☆21 Updated 9 years ago
- Image captioning ready-to-go inference: show and tell model compatible with Tensorflow r1.9 ☆94 Updated 2 years ago
- Boolean text search in Python ☆45 Updated 3 weeks ago
- 💥 Use Hugging Face text and token classification pipelines directly in spaCy ☆63 Updated last year
- Effective frame sampling for ML applications. ☆20 Updated 2 months ago