mk-fg / image-deduplication-tool
Tool to detect (and get rid of) similar images using perceptual hashing (pHash lib)
☆82 Updated 8 years ago
Alternatives and similar repositories for image-deduplication-tool
Users who are interested in image-deduplication-tool are comparing it to the repositories listed below:
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆47 Updated 7 years ago
- Short script for removing watermarks from PDF files. Requires pdftk. ☆59 Updated 6 years ago
- Google Chrome Extension. Record All Browsing in Screenshots & Full Text. Search For Anything At Any Time. Never Forget Where You Read Som… ☆308 Updated 7 years ago
- Implementation of perceptual image hash calculation in Python ☆132 Updated last year
- A Python Perceptual Image Hashing Module ☆213 Updated 2 years ago
- Tool for managing data-deduplication within extant compressed archive files, along with a relatively performant BK tree implementation fo… ☆102 Updated last year
- Serving content from a WARC ☆62 Updated 12 years ago
- A wrapper for tesseract / abbyyOCR11 ocr4linux finereader cli that can perform batch operations or monitor a directory and launch an OCR … ☆65 Updated last year
- A simple Python wrapper for the archive.is capturing service ☆203 Updated 5 months ago
- An eBook tool that extracts the ISBN or metadata from eBooks and renames them using an ISBN database and the metadata ☆30 Updated 10 years ago
- Unified CLI for various SaaS image classification APIs. ☆40 Updated 7 years ago
- Check out https://github.com/webrecorder/webrecorder for newer version matching https://webrecorder.io ☆38 Updated 9 years ago
- 🧠 AI-powered image tagger backed by DeepDetect ☆247 Updated 6 years ago
- Scrapy middleware that allows crawling only new content ☆80 Updated 2 years ago
- Scrapy YouTube watch history spider. Because YouTube didn't have a history search. ☆155 Updated 2 years ago
- Scraping assistant tool. Editing and maintaining CSS/XPath selectors across webpages. ☆104 Updated 7 years ago
- Rename images using deep learning ☆153 Updated 2 years ago
- A Python library for the Tiny Tiny RSS web API ☆56 Updated 4 years ago
- Tag-based bookmark manager inspired by delicious and Pinboard ☆34 Updated 2 years ago
- Automatic video summaries ☆264 Updated 7 years ago
- Convert text documents to high fidelity audio(books). ☆205 Updated 5 years ago
- A command-line web scraping tool ☆151 Updated 2 years ago
- A Python script to download books from libgen.io ☆75 Updated 6 years ago
- Esper instance for TV news analysis ☆40 Updated 2 years ago
- WARC writing MITM HTTP/S proxy ☆415 Updated this week
- DIY Atom feeds in times of social media and paywalls ☆84 Updated last year
- Automatically extracts and normalizes an online article or blog post publication date ☆117 Updated last year
- A Python library that provides an API to search for and get links to books, magazines, comics, ... from Library Genesis. ☆121 Updated 3 years ago
- A Python autocompletion library. Easycomplete has a simple API and utilizes Google's autocomplete results & the English dictionary for no… ☆40 Updated 11 years ago
- A company/project name generator for Python. Uses NLTK and diverse techniques derived from existing corporate etymologies and naming agen… ☆50 Updated 8 years ago