mk-fg / image-deduplication-tool
Tool to detect (and get rid of) similar images using perceptual hashing (pHash lib)
☆81 Updated 8 years ago
Related projects
Alternatives and complementary repositories for image-deduplication-tool
- Tool for managing data-deduplication within extant compressed archive files, along with a relatively performant BK tree implementation fo… ☆98 Updated last year
- (Note: This repository is obsolete; please see the new Browsertrix webrecorder/browsertrix) Browser-Based On-Demand Web Archiving Automat… ☆39 Updated 5 years ago
- Video fingerprinting tool. Finding duplicate movies in a large dataset. ☆43 Updated 11 years ago
- Attempt to use perceptual hash (pHash) to segment a video into "scenes" very quickly (normally under a minute for hour-long HD videos). ☆46 Updated last year
- Web archiving using Google Chrome ☆42 Updated 4 years ago
- Serving content from a WARC ☆60 Updated 11 years ago
- WarcMiddleware lets users seamlessly download a mirror copy of a website when running a web crawl with the Python web crawler Scrapy. ☆44 Updated 6 years ago
- Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system. ☆42 Updated 6 years ago
- Python module for indexing tar files for fast access ☆74 Updated 9 years ago
- JPEG Optimization ☆55 Updated 10 years ago
- Detecting near-duplicate videos by aggregating features from intermediate CNN layers ☆95 Updated 6 years ago
- Sort-friendly URI Reordering Transform (SURT) Python module ☆40 Updated 3 months ago
- 📂🛡️ Suite of tools for file fixity (data protection for long term storage⌛) using redundant error correcting codes, hash auditing and du… ☆133 Updated 2 months ago
- 🧠 AI powered image tagger backed by DeepDetect ☆243 Updated 6 years ago
- jpeginfo - prints information and tests integrity of JPEG/JFIF files ☆142 Updated last year
- Crops images using facial recognition from a webcam or a locally saved image ☆17 Updated 10 years ago
- Esper instance for TV news analysis ☆39 Updated last year
- 🏞🌉 Find and Delete Duplicate Images / Photos ☆84 Updated 2 months ago
- Grabbing all news. ☆62 Updated 4 years ago
- Define simple search patterns in bulk to perform advanced matching on any string ☆55 Updated 10 months ago
- A backup program that does deduplication, compression, encryption ☆28 Updated 3 years ago
- A Polymer-based video tagging control. ☆20 Updated 7 years ago
- A Memento Aggregator CLI and Server in Go ☆57 Updated 5 months ago
- Python library for reading and writing warc files ☆237 Updated 2 years ago
- Suite of tools for detecting changes in web pages and their rendering ☆53 Updated 10 months ago
- Tool to download thumbnails of files from Wikimedia Commons ☆28 Updated 3 years ago
- Perceptual Hash project for Videos (MMAI Term Project) ☆27 Updated 10 years ago
- A comparison of ffmpeg, Shotdetect and PySceneDetect for shot transition detection ☆117 Updated 6 years ago
- An HTTP Proxy that archives all intercepted traffic. ☆21 Updated 10 years ago
- Rewriting web proxy and archival tool. At this point, it just tries to download all the things. ☆199 Updated this week