Find near-duplicate documents using minhashing implemented in Go.
☆16 · Dec 22, 2015 · Updated 10 years ago
Alternatives and similar repositories for deduper
Users that are interested in deduper are comparing it to the libraries listed below
- Remove duplicate documents/videos/images via popular algorithms such as SimHash, SpotSig, Shingling, etc. ☆19 · Aug 28, 2023 · Updated 2 years ago
- Get a list of deduped files on a ZFS filesystem ☆13 · Oct 14, 2020 · Updated 5 years ago
- Easy handling of memory-mapped files ☆22 · Mar 28, 2014 · Updated 11 years ago
- Text classifier for Go, aka document categorization. ☆41 · Nov 27, 2015 · Updated 10 years ago
- Suggester - the heart for full-text auto-complete web services ☆29 · Jul 8, 2014 · Updated 11 years ago
- ☆26 · Nov 9, 2016 · Updated 9 years ago
- A simhasher for Chinese documents implemented by golang, simply translated from yanyiwu/gosimhash ☆17 · Nov 30, 2017 · Updated 8 years ago
- A high performance lock free map type for go. ☆20 · Apr 19, 2018 · Updated 7 years ago
- Blockhash perceptual-hash algorithm for images. Written in pure Go. ☆22 · Aug 4, 2020 · Updated 5 years ago
- Fuzzy text searching like Sublime Text ☆26 · Sep 10, 2015 · Updated 10 years ago
- Repository for the CLiPS HAte speech DEtection System [HADES]. ☆24 · Apr 5, 2018 · Updated 7 years ago
- Utilities for extracting and compressing tgz and zip files. ☆28 · Updated this week
- gzip indexer for random access into compressed files ☆30 · Jan 4, 2018 · Updated 8 years ago
- Self-organizing maps in Go ☆74 · May 28, 2022 · Updated 3 years ago
- ☆32 · Sep 21, 2017 · Updated 8 years ago
- [Android 11-13] Set a static IP for the mobile hotspot ☆10 · Mar 5, 2024 · Updated 2 years ago
- Disable Target API Block ☆26 · Oct 18, 2025 · Updated 4 months ago
- An implementation of Count-Min Sketch in Golang ☆37 · Dec 3, 2024 · Updated last year
- readahead is a package that provides readers that enable concurrent reads from seekable or compressed files ☆136 · Dec 22, 2016 · Updated 9 years ago
- C++ rewrite of PPPwn (PlayStation 4 PPPoE RCE) ☆10 · Feb 27, 2025 · Updated last year
- Stor2rrd Grafana monitoring ☆12 · Jan 8, 2019 · Updated 7 years ago
- golang package to provide lightweight internal pub/sub for goroutines ☆29 · Jan 23, 2014 · Updated 12 years ago
- Automatic .gif creation from Youtube videos! ☆56 · Dec 5, 2014 · Updated 11 years ago
- Generates a YouTube playlist from a list of URLs. ☆10 · Aug 14, 2023 · Updated 2 years ago
- Acode - powerful text/code editor for android ☆28 · Feb 19, 2026 · Updated 2 weeks ago
- vertical search crawler ☆38 · Jan 9, 2012 · Updated 14 years ago
- personal synchronization application - based on git ☆17 · Apr 6, 2012 · Updated 13 years ago
- Object Detection for Video Games! ☆12 · Jul 18, 2021 · Updated 4 years ago
- A Go library for specialized integer hash maps. ☆11 · Sep 15, 2016 · Updated 9 years ago
- This project contains simple methods to measure sample relatedness and identify potential swaps and contamination ☆10 · Jan 8, 2016 · Updated 10 years ago
- Gootool for Android ☆13 · Jul 21, 2023 · Updated 2 years ago
- ArchiveWeb.page Express! ☆14 · Nov 1, 2024 · Updated last year
- collection of modules to build distributed and reliable concurrent systems in Python. ☆206 · Sep 14, 2013 · Updated 12 years ago
- Dedup and compress your device mapper devices. Works especially well with thin provisioning. ☆10 · Dec 4, 2025 · Updated 3 months ago
- A simple shell script with wizard to get you OpenWRT for Proxmox. ☆11 · Oct 16, 2021 · Updated 4 years ago
- Python bindings for the NVML. Non-volatile memory for Python. ☆12 · May 23, 2016 · Updated 9 years ago
- ☆18 · Jul 23, 2016 · Updated 9 years ago
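- A Chinese chatbot that you can train yourself: train the chatbot you want on your own corpus. It can be used for intelligent customer service, online Q&A, smart chat, and similar scenarios. Currently includes seq2seq, seqGAN, and tf2.0 versions. ☆11 · Feb 10, 2021 · Updated 5 years ago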
- Reduce frustration, improve outcomes, and save money on Claude Code CLI, Gemini, and Codex ☆21 · Oct 31, 2025 · Updated 4 months ago