Detecting near duplicates using Moses Charikar's algorithm
☆20 · Oct 7, 2014 · Updated 11 years ago
Alternatives and similar repositories for charikars_algorithm
Users who are interested in charikars_algorithm are comparing it to the libraries listed below.
- regex powered yank+substitute ☆13 · Oct 23, 2017 · Updated 8 years ago
- ... just because nltk is too heavy ☆35 · Jul 21, 2010 · Updated 15 years ago
- Near-duplicate detection tool ☆24 · Nov 27, 2016 · Updated 9 years ago
- Python API for Various DB-Backed Simhash Clusters ☆64 · Mar 16, 2017 · Updated 8 years ago
- Automatic .gif creation from Youtube videos! ☆56 · Dec 5, 2014 · Updated 11 years ago
- collection of modules to build distributed and reliable concurrent systems in Python. ☆206 · Sep 14, 2013 · Updated 12 years ago
- A system to track the status of healthcare providers during and after a disaster. ☆13 · Mar 4, 2023 · Updated 2 years ago
- Python bindings for the NVML. Non-volatile memory for Python. ☆12 · May 23, 2016 · Updated 9 years ago
- vertical search crawler ☆38 · Jan 9, 2012 · Updated 14 years ago
- ☆18 · Jul 23, 2016 · Updated 9 years ago
- Brand disambiguator for tweets to differentiate e.g. Orange vs orange (brand vs foodstuff), using NLTK and scikit-learn ☆58 · Jul 11, 2013 · Updated 12 years ago
- Easing for linear videos or image sequences ☆14 · Aug 30, 2015 · Updated 10 years ago
- A Clojure library that determines the similarity of 2 or more sentences. ☆10 · Jan 25, 2016 · Updated 10 years ago
- Show common areas of bike accidents to help prevent future accidents ☆11 · Oct 18, 2017 · Updated 8 years ago
- Tools to make Supernote devices even more super ☆22 · Jan 12, 2026 · Updated last month
- Skip text in INSERT mode ☆11 · Nov 27, 2018 · Updated 7 years ago
- pygifme is a simple command line tool to generate animated GIFs. It is a python port from the original ruby script gifme created by [@ho… ☆25 · Aug 31, 2021 · Updated 4 years ago
- Ansible role for ntp ☆12 · Dec 4, 2025 · Updated 2 months ago
- Make music while you program. ☆43 · Apr 11, 2014 · Updated 11 years ago
- neuralpy - neural network library written in python ☆12 · Jun 25, 2023 · Updated 2 years ago
- Web based semantic visualization tool ☆12 · Feb 16, 2017 · Updated 9 years ago
- run multiple shell commands in parallel and coordinate their output ☆31 · Jul 5, 2012 · Updated 13 years ago
- Web content transformation proxies for open data APIs ☆16 · Dec 14, 2022 · Updated 3 years ago
- The CHIRP Radio Machine: management of our music library and broadcast stream ☆13 · Feb 20, 2026 · Updated last week
- Natural language hashing library. ☆10 · Nov 24, 2014 · Updated 11 years ago
- ☆14 · Aug 21, 2021 · Updated 4 years ago
- A simple Redis-based job queue written in C ☆34 · Dec 18, 2009 · Updated 16 years ago
- Machine Learning solution for Kaggle.com's "Partly Sunny with a Chance of Hashtags" ☆27 · Dec 6, 2013 · Updated 12 years ago
- A Persian Word2Vec Model trained by Wikipedia articles ☆10 · Jan 5, 2018 · Updated 8 years ago
- Socks5man is a Socks5 management tool and Python library ☆12 · Mar 10, 2023 · Updated 2 years ago
- The Tweets2013 Internet Archive collection ☆10 · Aug 7, 2020 · Updated 5 years ago
- Find, list, filter, and edit or otherwise act on files from the terminal using fuzzy-matching or regular expressions. ☆26 · May 27, 2018 · Updated 7 years ago
- 🤖📝 CLI tool using LLMs to apply a change to a list of files while considering the context of each file. ☆11 · Nov 28, 2024 · Updated last year
- Flexibly analyze text for profanity, racial slurs, and sexual words. ☆19 · Aug 19, 2011 · Updated 14 years ago
- Simple Flask webservice to search through your PDF collection using Whoosh ☆11 · Jul 11, 2014 · Updated 11 years ago
- A script for creating elixir image for phoenix framework. ☆10 · Dec 15, 2023 · Updated 2 years ago
- Lecture material (slides, script) for a distributed systems Bachelor class ☆12 · Jul 31, 2024 · Updated last year
- Python port of Boilerpipe library ☆16 · Apr 6, 2018 · Updated 7 years ago
- JSON processing command line tool based on JSONSelect (CSS-like selectors for JSON) ☆43 · Sep 28, 2015 · Updated 10 years ago