trying shingling / resemblance / simhash / sketching to do some data deduping
☆97 · Aug 21, 2015 · Updated 10 years ago
Alternatives and similar repositories for resemblance
Users that are interested in resemblance are comparing it to the libraries listed below.
- S3 like FileSystem based on Cassandra ☆33 · May 14, 2010 · Updated 15 years ago
- A toy school project intended to be an approximate clone of Google's Megastore database for geographically-distributed scalable fault-to… ☆35 · Oct 12, 2011 · Updated 14 years ago
- NEW: see http://www.hops.io/. OLD: This work aims to re-engineer the Hadoop Distributed File System (HDFS) so that it can be 1) highly av… ☆26 · Jan 2, 2012 · Updated 14 years ago
- Service application for storing and invoking delayed HTTP callbacks ☆38 · Mar 24, 2012 · Updated 14 years ago
- Lightning fast URL routing in Python (radix-trie router) ☆23 · Jul 16, 2022 · Updated 3 years ago
- Ruby client library for controlling Google Refine ☆44 · May 4, 2018 · Updated 7 years ago
- ☆15 · Mar 7, 2018 · Updated 8 years ago
- A command-line benchmarking tool to measure the startup times of programs in various languages ☆14 · Oct 17, 2020 · Updated 5 years ago
- Browser Map-Reduce: distributed word count example ☆33 · Jun 3, 2011 · Updated 14 years ago
- Fast and tasty server side cookies handling ☆10 · Feb 26, 2024 · Updated 2 years ago
- Ruby interface to sendfile(2) system call ☆33 · Mar 4, 2026 · Updated last month
- Unobtrusive realtime mouse-tracking analytics for node.js ☆18 · Nov 22, 2015 · Updated 10 years ago
- Dazzling is a project website generator based on Gatsby and React that's simple, quick, and extensible. ☆10 · Sep 11, 2018 · Updated 7 years ago
- DIY outdoor solar harvester. ☆23 · Feb 18, 2020 · Updated 6 years ago
- ☆11 · Mar 30, 2016 · Updated 10 years ago
- A bunch of distance measures ☆31 · Feb 6, 2016 · Updated 10 years ago
- An Efficient Distributed File/Blob/Key-Value Store for Billions of Small Files ☆121 · Feb 20, 2013 · Updated 13 years ago
- The fat module is Python 3.6 extension module (written in C) implementing fast guards for specialized functions ☆19 · Jul 29, 2019 · Updated 6 years ago
- A fast, sandboxed micro matching engine with serializable rules. ☆13 · Feb 5, 2018 · Updated 8 years ago
- Redis-backed autocomplete for Django models ☆17 · Nov 8, 2022 · Updated 3 years ago
- Torrent for the less privileged ☆13 · Mar 1, 2011 · Updated 15 years ago
- ☆14 · Feb 22, 2014 · Updated 12 years ago
- ☆18 · Jan 21, 2021 · Updated 5 years ago
- helpful utils for working with PySide ☆28 · Feb 25, 2018 · Updated 8 years ago
- AllenNLP model for the Kaggle toxic comments challenge ☆31 · Jul 13, 2018 · Updated 7 years ago
- A weak reference implementation for Ruby that works across runtimes (MRI, REE, Jruby, Rubinius, and IronRuby) ☆65 · Nov 17, 2020 · Updated 5 years ago
- Learning M-Way Tree - Web Scale Clustering - EM-tree, K-tree, k-means, TSVQ, repeated k-means, bitwise clustering ☆78 · Feb 7, 2022 · Updated 4 years ago
- A Rust crate offering similar functionality to the Python transformers package using Candle. ☆14 · Nov 19, 2024 · Updated last year
- Routes for speed. ☆16 · Jan 2, 2025 · Updated last year
- Inline annotation for the web in pure Javascript. Select text, images, or (nearly) anything else, and add your notes. ☆10 · May 2, 2016 · Updated 9 years ago
- This extension bundles Hygen into VSCode and offers seamless code generator functionality right into your editor. ☆20 · Sep 14, 2018 · Updated 7 years ago
- Computer Vision GPU acceleration with OpenCL ☆15 · Sep 1, 2015 · Updated 10 years ago
- Because you're computing conversion rates wrong ☆16 · May 23, 2017 · Updated 8 years ago
- Another ruby/rails bloat and memory leak debugging tool. ☆36 · Feb 24, 2013 · Updated 13 years ago
- Mneme is an HTTP web-service for recording and identifying previously seen records - aka, duplicate detection. ☆108 · Jun 30, 2013 · Updated 12 years ago
- Data, code, and images for a posting summarizing three studies about pie charts ☆15 · Jul 12, 2016 · Updated 9 years ago
- node.js websocket interface to redis pub/sub message bus ☆14 · Jun 15, 2011 · Updated 14 years ago
- A .env parser/loader improved for performance. ☆30 · Nov 12, 2024 · Updated last year
- Data on international first names and sex of people with that name ☆13 · Jan 12, 2019 · Updated 7 years ago