This provides tools for the b-bit MinHash algorithm.
☆38 · Nov 21, 2025 · Updated 4 months ago
Alternatives and similar repositories for minhash
Users that are interested in minhash are comparing it to the libraries listed below.
- Elasticsearch plugin for the b-bit MinHash algorithm ☆63 · Jun 17, 2024 · Updated last year
- Java implementation for MinHash and LSH for finding near duplicate documents as measured by Jaccard similarity. ☆32 · Mar 30, 2015 · Updated 10 years ago
- Vowpal Wabbit Webservice. A web service that accepts VW formatted text and runs it through a VW daemon instance. ☆40 · Mar 9, 2016 · Updated 10 years ago
- A Java library implementing practical nearest neighbour search algorithm for multidimensional vectors that operates in sublinear time. It… ☆202 · Jul 26, 2020 · Updated 5 years ago
- Web archiving utility library ☆11 · Mar 11, 2026 · Updated 2 weeks ago
- ☆11 · Sep 29, 2017 · Updated 8 years ago
- HTML parser and tag balancer. ☆19 · Updated this week
- Java port of the C++ version of Facebook fastText ☆15 · Oct 14, 2019 · Updated 6 years ago
- Parquet IO for Tablesaw ☆12 · Mar 2, 2026 · Updated 3 weeks ago
- A simple implementation of the simhash algorithm in Java. ☆154 · Oct 10, 2020 · Updated 5 years ago
- An Objective-C implementation of a centred interval tree. ☆17 · Jan 23, 2016 · Updated 10 years ago
- Go template implementation in Java ☆15 · May 18, 2025 · Updated 10 months ago
- Open source Java framework to create, process and manage mixtures of exponential family ☆14 · Aug 4, 2015 · Updated 10 years ago
- A copy of the source for Grinstead and Snell's lovely probability book ☆14 · Dec 20, 2015 · Updated 10 years ago
- ☆18 · Jan 21, 2021 · Updated 5 years ago
- IPython Notebook for Sentiment Classification ☆10 · Nov 12, 2014 · Updated 11 years ago
- MirrorDataGenerator is a Python tool that generates synthetic data based on user-specified causal relations among features in the data. I… ☆24 · Jun 22, 2022 · Updated 3 years ago
- ☆17 · Jul 21, 2022 · Updated 3 years ago
- This project defines a json ontology standard describing a power consumption measure in a given software/hardware context, noticeably in … ☆15 · Mar 2, 2026 · Updated 3 weeks ago
- arXiv crawler written in Python ☆13 · Jun 17, 2012 · Updated 13 years ago
- Data Science Course Materials - Fall 2014 ☆12 · Sep 6, 2014 · Updated 11 years ago
- A Java library for Stochastic Gradient Descent (SGD) ☆22 · Nov 1, 2021 · Updated 4 years ago
- SubEtha Mail is a J2EE-based mailing list manager ☆14 · May 17, 2015 · Updated 10 years ago
- Code repository for Mondrian, a project for multiregion template recognition in spreadsheets. ☆14 · May 25, 2022 · Updated 3 years ago
- Python library to get the Alexa rank of the domain of any URL ☆10 · Jan 28, 2013 · Updated 13 years ago
- ☆25 · Jul 12, 2017 · Updated 8 years ago
- Index of URLs to pdf files all over the internet and scripts ☆25 · May 2, 2023 · Updated 2 years ago
- Hal Daume's hbc ☆20 · Jan 23, 2010 · Updated 16 years ago
- Alpha-AutoML is a Python library for automatically generating end-to-end machine learning pipelines. ☆23 · Dec 17, 2024 · Updated last year
- Seed acquisition tool to bootstrap focused crawlers ☆23 · Apr 24, 2017 · Updated 8 years ago
- Java port of Facebook's PlanOut A/B testing system with additional functionality ☆10 · Dec 27, 2018 · Updated 7 years ago
- Java implementation of Multinomial Naive Bayes Text Classifier. ☆97 · Oct 18, 2014 · Updated 11 years ago
- The bare necessities of Pandas on the Weld runtime ☆14 · Dec 26, 2022 · Updated 3 years ago
- Multi-Engine is a Java framework for distributed parallel processing, whose kernel is Multi-Task. ☆15 · Feb 4, 2017 · Updated 9 years ago
- A STOMP Websocket client written in Swift using SocketRocket to communicate over WebSocket. At the moment I only support sending and rece… ☆11 · Sep 20, 2019 · Updated 6 years ago
- PhoneGap app for hailing finding / finding passengers (simple version of Uber) ☆15 · Dec 16, 2015 · Updated 10 years ago
- almost, but not quite, entirely unlike core.async ☆18 · Dec 13, 2017 · Updated 8 years ago
- QPBO interface and alpha expansion for Python ☆24 · Nov 3, 2022 · Updated 3 years ago
- Professional fast kline for Android client ☆15 · Oct 30, 2019 · Updated 6 years ago