Python API for Various DB-Backed Simhash Clusters
☆64Mar 16, 2017Updated 9 years ago
Alternatives and similar repositories for simhash-db-py
Users that are interested in simhash-db-py are comparing it to the libraries listed below.
- Simhash and near-duplicate detection☆422May 15, 2023Updated 3 years ago
- A cluster implementation of simhash near-duplicate detection☆32Mar 11, 2015Updated 11 years ago
- common data interchange format for document processing pipelines that apply natural language processing tools to large streams of text☆35Sep 30, 2016Updated 9 years ago
- A project for clustering text streams using locality-sensitive hashing (LSH) in Python☆26Sep 23, 2011Updated 14 years ago
- mltk - Moz Language Tool Kit☆12Mar 6, 2015Updated 11 years ago
- A library to find the percentage of similarity between two given strings (can be expanded to compare everything!).☆46Jul 30, 2013Updated 12 years ago
- Detecting near duplicates using Moses Charikar's algorithm☆20Apr 27, 2026Updated last month
- URL Transformation, Sanitization☆104Jan 16, 2024Updated 2 years ago
- A platform for collecting, analyzing, and visualizing social media data.☆13Dec 27, 2020Updated 5 years ago
- A simple implementation of the simhash algorithm in Java.☆155Oct 10, 2020Updated 5 years ago
- A Python Implementation of Simhash Algorithm☆1,037Mar 24, 2022Updated 4 years ago
- Elasticsearch plugin for b-bit minhash algorithm☆65Jun 17, 2024Updated last year
- A Text Comprehension Engine in Python☆15Aug 23, 2015Updated 10 years ago
- simple simhashing in hadoop with cascading☆33May 9, 2011Updated 15 years ago
- Distributed text analysis suite based on Celery☆96Dec 15, 2022Updated 3 years ago
- Crisis Event Extraction Service (CREES)☆15Feb 4, 2019Updated 7 years ago
- Open Source Implementation of Simhash in Python☆24Sep 14, 2017Updated 8 years ago
- Slinky, a high-performance web crawler / text analytics in Python, Redis, Hadoop, R, Gephi☆40Aug 30, 2010Updated 15 years ago
- The OpenCitations RDF Resource Browser☆15Oct 29, 2025Updated 7 months ago
- All ontologies used in NIF 2.0 (NIF-Core + vocabulary modules + helper modules)☆38Jun 22, 2017Updated 8 years ago
- Deprecated, see https://github.com/TriplyDB/Yasgui for the Yasgui monorepo☆26Jan 12, 2020Updated 6 years ago
- A simple server to publish nanopublications☆12May 15, 2024Updated 2 years ago
- A streaming cross-cat inference engine☆20Mar 27, 2024Updated 2 years ago
- Non-local Modeling for Image Quality Assessment☆13Dec 20, 2023Updated 2 years ago
- Mirror of 0.1.1 release of clausie from http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/software/clausie/☆14Jan 4, 2015Updated 11 years ago
- UnifiedViews☆29Apr 11, 2022Updated 4 years ago
- A ROS1/ROS2 compatible, RDFlib-backed knowledge base for robotic applications. Mostly KB-API conformant.☆16Apr 2, 2026Updated 2 months ago
- Topic Model or LDA in Cython☆21Apr 9, 2011Updated 15 years ago
- Public SPARQL Endpoint Service Monitoring☆11Jan 7, 2025Updated last year
- This repository hosts the documentation for the mixminion protocol☆11May 26, 2010Updated 16 years ago
- Locality Sensitive Hashing using MinHash in Python/Cython to detect near duplicate text documents☆290Jun 11, 2023Updated 3 years ago
- XMPP chatbot written in Python. Makes use of PyAIML☆27Feb 17, 2023Updated 3 years ago
- Tianchi competition☆10Jul 4, 2021Updated 4 years ago
- We attempt few-shot learning with BERT and a prototypical network for intent classification☆22Jun 27, 2020Updated 5 years ago
- Statistical Natural Language Processing with Annotated Suffix Trees☆22Jul 22, 2016Updated 9 years ago
- Learn handwriting using RNN☆65Jul 8, 2015Updated 10 years ago
- A simple and fast search engine☆70Jun 21, 2022Updated 3 years ago
- Chinese word segmentation with a neural seq2seq model implemented in PyTorch☆10Dec 13, 2017Updated 8 years ago
- A minimal demo web framework based on servlets☆10Sep 3, 2015Updated 10 years ago