A fast Python implementation of the SimHash algorithm.
☆27 · Oct 27, 2021 · Updated 4 years ago
Alternatives and similar repositories for floc-simhash
Users that are interested in floc-simhash are comparing it to the libraries listed below.
- Visual Hash for matching copies of visually similar images. ☆16 · Mar 17, 2025 · Updated last year
- [WWW 2026] 🕸 GlotWeb: Web Indexing for Minority Languages ☆17 · Apr 14, 2026 · Updated 2 months ago
- Rust implementation of probminhash, superminhash and hyperloglog sketching algorithms ☆31 · Jan 22, 2026 · Updated 4 months ago
- FLoC Simulator ☆37 · Aug 10, 2021 · Updated 4 years ago
- Add screenshot button to youtube.com ☆15 · Jun 22, 2018 · Updated 7 years ago
- Basis for constructing a new project on top of mu.semte.ch ☆16 · Jun 5, 2026 · Updated last week
- Scrape and structure raw data from the Norwegian parliament's API. ☆12 · Oct 24, 2025 · Updated 7 months ago
- Python wrapper for phonetisaurus grapheme to phoneme tool ☆12 · Mar 11, 2021 · Updated 5 years ago
- Translation of query languages to serialized KoralQuery protocol ☆15 · Jun 4, 2026 · Updated last week
- Efficient batch-detection of audio sample matching (kind of like shazam, but more involved) ☆10 · Mar 11, 2015 · Updated 11 years ago
- Small collection of PAGE XML related scripts used at the ZPD Würzburg ☆12 · Aug 2, 2024 · Updated last year
- Benchmark scripts for comparing different tokenizers and sentence segmenters of German ☆12 · Feb 27, 2023 · Updated 3 years ago
- Implementation of the Tower Method, a novel approach to handling missing values. ☆13 · Mar 12, 2024 · Updated 2 years ago
- semantic-sh is a SimHash implementation to detect and group similar texts by taking power of word vectors and transformer-based language … ☆27 · Jul 25, 2024 · Updated last year
- ☆19 · May 6, 2026 · Updated last month
- Obsidian plugin for managing aliases for Johnny Decimal-indexed notes ☆16 · Aug 10, 2021 · Updated 4 years ago
- code and data used to build a training dataset for dragnet models ☆10 · Nov 29, 2020 · Updated 5 years ago
- Dancing with AI Curriculum Website! ☆16 · Apr 8, 2026 · Updated 2 months ago
- Basis of FragDenStaat.de's „Koalitionstracker“ ☆15 · Jul 14, 2025 · Updated 11 months ago
- Small string compression using smaz compression algorithm. Fast, because it's in C. Supports Python 3+ ☆13 · Oct 18, 2025 · Updated 7 months ago
- ProbMinHash – A Class of Locality-Sensitive Hash Algorithms for the (Probability) Jaccard Similarity ☆44 · Oct 26, 2020 · Updated 5 years ago
- Support for writing WARC files with Scrapy ☆24 · Dec 21, 2019 · Updated 6 years ago
- The Wikinflection Corpus, from the paper "Wikinflection Corpus: A (Better) Multilingual, Morpheme-Annotated Inflectional Corpus" (Metheni… ☆12 · Dec 15, 2023 · Updated 2 years ago
- Benson turns a list of URLs into mp3s of the contents of each web page - take control over your reading backlog! ☆16 · Oct 30, 2024 · Updated last year
- ☆16 · May 11, 2021 · Updated 5 years ago
- A Reddit bot that finds original publish dates on linked articles. ☆10 · Nov 30, 2024 · Updated last year
- SMOR (Stuttgart Morphology) with alternative lemmatization component ☆13 · Aug 10, 2023 · Updated 2 years ago
- AfroLID, a powerful neural toolkit for African languages identification which covers 517 African languages. ☆39 · Feb 5, 2026 · Updated 4 months ago
- Distributed k-nearest Neighbors using Locality Sensitive Hashing and SYCL ☆10 · Updated this week
- An LL parser for extracting information from Wiki text, particularly Wiktionary. ☆51 · Aug 16, 2023 · Updated 2 years ago
- A trend viewer written in Python/JavaScript ☆21 · Nov 15, 2024 · Updated last year
- Open Source MCP for Rekordbox DJ ☆64 · Apr 15, 2026 · Updated 2 months ago
- Example configurations for the Community Solid Server ☆22 · Mar 9, 2026 · Updated 3 months ago
- Ember addon wrapping an RDFa editor with a public API ☆18 · Updated this week
- A Python scraping module that extracts text from articles found in RSS feeds. Uses SQLite as database. ☆20 · Jul 5, 2024 · Updated last year
- Implementation of COO, CSR, CSC, SSS and TJDS sparse matrix formats. ☆11 · Jul 15, 2015 · Updated 10 years ago
- Evaluate language models using multiple choice items ☆13 · Mar 6, 2026 · Updated 3 months ago
- CLI to normalize and organize your files based on customizable rules. ☆24 · Jun 2, 2026 · Updated last week
- Data Lineage Tracing Library ☆24 · Nov 30, 2021 · Updated 4 years ago