This provides tools for the b-bit MinHash algorithm.
☆38 | Nov 21, 2025 | Updated 3 months ago
Alternatives and similar repositories for minhash
Users that are interested in minhash are comparing it to the libraries listed below
- Elasticsearch plugin for the b-bit MinHash algorithm ☆63 | Jun 17, 2024 | Updated last year
- Easy-to-use Java library for similarity checking of strings or numeric-series ☆20 | Jan 23, 2020 | Updated 6 years ago
- A Java implementation of Locality Sensitive Hashing (LSH) ☆301 | Nov 19, 2022 | Updated 3 years ago
- Java implementation for MinHash and LSH for finding near duplicate documents as measured by Jaccard similarity. ☆32 | Mar 30, 2015 | Updated 10 years ago
- Java port of c++ version of facebook fasttext ☆15 | Oct 14, 2019 | Updated 6 years ago
- Introducing Filtered Direct Preference Optimization (fDPO) that enhances language model alignment with human preferences by discarding lo… ☆16 | Nov 27, 2024 | Updated last year
- ☆25 | Jul 12, 2017 | Updated 8 years ago
- A Java library implementing practical nearest neighbour search algorithm for multidimensional vectors that operates in sublinear time. It… ☆202 | Jul 26, 2020 | Updated 5 years ago
- Weighted MinHash implementation on CUDA (multi-gpu). ☆121 | Nov 29, 2023 | Updated 2 years ago
- Big Data Analysis of Tinder done at Universitat Rovira i Virgili and Universitat Politècnica de Catalunya · BarcelonaTech ☆13 | Jan 3, 2023 | Updated 3 years ago
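- Bootstrap-style knowledge triple extraction and open entity-relation extraction, relying on dependency parsing to identify candidate entities and relations ☆23 | Feb 20, 2019 | Updated 7 years ago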
- A tool to paste Excel ranges to Reddit ☆11 | Sep 20, 2025 | Updated 5 months ago
- Understanding the correlation between different LLM benchmarks ☆29 | Jan 11, 2024 | Updated 2 years ago
- ADMM on Apache Spark ☆31 | Jul 21, 2015 | Updated 10 years ago
- Multiprocessing in python ☆10 | Aug 20, 2021 | Updated 4 years ago
- Our data munging code. ☆34 | Oct 13, 2025 | Updated 4 months ago
- YouTube Assistant ☆12 | May 15, 2023 | Updated 2 years ago
- A modern technical analysis library for Kotlin ☆10 | Jan 20, 2023 | Updated 3 years ago
- An implementation of MSSRM method ☆11 | Mar 23, 2023 | Updated 2 years ago
- ☆11 | Jul 20, 2021 | Updated 4 years ago
- Inspirational post ids collected from Reddit using pushift.io and RoBERTa ☆10 | Jan 18, 2024 | Updated 2 years ago
- DNH Werewolf Discord bot ☆13 | Dec 19, 2024 | Updated last year
- Research on computing Chinese semantic similarity with artificial neural networks ☆11 | Apr 1, 2013 | Updated 12 years ago
- Code for Beyond Generic: Enhancing Image Captioning with Real-World Knowledge using Vision-Language Pre-Training Model ☆13 | Feb 15, 2024 | Updated 2 years ago
- Open-source Human Feedback Library ☆11 | Oct 25, 2023 | Updated 2 years ago
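- A record of useful Git repos ☆12 | Jul 28, 2024 | Updated last year
- Alfred 🎩 plugin for 小鸡词典 🐤, cluck cluck cluck ☆11 | Apr 19, 2023 | Updated 2 years ago
- Study notes for teacher Li Lulu's Copilot-Python course. Co-evolving with large language models such as ChatGPT. ☆10 | Jun 3, 2025 | Updated 9 months ago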
- A Toolkit for Fine-Tuning Large Language Models with LoRA and DeepSpeed ☆11 | Apr 14, 2023 | Updated 2 years ago
- A tool for converting FERC filings published in XBRL into SQLite databases ☆15 | Feb 24, 2026 | Updated last week
- This is a prototype app that stores items into a Hazelcast map and queue based on the description in https://wiki.mozilla.org/Socorro:Clie… ☆17 | Apr 11, 2011 | Updated 14 years ago
- ☆11 | Mar 23, 2025 | Updated 11 months ago
- ☆12 | Oct 5, 2022 | Updated 3 years ago
- distilled Self-Critique refines the outputs of an LLM with only synthetic data ☆11 | Apr 11, 2024 | Updated last year
- ☆13 | Jan 17, 2023 | Updated 3 years ago
- ☆10 | Aug 25, 2020 | Updated 5 years ago
- A FinBERT model fine-tuned on 4.9k financial news headlines; it reaches 81-82% accuracy and performs well in Financial Stock News Sentiment Ana… ☆12 | Mar 4, 2024 | Updated 2 years ago
- ☆11 | Jun 5, 2024 | Updated last year
- Background materials for the article "Productivity Assessment of Neural Code Completion" ☆13 | Jul 11, 2023 | Updated 2 years ago