A simple implementation of the simhash algorithm in Java.
☆154 Oct 10, 2020 Updated 5 years ago
Alternatives and similar repositories for simhash-java
Users who are interested in simhash-java are comparing it to the libraries listed below.
- Simhash value computation for Chinese documents ☆1,168 Mar 13, 2026 Updated last week
- Text retrieval database based on simhash similarity search ☆25 Mar 27, 2023 Updated 2 years ago
- Java implementation for MinHash and LSH for finding near duplicate documents as measured by Jaccard similarity. ☆32 Mar 30, 2015 Updated 10 years ago
- Provides tools for the b-bit MinHash algorithm. ☆38 Nov 21, 2025 Updated 3 months ago
- Easy-to-use Java library for similarity checking of strings or numeric-series ☆20 Jan 23, 2020 Updated 6 years ago
- Simple example of Java API ☆20 Aug 9, 2021 Updated 4 years ago
- A Java implementation of Locality Sensitive Hashing (LSH) ☆301 Nov 19, 2022 Updated 3 years ago
- Open Source Implementation of Simhash in Python ☆24 Sep 14, 2017 Updated 8 years ago
- Elasticsearch plugin for the b-bit MinHash algorithm ☆63 Jun 17, 2024 Updated last year
- Spring Boot, MVC, data and MongoDB CRUD web application ☆18 Mar 22, 2020 Updated 5 years ago
- Detection of duplicate Quora questions ☆19 Apr 5, 2017 Updated 8 years ago
- New version of a code generator ☆10 Apr 19, 2018 Updated 7 years ago
- New blog; original blog: www.blogfshare.com ☆14 Jul 3, 2020 Updated 5 years ago
- Distributed topic-focused web crawler based on Scala and Akka ☆14 Sep 2, 2019 Updated 6 years ago
- A Java library implementing practical nearest neighbour search algorithm for multidimensional vectors that operates in sublinear time. It… ☆202 Jul 26, 2020 Updated 5 years ago
- A Java implementation of doc2vec in ICML'14 ☆30 Jul 23, 2015 Updated 10 years ago
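- Recommender system implemented with word2vec ☆11 Jul 8, 2018 Updated 7 years ago
- Based on spring-cache, integrates a local cache [ehcache] and a distributed cache [redis] to form a two-level cache. ☆11 May 26, 2023 Updated 2 years ago
- Import Wikidata into Neo4j ☆27 Jan 24, 2016 Updated 10 years ago
- Research on Chinese semantic similarity computation based on artificial neural networks ☆11 Apr 1, 2013 Updated 12 years ago
- similarity: Text similarity calculation toolkit for Java. Usable for text similarity computation, sentiment analysis, and similar tasks; works out of the box. ☆1,571 Jan 23, 2026 Updated last month
- ansj word segmentation: a true Java implementation of ict. Segmentation quality and speed both exceed the open-source version of ict. Chinese word segmentation, person-name recognition, part-of-speech tagging, user-defined dictionaries ☆6,545 Nov 19, 2023 Updated 2 years ago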
- A thumbnail generation Java library for Office, PDF, HTML, Text, MP3, MPEG and Image documents ☆42 Mar 6, 2026 Updated 2 weeks ago
- Simple simhashing in Hadoop with Cascading ☆33 May 9, 2011 Updated 14 years ago
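- Document deduplication, intended to solve the problem of semantically duplicated documents in search engines, using a semantic fingerprint algorithm based on multiple hashing. ☆12 Aug 17, 2013 Updated 12 years ago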
- Java implementation of Thompson sampling to solve the multi-armed bandit problem ☆30 Jun 14, 2023 Updated 2 years ago
- Data service of a metadata-driven engine; if a standalone service server is needed, create a WebServer project and configure the RPC calls: MetaDataReadServerService, MetaDataWriteServerService, DataSourceService ☆15 Dec 16, 2022 Updated 3 years ago
- ☆61 Jul 19, 2024 Updated last year
- Code for the paper Faster Phrase-Based Decoding by Refining Feature State ☆14 Jan 9, 2023 Updated 3 years ago
- ☆13 Sep 6, 2016 Updated 9 years ago
- Flink performance tests ☆28 Nov 13, 2019 Updated 6 years ago
- Clojure library for sitemap generation. ☆16 Feb 25, 2019 Updated 7 years ago
- Simple Java library for transforming an Object to another Object ☆12 Aug 4, 2025 Updated 7 months ago
- Jcseg is a light weight NLP framework developed with Java. Provide CJK and English segmentation based on MMSEG algorithm, With also keywo… ☆921 Sep 18, 2023 Updated 2 years ago
- A Java version of the FTRL algorithm ☆24 Apr 28, 2017 Updated 8 years ago
- ChatGPT quick-start handbook ☆15 Jun 19, 2023 Updated 2 years ago
- An implementation of word2vec in Java ☆701 Apr 1, 2021 Updated 4 years ago
- News recommendation system based on Mahout ☆74 Nov 25, 2018 Updated 7 years ago
- ☆10 May 17, 2015 Updated 10 years ago