Java implementation for MinHash and LSH for finding near duplicate documents as measured by Jaccard similarity.
☆33Mar 30, 2015Updated 11 years ago
Alternatives and similar repositories for MinHashLSH
Users that are interested in MinHashLSH are comparing it to the libraries listed below.
- Easy-to-use Java library for similarity checking of strings or numeric-series☆20Jan 23, 2020Updated 6 years ago
- Natural language processing algorithms including text classification, sentiment analysis, TextRank, LDA, and so on☆12Mar 23, 2017Updated 9 years ago
- ☆12Sep 14, 2021Updated 4 years ago
- ☆12Jun 17, 2019Updated 6 years ago
- ☆11May 16, 2022Updated 3 years ago
- a list of links to help you make various important architectural decisions☆11Jul 13, 2016Updated 9 years ago
- A React component to implement continuous scrolling (for modern browsers).☆17Jan 12, 2017Updated 9 years ago
- The official implementation of EMNLP 2021 paper "#HowYouTagTweets: Learning User Hashtagging Preferences via Personalized Topic Attention…☆11Feb 21, 2023Updated 3 years ago
- Three ways to compute TF-IDF: Python, sklearn, gensim☆11Feb 26, 2019Updated 7 years ago
- ☆22Aug 27, 2016Updated 9 years ago
- Java implementation of the famous fuzzy wuzzy algorithm -- http://seatgeek.com/blog/dev/fuzzywuzzy-fuzzy-string-matching-in-python☆15Jul 13, 2016Updated 9 years ago
- Article from Medium about Push Notifications in Android☆14Dec 27, 2015Updated 10 years ago
- Detecting near duplicates using Moses Charikar's algorithm☆20Oct 7, 2014Updated 11 years ago
- CFG syntactic parsing for natural language processing☆10Mar 27, 2018Updated 8 years ago
- ALBERT Text Classification Tensorflow, Resume Classification☆15Mar 28, 2020Updated 6 years ago
- simple arbitrage☆13Jul 29, 2010Updated 15 years ago
- using FM latent vectors as embedding features☆14Sep 7, 2017Updated 8 years ago
- A Locality-Sensitive Hashing Library for Scala with optional Redis storage.☆16Jan 5, 2022Updated 4 years ago
- Migrate repositories from GitLab to GitHub☆22Jan 8, 2019Updated 7 years ago
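- Generate sentence vectors with BERT in one line of code; BERT for text classification and text similarity computation☆10Jul 1, 2019Updated 6 years ago
- Uses the ALBERT pre-trained model to recognize time expressions in text, and checks whether the model's prediction time improves significantly.☆57Dec 16, 2019Updated 6 years ago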
- This repository provides a starter code for using tensorboard via tensorflow for visualising embeddings☆14Apr 4, 2018Updated 8 years ago
- ☆15Nov 17, 2023Updated 2 years ago
- ☆11Updated this week
- AI Power Documentation☆29Mar 19, 2026Updated last month
- Machine learning coursework☆25Jan 11, 2023Updated 3 years ago
- Tutorial on parsing Enron email to Avro and then explore the email set using Spark.☆52Mar 25, 2026Updated 3 weeks ago
- Examples of spark-lucenerdd☆15Oct 6, 2023Updated 2 years ago
- An original implementation of the paper "CREPE: Open-Domain Question Answering with False Presuppositions"☆16Nov 5, 2024Updated last year
- Java library for building clients for XHTML-based hypermedia APIs☆35Nov 7, 2011Updated 14 years ago
- I'm 99% sure that you already heard about APIs or REST APIs, it's what Twitter, flickr and a lot more companies use to share they're reso…☆29Apr 9, 2011Updated 15 years ago
- A Scheme dialect implemented in Python, supporting macros, continuations, lambda, various basic types, and so on; it can be interpreted directly in Python or compiled to JavaScript. Code compiled to JS can interact dynamically with JavaScript (calls in both directions)☆21Jun 24, 2013Updated 12 years ago
- pairwise learning to rank with logistic regression☆19Apr 24, 2016Updated 9 years ago
- ☆14May 25, 2022Updated 3 years ago
- ☆10Apr 16, 2022Updated 4 years ago
- DSResSol: A sequence-based solubility predictor created with Dilated Squeeze Excitation Residual Networks☆12May 30, 2024Updated last year
- my own R course☆11Oct 14, 2014Updated 11 years ago
- PVDM, PVDBOW, doc2vec, sentence2vec, "Distributed Representations of Sentences and Documents ICML'14".☆21May 9, 2018Updated 7 years ago
- code for the paper "Personalized Context-Aware Re-ranking for E-commerce Recommendation Systems"☆51Jan 23, 2019Updated 7 years ago