usc-isi-i2 / dig-lsh-clustering
Clustering documents based on LSH
☆14Updated 9 years ago
Alternatives and similar repositories for dig-lsh-clustering
Users interested in dig-lsh-clustering are comparing it to the libraries listed below
- A startup search engine made using embeddings built on Crunchbase company descriptions☆11Updated 9 years ago
- Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neig…☆99Updated 10 years ago
- ☆26Updated 8 years ago
- Flow-based data pre-processing for deep learning☆31Updated 4 years ago
- A pure python implementation of locality sensitive hashing for text documents☆85Updated 9 years ago
- An Apache Lucene TokenFilter that uses word2vec vectors for term expansion.☆24Updated 11 years ago
- Notes on Lambda Architecture☆12Updated 7 years ago
- Movielens collaborative filtering with Solr streaming expression☆11Updated 8 years ago
- A project to demonstrate maximum entropy models for extracting quotes from news articles in Python.☆49Updated 13 years ago
- Using Word2Vec on lists and sets☆34Updated 3 months ago
- Similarity search on Wikipedia using gensim in Python.☆60Updated 6 years ago
- NLP tutorial for the Berlin Data Science Retreat☆41Updated 9 years ago
- Discovers similarity between scientific papers☆62Updated 9 years ago
- KDD Hands-On Tutorial (2018)☆29Updated 2 years ago
- An implementation of LDA in Python, from Heinrich's paper: http://www.arbylon.net/publications/text-est.pdf☆43Updated 15 years ago
- Hadoop jobs for WikiReverse project. Parses Common Crawl data for links to Wikipedia articles.☆38Updated 7 years ago
- Entity level sentiment analysis for product reviews using deep learning☆56Updated 9 years ago
- DeepTeach - the Interactive Deep Image Classifier Builder☆47Updated 8 years ago
- A collection of documents and materials for the EMNLP-2015 Semantic Similarity tutorial☆30Updated 10 years ago
- Earth Mover's Distance based Similarity Join on Hadoop☆12Updated 9 years ago
- Nonparametric timeseries classification for Twitter trending topic detection (MEng thesis)☆119Updated 12 years ago
- ☆96Updated 7 years ago
- Implementation of an algorithm computing the nearest "N" neighbours to a vector, using a collection of hyperplane hashers.☆30Updated 10 years ago
- Build tables of information by extracting facts from indexed text corpora via a simple and effective query language.☆56Updated 6 years ago
- Fast and robust NLP components implemented in Java.☆52Updated 4 years ago
- Semantic embeddings of entities☆66Updated 8 years ago
- Auto Encoder on Tensorflow☆12Updated 7 years ago
- Datasets and notebooks☆13Updated 8 years ago
- Instructions & code for the EuroPython 2014 training session "Topic Modeling for Fun and Profit"☆110Updated 11 years ago
- Implementation of Bayesian Sets for fast similarity searches.☆14Updated 13 years ago