LanceNorskog/LSH-Hadoop

Implementation of Tyler Neylon's Locality-Sensitive Hash based on simplex tessellations

☆28

Alternatives and similar repositories for LSH-Hadoop

Users who are interested in LSH-Hadoop are comparing it to the repositories listed below.

pierre / hfind
Find implementation for Hadoop
☆17 · Sep 9, 2015 · Updated 10 years ago
takahi-i / likelike
An implementation of locality sensitive hashing with Hadoop
☆58 · Feb 5, 2015 · Updated 11 years ago
sonalgoyal / hiho
Hadoop Data Integration with various databases, ftp servers, salesforce. Incremental update, dedup, append, merge your data on Hadoop.
☆92 · Apr 11, 2013 · Updated 13 years ago
jpatanooga / Caduceus
Set of example algorithm implementations focused on statistics and machine learning
☆31 · Apr 11, 2011 · Updated 15 years ago
reines / persistenthashmap
A disk-based HashMap implementation allowing persistence of data across sessions.
☆15 · May 7, 2014 · Updated 12 years ago
jrecursive / protograph
an experimental graph server
☆21 · Jun 25, 2011 · Updated 15 years ago
matthewmccullough / hadoop-intro
Hadoop Introduction Presentation Demos
☆22 · Jul 27, 2013 · Updated 12 years ago
lintool / Ivory
A Hadoop toolkit for web-scale information retrieval research
☆87 · Dec 12, 2014 · Updated 11 years ago
wihl / Timberwolf
Hadoop HBase ingestion of Microsoft Exchange
☆15 · Apr 6, 2012 · Updated 14 years ago
tdunning / pig-vector
Mahout vector encoding for pig
☆53 · Nov 20, 2022 · Updated 3 years ago
toddlipcon / mlockall_agent
JVMTI agent which calls mlockall and setuids down to a target user upon initialization
☆21 · Sep 13, 2011 · Updated 14 years ago
nahi / siphash-java-inline
SipHash implementation with hand inlining the SIPROUND
☆15 · Jun 8, 2014 · Updated 12 years ago
rjurney / Cloud-Stenography
Main Repo
☆15 · Jun 24, 2010 · Updated 16 years ago
rfoldes / Avro-Test
A simple test of Avro 1.5 capabilities including dynamic typing, untagged (compact) data storage and schema evolution.
☆36 · May 5, 2011 · Updated 15 years ago
avibryant / simmer
Reduce your data. A unix filter for algebird-powered aggregation.
☆141 · Apr 17, 2017 · Updated 9 years ago
srijiths / jtopia
Java clone for python term extractor topia.termextract
☆34 · Aug 22, 2014 · Updated 11 years ago
jamartinh / Orange3-Spark
A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML
☆15 · Dec 24, 2016 · Updated 9 years ago
bwhite / hadoop_vision
Example code for "Web-Scale Computer Vision using MapReduce for Multimedia Data Mining"
☆48 · Aug 2, 2010 · Updated 15 years ago
ogrisel / pignlproc
Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps.
☆163 · Nov 8, 2022 · Updated 3 years ago
lmucs / grapevine
Want to find upcoming events? Hear about them through the Grapevine.
☆12 · Nov 21, 2018 · Updated 7 years ago
mahadevkonar / ambari-yarn-utils
Ambari YARN UTILS
☆30 · Mar 30, 2023 · Updated 3 years ago
tweetmagik / spark-yarn
Launch Spark clusters on YARN
☆24 · Aug 29, 2011 · Updated 14 years ago
jghoman / haivvreo
Hive + Avro. Serde for working with Avro in Hive
☆60 · Dec 16, 2023 · Updated 2 years ago
jsalvatier / multichain_mcmc
Multichain MCMC framework and algorithms based on PyMC.
☆17 · Feb 14, 2011 · Updated 15 years ago
neubig / kylm
The Kyoto Language Modeling Toolkit
☆27 · Nov 27, 2014 · Updated 11 years ago
wpm / Naive-Bayes-Gibbs-Sampler
Gibbs sampler for a Naive Bayes document classifier
☆24 · Dec 15, 2012 · Updated 13 years ago
pierre / sweeper
Hadoop utility to quickly find large directories to clean up or small files to combine.
☆15 · Jan 12, 2012 · Updated 14 years ago
xinyandai / similarity-search
A framework for index based similarity search.
☆20 · May 10, 2019 · Updated 7 years ago
sudar / Yahoo_LDA
Yahoo!'s topic modelling framework using Latent Dirichlet Allocation
☆337 · Sep 21, 2011 · Updated 14 years ago
emsixteeen / IterativeReduce
Iterative Reduce
☆22 · Jun 3, 2014 · Updated 12 years ago
castagna / hbase-rdf
☆24 · Oct 13, 2020 · Updated 5 years ago
hammerlab / magic-rdds
Miscellaneous functionality for manipulating Apache Spark RDDs.
☆22 · Dec 29, 2018 · Updated 7 years ago
alexeygrigorev / mastering-java-data-science
The code for the book "Mastering Java for Data Science"
☆18 · Apr 6, 2019 · Updated 7 years ago
globusonline / agamemnon
A graph database for python built on top of cassandra
☆48 · Sep 29, 2014 · Updated 11 years ago
kevinweil / FileSetInputFormat
A Hadoop input format for sending lists of files as keys to a mapper. Set the list of files, and an input split will be created per file…
☆16 · Apr 7, 2010 · Updated 16 years ago
thomasjungblut / thomasjungblut-common
This is my main Java library for all kinds of data structures, algorithms and everything else that I need.
☆75 · Jun 14, 2023 · Updated 3 years ago
mrsqueeze / spark-hash
Locality Sensitive Hashing for Apache Spark
☆198 · Nov 1, 2016 · Updated 9 years ago
LinkedInAttic / datafu
Hadoop library for large-scale data processing, now an Apache Incubator project
☆581 · Jul 8, 2014 · Updated 12 years ago
yods / storm-ml-play
Experiments with Vowpal Wabbit Machine Learning & Storm
☆26 · Apr 29, 2013 · Updated 13 years ago