mnielsen / Pregel
Toy single-machine implementation of the Pregel graph-based framework
☆116 · Updated 8 years ago
Alternatives and similar repositories for Pregel:
Users who are interested in Pregel are comparing it to the libraries listed below.
- Python wrapper for the Vowpal Wabbit machine learning library. ☆53 · Updated 11 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆51 · Updated 7 years ago
- SociaLite: query language for large-scale graph analysis and data mining ☆109 · Updated 8 years ago
- Unified interface for local and distributed ndarrays ☆157 · Updated 6 years ago
- Code to allow running BIDMach on Spark including HDFS integration and lightweight sparse model updates (Kylix). ☆15 · Updated 4 years ago
- Latent Dirichlet Allocation for topic modeling of streamed data sources ☆100 · Updated 10 years ago
- Splash Project for parallel stochastic learning ☆94 · Updated 7 years ago
- SDK for Turi's GraphLab Create. ☆149 · Updated 7 years ago
- Explorations relative to cloning FlumeJava ☆93 · Updated 4 years ago
- Distributed Numpy ☆148 · Updated 7 years ago
- Scalable Machine Learning in Scalding ☆360 · Updated 7 years ago
- Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neig… ☆99 · Updated 9 years ago
- Yahoo!'s topic modelling framework using Latent Dirichlet Allocation ☆337 · Updated 13 years ago
- xlvector's solution to the GitHub contest ☆33 · Updated 15 years ago
- Python Approximate Nearest Neighbor Search in very high dimensional spaces with optimised indexing. ☆214 · Updated 3 years ago
- Apache Pig utilities to build training corpora for machine learning / NLP out of public Wikipedia and DBpedia dumps. ☆158 · Updated 2 years ago
- Implementation of Tyler Neylon's Locality-Specific Hash based on simplex tessellations ☆28 · Updated 13 years ago
- A prototype of Hive UDFs/UDTFs that execute nested SQL queries within rows. ☆54 · Updated 9 years ago
- Cloud9 is a Hadoop toolkit for working with big data ☆237 · Updated 9 years ago
- Python MapReduce library written in Cython. Visit us in #hadoopy on freenode. See the link below for documentation and tutorials. ☆243 · Updated 9 years ago
- GraphChi's Java version ☆237 · Updated last year
- Example code for "Web-Scale Computer Vision using MapReduce for Multimedia Data Mining" ☆49 · Updated 14 years ago
- Distributed Matrix Library ☆71 · Updated 8 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams. ☆63 · Updated 8 years ago
- A grouping of Apache Pig examples. ☆65 · Updated 4 years ago
- A Python wrapper for Cascading ☆222 · Updated 5 years ago
- Experimental parallel data analysis toolkit. ☆121 · Updated 3 years ago
- Quickly start a YARN cluster on EC2 ☆30 · Updated 7 years ago