JensRantil / disco-slct
A MapReduce implementation of SLCT (http://ristov.users.sourceforge.net/slct/) using Disco.
☆16 Updated 14 years ago
Alternatives and similar repositories for disco-slct
Users who are interested in disco-slct are comparing it to the libraries listed below.
- Security log file challenge ☆28 Updated 9 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆52 Updated 8 years ago
- POC IDS anomaly detection engine built with iPython notebook, matplotlib, pandas, numpy, scikit-learn, d3.js, hyperloglog implementation,… ☆79 Updated 11 years ago
- Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python ☆246 Updated 2 years ago
- SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams. ☆427 Updated 9 years ago
- Naarad is a highly configurable system analysis tool that parses and plots timeseries data for better visual correlation. Naarad was buil… ☆238 Updated 8 years ago
- Probabilistic Data Structures in Python (originally presented at PyData 2013) ☆55 Updated 3 years ago
- The metric correlation component of Etsy's Kale system ☆709 Updated 8 years ago
- Simple clustering library for python. ☆66 Updated 4 years ago
- Secondary indexing for structured and unstructured data in Big Table style databases. ☆44 Updated 5 years ago
- X-Trace is a tool that provides fine-grained visibility into large, complex distributed systems. It can be used by application developers… ☆76 Updated last year
- Battle-tested Apache Storm Multi-Lang implementation for Python ☆70 Updated 4 months ago
- Code reference from my Qbox blog posts. ☆87 Updated 10 years ago
- Toy single-machine implementation of the Pregel graph-based framework ☆118 Updated 8 years ago
- Experimental parallel data analysis toolkit. ☆122 Updated 4 years ago
- My capstone project for Galvanize (Zipfian Academy) ☆38 Updated 7 years ago
- SociaLite: query language for large-scale graph analysis and data mining ☆110 Updated 9 years ago
- Pyleus is a Python framework for developing and launching Storm topologies. ☆400 Updated 6 years ago
- Code and Presentation slides for Teaching the Elephant to Read ☆17 Updated 9 years ago
- A pure python implementation of locality sensitive hashing for text documents ☆87 Updated 10 years ago
- Large-scale ML & graph analytics on Giraph ☆79 Updated 9 years ago
- Analyze the structure and dynamics of an open source project's developer community, using graph algorithms, etc. ☆58 Updated 4 years ago
- Estimating how similar are two sets using MinHash (Jaccard similarity coefficient) ☆30 Updated 12 years ago
- Locality-sensitive hashing algorithm for text similarity comparisons ☆59 Updated 8 months ago
- ☆24 Updated 7 years ago
- Code for "Performance shootout between nearest-neighbour libraries": http://radimrehurek.com/2013/11/performance-shootout-of-nearest-neig… ☆98 Updated 10 years ago
- PredictionIO Python SDK ☆196 Updated 7 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams. ☆63 Updated 9 years ago
- GraphChi's Java version ☆238 Updated 2 years ago
- Pylearn2 in practice ☆41 Updated 10 years ago