JensRantil / disco-slct
A MapReduce implementation of SLCT (http://ristov.users.sourceforge.net/slct/) using Disco.
☆16 Updated 13 years ago
Related projects:
- Security log file challenge ☆28 Updated 8 years ago
- An Exploration into Graph Databases ☆28 Updated 8 years ago
- Simple clustering library for python. ☆65 Updated 3 years ago
- Probabilistic Data Structures in Python (originally presented at PyData 2013) ☆55 Updated 2 years ago
- POC IDS anomaly detection engine built with iPython notebook, matplotlib, pandas, numpy, scikit-learn, d3.js, hyperloglog implementation,… ☆78 Updated 10 years ago
- Estimating how similar are two sets using MinHash (Jaccard similarity coefficient) ☆29 Updated 11 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆51 Updated 7 years ago
- An extension of the kafka-python package that adds features like multiprocess consumers. ☆38 Updated last year
- Analyze the structure and dynamics of an open source project's developer community, using graph algorithms, etc. ☆57 Updated 3 years ago
- A Python HTTP client to the Prelert Anomaly Detective Engine REST API - ARCHIVED ☆32 Updated 8 years ago
- A project that implements statistical methods for identifying anomalous files ☆22 Updated 9 years ago
- IPLoM (Iterative Partitioning Log Mining) - Java ☆14 Updated 8 years ago
- Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python ☆247 Updated last year
- PySpark for Elastic Search ☆55 Updated 7 years ago
- Set of Hadoop, Spark and Storm based tools for web and customer analytic ☆34 Updated 3 years ago
- ☆146 Updated 8 years ago
- Deploy Dask on Marathon ☆10 Updated 7 years ago
- Python language Plugin for elasticsearch ☆103 Updated 5 years ago
- Code reference from my Qbox blog posts. ☆87 Updated 9 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams. ☆62 Updated 8 years ago
- ☆49 Updated this week
- Tail a log file and send log lines automatically to a kafka topic ☆58 Updated 12 years ago
- Sequential anomaly detection method evaluation ☆18 Updated 11 years ago
- SociaLite: query language for large-scale graph analysis and data mining ☆109 Updated 8 years ago
- unofficial git mirror of http://svn.whoosh.ca svn repo ☆49 Updated 14 years ago
- Vowpal Wabbit Webservice. A web service that accepts VW formatted text and runs it through a VW daemon instance. ☆40 Updated 8 years ago
- My capstone project for Galvanize (Zipfian Academy) ☆38 Updated 5 years ago
- Experimental parallel data analysis toolkit. ☆118 Updated 2 years ago
- Common improvements for your Python projects ☆26 Updated 8 years ago
- ☆18 Updated this week