JensRantil / disco-slct
A MapReduce implementation of SLCT (http://ristov.users.sourceforge.net/slct/) using Disco.
☆16 Updated 13 years ago
Alternatives and similar repositories for disco-slct
Users who are interested in disco-slct compare it to the repositories listed below.
- Security log file challenge ☆28 Updated 9 years ago
- POC IDS anomaly detection engine built with iPython notebook, matplotlib, pandas, numpy, scikit-learn, d3.js, hyperloglog implementation,… ☆79 Updated 10 years ago
- Simple clustering library for python. ☆65 Updated 4 years ago
- Code for generating analyses found in "Analyzing Log Analysis: An Empirical Study of User Log Mining" to appear in LISA 2014. ☆8 Updated 10 years ago
- Tail a log file and send log lines automatically to a kafka topic ☆57 Updated 13 years ago
- A RESTful web service that runs microtasks across multiple crowds, provides quality control techniques, and is easily extensible. ☆51 Updated 7 years ago
- Battle-tested Apache Storm Multi-Lang implementation for Python ☆70 Updated 3 years ago
- Python MapReduce library written in Cython. Visit us in #hadoopy on freenode. See the link below for documentation and tutorials. ☆243 Updated 9 years ago
- Python language Plugin for elasticsearch ☆103 Updated 6 years ago
- Social Graph Analysis using Elastic MapReduce and PyPy ☆55 Updated 14 years ago
- Python collections supporting parallel map/reduce style methods ☆40 Updated last year
- Tools for writing, submitting, debugging, and monitoring Storm topologies in pure Python ☆246 Updated 2 years ago
- Python API for the Kafka Message Queue ☆56 Updated 12 years ago
- A simple and fast search engine ☆70 Updated 3 years ago
- Get Data Reused ☆20 Updated 8 years ago
- ☆24 Updated 7 years ago
- Lossy Counting and Sticky Sampling implementation for efficient frequency counts on data streams. ☆63 Updated 9 years ago
- A javascript shell for elasticsearch ☆105 Updated 10 years ago
- python elasticsearch client ☆362 Updated 3 years ago
- An Exploration into Graph Databases ☆28 Updated 9 years ago
- scalding powered machine learning ☆109 Updated 10 years ago
- SAMOA (Scalable Advanced Massive Online Analysis) is an open-source platform for mining big data streams. ☆425 Updated 9 years ago
- A high-performance distributed web crawling & scraping framework written with golang and python. ☆30 Updated 9 years ago
- unofficial git mirror of http://svn.whoosh.ca svn repo ☆49 Updated 15 years ago
- Deploy Dask on Marathon ☆10 Updated 8 years ago
- Analyze the structure and dynamics of an open source project's developer community, using graph algorithms, etc. ☆58 Updated 4 years ago
- Python Client for WebHDFS REST API ☆43 Updated 10 years ago
- X-Trace is a tool that provides fine-grained visibility into large, complex distributed systems. It can be used by application developers… ☆74 Updated 11 months ago
- templatemaker is a Python library that can extract data from files with a similar format, like HTML pages. ☆63 Updated 4 years ago
- A Python HTTP client to the Prelert Anomaly Detective Engine REST API - ARCHIVED ☆32 Updated 9 years ago