guozheng / hadoop-completion
hadoop shell commands auto-complete script for Bash Completion
☆12Updated 9 years ago
Related projects:
- Utilities and examples to assist in working with PySpark and Cassandra.☆36Updated 9 years ago
- ☆23Updated 7 years ago
- [DEPRECATED] For read-only reference of the ALOJA Big Data Benchmarking platform: includes tools to define and deploy clusters, orchestr…☆23Updated 3 years ago
- Data Science Command Line Toolbox in a docker container☆28Updated 6 years ago
- personal cheatsheets on various technologies☆25Updated 8 years ago
- A project that implements statistical methods for identifying anomalous files☆22Updated 9 years ago
- Dockerfiles for building docker images☆27Updated last month
- Tutorial for Deploying Anaconda Cluster and PySpark on top of Red Hat Storage GlusterFS☆8Updated 9 years ago
- An Exploration into Graph Databases☆28Updated 8 years ago
- Tail a log file and send log lines automatically to a kafka topic☆58Updated 12 years ago
- Provides a Pythonic interface for reading and writing Avro schemas☆26Updated 2 years ago
- Augustus is an open source system for building and scoring statistical models designed to work with data sets that are too large to fit i…☆43Updated 10 years ago
- Apache Toree quickstart tutorial☆29Updated 8 years ago
- An analysis of adverse drug event data using Hadoop, R, and Gephi☆44Updated 8 years ago
- Apache Pig plugin for Eclipse☆12Updated 7 years ago
- Security log file challenge☆28Updated 8 years ago
- A Python HTTP client to the Prelert Anomaly Detective Engine REST API - ARCHIVED☆32Updated 8 years ago
- Examples of using SparklingPandas and Pandas with PySpark☆15Updated 9 years ago
- A JavaScript shell for Elasticsearch☆106Updated 9 years ago
- My data is bigger than your data!☆39Updated 5 years ago
- An ElasticSearch / Graphite shim which translates graphite requests into ElasticSearch data queries for a given mapping☆16Updated 5 years ago
- Data Science box: Spark, Jupyter, R+RStudio, Zeppelin, Python 2 & 3, Java, Scala.☆39Updated 6 years ago
- ☆17Updated this week
- Real-time and offline time-series analysis with Spark, Spark Streaming and Storm☆21Updated 3 years ago
- Docker container to locally run Spark and Kafka☆15Updated 8 years ago
- A real-time streaming implementation of Markov-chain-based fraud detection☆24Updated 9 years ago
- Scripts for making Hadoop deployments in AWS easy☆11Updated 10 years ago
- An example project for doing grid search in MLlib☆13Updated 9 years ago
- ☆17Updated 9 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead☆53Updated 6 years ago