src-d / sparkpickle
Pure Python implementation of reading SequenceFile-s with pickles written by Spark's saveAsPickleFile()
☆24 Updated 7 years ago
Alternatives and similar repositories for sparkpickle:
Users who are interested in sparkpickle are comparing it to the libraries listed below.
- Yggdrasil: Faster Decision Trees Using Column Partitioning in Spark ☆31 Updated 6 years ago
- Distributed Streaming Quantiles (for PySpark) ☆37 Updated 11 years ago
- Python client for Spark Jobserver Rest API ☆39 Updated 4 years ago
- A library that allows serialization of SciKit-Learn estimators into PMML ☆70 Updated 5 years ago
- Ranking algorithms for Spark machine learning pipeline ☆15 Updated 7 years ago
- Scala wrapper for Annoy ☆58 Updated 2 years ago
- Spark library for doing exploratory data analysis in a scalable way ☆43 Updated 9 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead ☆52 Updated 6 years ago
- Spark Parameter Optimization and Tuning ☆31 Updated 6 years ago
- Seldon Spark Jobs ☆26 Updated 9 years ago
- Utilities to work with Scala/Java code with py4j ☆40 Updated last year
- FluRS: A Python library for streaming recommendation algorithms ☆109 Updated 2 years ago
- Locality Sensitive Hashing for Apache Spark ☆88 Updated 2 years ago
- Utilities and examples to assist in working with PySpark and Cassandra. ☆36 Updated 9 years ago
- functionstest ☆33 Updated 8 years ago
- A simple demonstration of sub-sequence sampling as used for anomaly detection with EKG signals ☆102 Updated 4 years ago
- Repo for experiments on pyspark and sklearn ☆79 Updated 10 years ago
- This toolkit provides an implementation of Modified Adsorption (MAD), a graph-based semi-supervised learning (SSL) algorithm. ☆23 Updated 7 years ago
- Luigi Plugin for Hubot ☆35 Updated 8 years ago
- A tool and library for easily deploying applications on Apache YARN ☆142 Updated 10 months ago
- A tool for running Spark on Google Compute Engine ☆16 Updated 8 years ago
- An Apache Spark-shell backend for IPython ☆105 Updated 3 years ago
- Cython based wrapper for libavro ☆25 Updated 4 years ago
- Building Annoy Index on Apache Spark ☆72 Updated 4 years ago
- Factorization Machines on Spark and Glint ☆25 Updated 8 years ago
- Demo code contrasting Google Dataflow (Apache Beam) with Apache Spark ☆14 Updated 8 years ago
- Unified interface for local and distributed ndarrays ☆158 Updated 6 years ago
- ☆31 Updated 4 years ago
- Simplified tree-based classifier and regressor for interpretable machine learning (scikit-learn compatible) ☆47 Updated 3 years ago
- Wabbit Wappa is a full-featured Python wrapper for the Vowpal Wabbit machine learning utility. ☆101 Updated 7 years ago