src-d / sparkpickle
Pure Python implementation of reading SequenceFile-s with pickles written by Spark's saveAsPickleFile()
☆24 Updated 8 years ago
Alternatives and similar repositories for sparkpickle
Users who are interested in sparkpickle are comparing it to the libraries listed below.
- Utilities to work with Scala/Java code with py4j ☆40 Updated 2 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated 2 years ago
- Spark library for doing exploratory data analysis in a scalable way ☆43 Updated 10 years ago
- Asynchronous actions for PySpark ☆48 Updated 4 years ago
- ☆32 Updated 5 years ago
- A tool and library for easily deploying applications on Apache YARN ☆146 Updated last year
- A curated inventory of machine learning methods available on the Apache Spark platform, both in official and third party libraries. ☆66 Updated 8 years ago
- Building Annoy Index on Apache Spark ☆72 Updated 5 years ago
- functionstest ☆33 Updated 9 years ago
- An Apache Spark-shell backend for IPython ☆105 Updated 4 years ago
- Locality Sensitive Hashing for Apache Spark ☆87 Updated 4 years ago
- ☆161 Updated 4 years ago
- A library on top of either pex or conda-pack to make your Python code easily available on a cluster ☆46 Updated this week
- Yggdrasil: Faster Decision Trees Using Column Partitioning in Spark ☆30 Updated 7 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead ☆53 Updated 7 years ago
- Distributed Streaming Quantiles (for PySpark) ☆38 Updated 12 years ago
- Jupyter kernel for scala and spark ☆190 Updated 2 years ago
- Jupyter Notebook extension for Apache Spark integration ☆191 Updated 5 years ago
- control spark-shell from vim ☆11 Updated 9 years ago
- Unified interface for local and distributed ndarrays ☆157 Updated 7 years ago
- Python client for Spark Jobserver Rest API ☆40 Updated 5 years ago
- ☆40 Updated 9 years ago
- ☕⛵ WIP PySpark dependency management ☆22 Updated 7 years ago
- C++ native client for Impala and Hive, with Python / pandas bindings ☆72 Updated 7 years ago
- Apache (Py)Spark type annotations (stub files). ☆118 Updated 3 years ago
- Topic Modeling on Apache Spark ☆94 Updated 6 years ago
- Randomized SVD of large sparse matrices on Spark ☆77 Updated 3 years ago
- A scala-based feature generation and modeling framework ☆60 Updated 7 years ago
- Natural Language Processing with Spark's MLlib ☆63 Updated 8 years ago
- Visualize streaming machine learning in Spark ☆177 Updated 8 years ago