gautamsm / data-science-on-mpp
A collection of examples illustrating data processing, data science, and machine learning on the Pivotal Greenplum and HAWQ MPP databases
☆20 Updated 9 years ago
Alternatives and similar repositories for data-science-on-mpp:
Users who are interested in data-science-on-mpp are comparing it to the repositories listed below.
- This project contains the code to translate between Apache Spark and SFrame. ☆20 Updated 8 years ago
- A simple example of containerized data science with python and Docker. ☆51 Updated 7 years ago
- feng - feature engineering for machine-learning champions ☆27 Updated 8 years ago
- Spark library for doing exploratory data analysis in a scalable way ☆43 Updated 9 years ago
- A simple introduction to using spark ml pipelines ☆26 Updated 7 years ago
- Repo for experiments on pyspark and sklearn ☆79 Updated 11 years ago
- Spark Parameter Optimization and Tuning ☆31 Updated 7 years ago
- ☆41 Updated 7 years ago
- Tutorial for Deploying Anaconda Cluster and PySpark on top of Red Hat Storage GlusterFS ☆8 Updated 10 years ago
- A short guide for transitioning from Python to Scala ☆65 Updated 9 years ago
- A place for all things Pivotal & R ☆25 Updated 3 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead ☆52 Updated 6 years ago
- PyMC version 3 (PyMC 2 is in branch 2.3) ☆27 Updated 10 years ago
- code and slides for my PyGotham 2016 talk, "Higher-level Natural Language Processing with textacy" ☆15 Updated 8 years ago
- Machine Learning Open Source Software ☆23 Updated 6 years ago
- Invoke Pandas plotting by piping in SQL output via PSQL (Can be used with Postgres or Greenplum or any SQL engine). ☆16 Updated 10 years ago
- personal cheatsheets on various technologies ☆25 Updated 8 years ago
- A cookiecutter template for Apache Spark applications written in Scala ☆10 Updated 6 years ago
- Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python) ☆74 Updated last year
- Collection of tutorials on text analytics/NLP, including vector space models, neural language models and topic models on the Pivotal MPP … ☆17 Updated 9 years ago
- Training materials for Strata, AMP Camp, etc ☆149 Updated 9 years ago
- Docker container with a PyData stack and JupyterHub server ☆37 Updated 8 years ago
- Exploration Library in Java ☆12 Updated last year
- Machine Learning with Scikit-Learn (material for pydata Amsterdam 2016) ☆30 Updated 9 years ago
- Examples for Fast Data Processing with Spark ☆59 Updated 11 years ago
- A Topic Modeling toolbox ☆92 Updated 9 years ago
- An API for Distributed Machine Learning ☆154 Updated 8 years ago
- A library that allows serialization of SciKit-Learn estimators into PMML ☆70 Updated 5 years ago
- ☆16 Updated 11 years ago
- Fast, easy and intuitive machine learning prototyping. ☆124 Updated 10 years ago