gautamsm / data-science-on-mpp
A collection of examples illustrating data processing, data science, and machine learning on the Pivotal Greenplum and HAWQ MPP databases
☆20 Updated 8 years ago
Alternatives and similar repositories for data-science-on-mpp:
Users who are interested in data-science-on-mpp are comparing it to the repositories listed below.
- This project contains the code to translate between Apache Spark and SFrame. ☆21 Updated 8 years ago
- Tutorial for Deploying Anaconda Cluster and PySpark on top of Red Hat Storage GlusterFS ☆8 Updated 9 years ago
- A simple example of containerized data science with python and Docker. ☆51 Updated 6 years ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead ☆52 Updated 6 years ago
- REST web service for scoring PMML models ☆50 Updated 11 years ago
- ☆41 Updated 7 years ago
- feng - feature engineering for machine-learning champions ☆27 Updated 7 years ago
- Proposals for new Jupyter subprojects to enter into incubation ☆18 Updated 4 years ago
- A library that allows serialization of SciKit-Learn estimators into PMML ☆70 Updated 5 years ago
- Tools for performing hyperparameter search with Scikit-Learn and Dask http://dask-searchcv.readthedocs.io ☆11 Updated 7 years ago
- Spark Parameter Optimization and Tuning ☆31 Updated 6 years ago
- The slides, code examples and resources for the PyCon 2015 Ireland talk on building data pipelines ☆13 Updated 9 years ago
- Demo code contrasting Google Dataflow (Apache Beam) with Apache Spark ☆14 Updated 8 years ago
- Spark library for doing exploratory data analysis in a scalable way ☆43 Updated 9 years ago
- Training materials for Strata, AMP Camp, etc ☆150 Updated 9 years ago
- Machine learning evaluation database ☆24 Updated 6 years ago
- Collection of tutorials on text analytics/NLP, including vector space models, neural language models and topic models on the Pivotal MPP … ☆17 Updated 8 years ago
- Predicting sales with Pandas ☆15 Updated 9 years ago
- Machine Learning with Scikit-Learn (material for pydata Amsterdam 2016) ☆30 Updated 8 years ago
- Repo for experiments on pyspark and sklearn ☆79 Updated 10 years ago
- A simple introduction to using spark ml pipelines ☆26 Updated 6 years ago
- Docker container with a PyData stack and JupyterHub server ☆37 Updated 8 years ago
- A couple projects using scikit-learn illustrating project decision making. ☆15 Updated 8 years ago
- Natural Language Processing with Spark's MLlib ☆62 Updated 7 years ago
- Material and slides for Boston NLP meetup May 23rd 2016 ☆17 Updated 8 years ago
- Simple validator for submissions to DrivenData competitions ☆19 Updated 5 years ago
- These are the IPython notebook files for the CSC 432 Spring '13 course. ☆23 Updated 9 years ago
- Common post-estimation tasks for scikit-learn ☆17 Updated 8 years ago