gautamsm / data-science-on-mpp
A collection of examples illustrating data processing, data science, and machine learning on the Pivotal Greenplum and HAWQ MPP databases
☆20 Updated 8 years ago
Alternatives and similar repositories for data-science-on-mpp:
Users who are interested in data-science-on-mpp are comparing it to the libraries listed below:
- A simple example of containerized data science with python and Docker. ☆51 Updated 7 years ago
- feng - feature engineering for machine-learning champions ☆27 Updated 8 years ago
- ☆41 Updated 7 years ago
- Simplified tree-based classifier and regressor for interpretable machine learning (scikit-learn compatible) ☆47 Updated 4 years ago
- Demo code contrasting Google Dataflow (Apache Beam) with Apache Spark ☆14 Updated 8 years ago
- A simple python wrapper over MLJAR API. ☆42 Updated 2 years ago
- Invoke Pandas plotting by piping in SQL output via PSQL (Can be used with Postgres or Greenplum or any SQL engine). ☆16 Updated 10 years ago
- Spark library for doing exploratory data analysis in a scalable way ☆43 Updated 9 years ago
- This project contains the code to translate between Apache Spark and SFrame. ☆20 Updated 8 years ago
- [RETIRED] Converts a notebook to a dashboard and deploys it / downloads it ☆79 Updated 7 years ago
- Collection of dask example notebooks ☆58 Updated 7 years ago
- A Topic Modeling toolbox ☆92 Updated 8 years ago
- Fast, easy and intuitive machine learning prototyping. ☆124 Updated 10 years ago
- Some wrappers around python modules for simplifying the data exploration process. ☆13 Updated 4 months ago
- Deprecated, please use https://github.com/jcrist/skein or https://github.com/dask/dask-yarn instead ☆52 Updated 6 years ago
- Training materials for Strata, AMP Camp, etc ☆149 Updated 9 years ago
- Tools for performing hyperparameter search with Scikit-Learn and Dask http://dask-searchcv.readthedocs.io ☆11 Updated 7 years ago
- A Machine Learning API with native redis caching and export + import using S3. Analyze entire datasets using an API for building, trainin… ☆100 Updated 2 years ago
- personal cheatsheets on various technologies ☆25 Updated 8 years ago
- Probabilistic Data Structures in Python (originally presented at PyData 2013) ☆55 Updated 3 years ago
- Pydata NYC 2014 Scikit Learn Tutorial ☆64 Updated 10 years ago
- Dato/Turi DS Conf talk on NLP and Elasticsearch analysis of reviews, plus JS implementation ☆45 Updated 8 years ago
- Sample applications built using AWS' Amazon Machine Learning. ☆51 Updated 7 years ago
- My talk at Strata 2014 in Santa Clara, CA ☆73 Updated 11 years ago
- A global, black box optimization engine for real world metric optimization. ☆66 Updated 10 years ago
- PMML evaluator library for the PostgreSQL database (http://www.postgresql.org/) ☆11 Updated 10 years ago
- A Python wrapper for MADlib (http://madlib.net) - an open source library for scalable in-database machine learning algorithms ☆63 Updated 4 years ago
- Deploy sentiment analysis using Flask ☆17 Updated 5 years ago
- Using Pandas easily with Cassandra ☆23 Updated 7 years ago
- Experimental parallel data analysis toolkit. ☆121 Updated 3 years ago