jmankoff / data
The repository for the CMU Data Pipeline course. This year's course should use the 2017 branch.
☆40Updated 7 years ago
Alternatives and similar repositories for data:
Users who are interested in data are comparing it to the repositories listed below.
- Data and code for "Fast Data Applications with Spark and Python"☆25Updated 8 years ago
- This repository contains code files, specifically IPython notebooks, for the assignments in the course "Scalable Machine Learning" by UC Be…☆30Updated 9 years ago
- Code & Data for Introduction to Machine Learning with Scikit-Learn☆81Updated 6 years ago
- Tutorials and samples that show you how to get the most out of IBM Analytics for Apache Spark☆79Updated 7 years ago
- Code for the Kaggle acquire valued shoppers challenge☆66Updated 11 years ago
- Repo containing problem set solutions and other resources associated with Stanford’s course CS224d: Deep Learning for Natural Language Pr…☆15Updated 8 years ago
- Pydata NYC 2014 Scikit Learn Tutorial☆64Updated 10 years ago
- Supporting content (slides and exercises) for the Addison-Wesley (Pearson) video series covering best practices for developing scalable S…☆66Updated 9 years ago
- ☆66Updated last year
- Real-time Machine Learning with Apache Spark on Twitter Public Stream☆68Updated 8 years ago
- Some notebook examples related to Apache Spark, IPython / Jupyter, Zeppelin☆52Updated 8 years ago
- Code example to predict prices of Airbnb vacation rentals, using scikit-learn on Spark with spark-sklearn, on MapR.☆44Updated 8 years ago
- Simple demonstration of how to build a complex real time machine learning visualization tool.☆16Updated 9 years ago
- A simple introduction to using Spark ML pipelines☆26Updated 7 years ago
- Word2Vec models with Twitter data using Spark. Blog:☆65Updated 6 years ago
- Code and Notebooks for the Natural Language Processing with Python course.☆66Updated 7 years ago
- Feature Engineering with Pipeline Talk at ODSC West 2016, Santa Clara☆17Updated 8 years ago
- Materials for Strata NYC 2016 scikit-learn tutorial☆15Updated 8 years ago
- Pydata Dallas 2015 Scikit-Learn Tutorial☆62Updated 10 years ago
- This project contains the code to translate between Apache Spark and SFrame.☆20Updated 8 years ago
- EuroScipy 2014 tutorial: Introduction to predictive analytics with pandas and scikit-learn☆84Updated 10 years ago
- Spark in Kaggle competitions☆10Updated 9 years ago
- PyTennessee 2014: Statistical Data Analysis in Python☆85Updated 10 years ago
- Jupyter notebook containing code from text preprocessing blog post☆10Updated 8 years ago
- Modeling Social Data, Applied Mathematics, Columbia University (Spring 2015)☆33Updated 5 years ago
- Training materials for Strata, AMP Camp, etc☆149Updated 9 years ago
- This repository contains code files, specifically IPython notebooks, for the assignments in the course "Introduction to Big Data with Apach…☆115Updated 8 months ago
- Data Science in 30 Minutes #5: Spark☆19Updated 8 years ago
- Data directory for the CS109 Data Science course☆66Updated 10 years ago
- Material for ODSCON San Francisco 2015☆79Updated 9 years ago