PiercingDan / spark-Jupyter-AWS
A guide on how to set up Jupyter with PySpark painlessly on AWS EC2 clusters, with S3 I/O support
☆261 Updated 7 years ago
Alternatives and similar repositories for spark-Jupyter-AWS:
Users who are interested in spark-Jupyter-AWS are comparing it to the repositories listed below
- Content for architecting a data science platform for products using Luigi, Spark & Flask. ☆163 Updated 5 years ago
- ☆263 Updated 5 years ago
- ☆84 Updated 7 years ago
- An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parse… ☆90 Updated 9 years ago
- Deep Learning for Pugs ☆74 Updated 7 years ago
- Example unit tests for Apache Spark Python scripts using the py.test framework ☆84 Updated 9 years ago
- VM based deployment for prototyping Big Data tools on Amazon Web Services ☆128 Updated 4 years ago
- Quickstart PySpark with Anaconda on AWS/EMR ☆53 Updated 8 years ago
- Curated list of all dataset websites that I find ☆84 Updated 6 years ago
- PyData NYC 2015 conference ☆94 Updated 9 years ago
- Sample repo for luigi tasks & config ☆36 Updated 8 years ago
- Magic functions for using Jupyter Notebook with Apache Spark and a variety of SQL databases. ☆171 Updated 6 years ago
- PyData Seattle 2015: Python Data Bikeshed ☆127 Updated 9 years ago
- Repository for PyCon 2016 workshop Natural Language Processing in 10 Lines of Code ☆239 Updated 7 years ago
- Model assisted random sampling. ☆120 Updated 4 years ago
- Directory of Jupyter notebooks exploring various topics ☆316 Updated 8 years ago
- Observations from Ian on successfully delivering data science products ☆543 Updated 3 years ago
- Arbalest is a Python data pipeline orchestration library for Amazon S3 and Amazon Redshift. It automates data import into Redshift and ma… ☆41 Updated 9 years ago
- This is a repo documenting the best practices in PySpark. ☆463 Updated 2 years ago
- ☆317 Updated 4 years ago
- DePy 2015 Talk ☆117 Updated 7 years ago
- Analyze the structure and dynamics of an open source project's developer community, using graph algorithms, etc. ☆58 Updated 4 years ago
- Standard evaluations for binary classifiers so you don't have to ☆314 Updated 6 years ago
- DonorsChoose.org Data Science Team Opensource Code ☆77 Updated 2 years ago
- ☆216 Updated 5 years ago
- Algorithm's team Jupyter Notebooks ☆113 Updated 8 years ago
- ☆160 Updated 8 years ago
- A pure Python implementation of Apache Spark's RDD and DStream interfaces. ☆268 Updated 8 months ago
- ☆34 Updated 9 years ago
- Open source Flotilla ☆193 Updated this week