Quickstart PySpark with Anaconda on AWS/EMR
☆52 · Jan 9, 2017 · Updated 9 years ago
Alternatives and similar repositories for emr-bootstrap-pyspark
Users that are interested in emr-bootstrap-pyspark are comparing it to the libraries listed below.
- A toolset to streamline running spark python on EMR ☆20 · Nov 16, 2016 · Updated 9 years ago
- Deploy sentiment analysis using Flask ☆17 · Oct 27, 2019 · Updated 6 years ago
- ☆11 · Oct 11, 2022 · Updated 3 years ago
- A simple Spark TDD example ☆26 · Sep 19, 2017 · Updated 8 years ago
- Resources and Materials for MATLAB Probability class ☆10 · Oct 23, 2015 · Updated 10 years ago
- Code for my paper "Fixed-Form Variational Posterior Approximation through Stochastic Linear Regression" ☆11 · Sep 15, 2013 · Updated 12 years ago
- Code supporting Data Science articles at The Marketing Technologist, Floryn Tech Blog, and Pythom.nl ☆71 · Mar 17, 2023 · Updated 3 years ago
- Automating LTV Percentage ☆10 · Jun 7, 2021 · Updated 4 years ago
- ☆10 · May 11, 2019 · Updated 6 years ago
- ☆16 · Jun 27, 2020 · Updated 5 years ago
- An R package to gather, munge, and convert event datasets into temporal event-networks. ☆11 · Mar 28, 2018 · Updated 8 years ago
- A Gentle introduction to Machine Learning with Apache Spark ☆11 · Mar 2, 2026 · Updated 3 weeks ago
- aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-firs… ☆11 · Jun 10, 2015 · Updated 10 years ago
- This code demonstrates the architecture featured on the AWS Big Data blog (https://aws.amazon.com/blogs/big-data/) which creates a concu… ☆76 · Oct 30, 2018 · Updated 7 years ago
- Sample notebooks for using the Global Database of Events, Language and Tone (GDELT). ☆19 · Nov 8, 2020 · Updated 5 years ago
- An Exploratory Data Analysis on the World Bank Dataset. ☆13 · Apr 8, 2020 · Updated 5 years ago
- PETRARCH actor, agent and verb dictionaries ☆22 · Aug 3, 2018 · Updated 7 years ago
- A set of widgets for Python's Orange Machine Learning to work with Apache Spark ML ☆15 · Dec 24, 2016 · Updated 9 years ago
- A Terraform module to create an Amazon Web Services (AWS) Elastic MapReduce (EMR) cluster. ☆39 · Oct 21, 2019 · Updated 6 years ago
- Topic and sentiment analysis of tweets (demo) ☆11 · Mar 21, 2019 · Updated 7 years ago
- ☆12 · Oct 16, 2023 · Updated 2 years ago
- In-database parallel grid-search for XGBoost on Greenplum ☆15 · Mar 1, 2018 · Updated 8 years ago
- Collection of tutorials on text analytics/NLP, including vector space models, neural language models and topic models on the Pivotal MPP … ☆17 · Apr 5, 2016 · Updated 9 years ago
- Appendix ☆14 · Apr 16, 2015 · Updated 10 years ago
- Basic Spark utilities ☆13 · Feb 20, 2025 · Updated last year
- ELK tutorial ☆11 · Mar 15, 2023 · Updated 3 years ago
- Quantitative analysis for traders on Oslo Stock Exchange. Download, plot and play with data from Oslo Børs and Nasdaq OMX ☆10 · Jul 28, 2018 · Updated 7 years ago
- A pyspark lib to validate data quality ☆18 · Nov 11, 2022 · Updated 3 years ago
- API REST boilerplate using Spring Boot and Redis as database ☆13 · Dec 26, 2018 · Updated 7 years ago
- Due to lack of resources on how to deploy kafka with simple SASL authentication (just username and password) and how to write producer an… ☆12 · Dec 29, 2021 · Updated 4 years ago
- ☆22 · Jun 19, 2020 · Updated 5 years ago
- Talk to your computer. You know you want to. ☆11 · Mar 13, 2016 · Updated 10 years ago
- Tools for faster and optimized interaction with Teradata and large datasets. ☆17 · Jul 11, 2018 · Updated 7 years ago
- Auto Service discovery using Eureka and Zuul without Eureka client ☆10 · Apr 4, 2019 · Updated 6 years ago
- User-friendly billing for communal households ☆12 · Jan 6, 2022 · Updated 4 years ago
- Terminal multiplexer library. The core library for pymux, which is a pure Python tmux clone. ☆12 · Feb 20, 2014 · Updated 12 years ago
- Docker container to make running Luigi tasks real easy. ☆11 · Aug 31, 2016 · Updated 9 years ago
- My Tutorial for PyData London ☆26 · Jun 18, 2015 · Updated 10 years ago
- TwsMongo is an example of integration between Mongodb and InteractiveBrokers.com API v.9.66 written in C++. The goal of this application … ☆18 · Aug 10, 2023 · Updated 2 years ago