A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics like EMR sizing, Google Colaboratory, fine-tuning PySpark jobs, and much more.
☆20Nov 12, 2021Updated 4 years ago
Alternatives and similar repositories for intro-to-colab-pyspark-emr
Users that are interested in intro-to-colab-pyspark-emr are comparing it to the libraries listed below.
- ☆18Nov 9, 2025Updated 5 months ago
- Source Code for 'Applied Data Science Using PySpark' by Ramcharan Kakarla, Sundar Krishnan, and Sridhar Alla☆48May 18, 2021Updated 4 years ago
- Implementation of Spark code in Jupyter notebook. Topics include: RDDs and DataFrame, exploratory data analysis (EDA), handling multiple …☆30Aug 26, 2020Updated 5 years ago
- Reddit Data Science Project Ideas☆11Dec 28, 2019Updated 6 years ago
- ☆10Oct 17, 2021Updated 4 years ago
- Named entity relevant project☆30Aug 1, 2020Updated 5 years ago
- Personal notes for the book Building Microservices by Sam Newman☆16Oct 3, 2020Updated 5 years ago
- Docker image that runs a single cron job to sync files with S3 as defined via environment variables☆17Feb 22, 2024Updated 2 years ago
- Official TensorFlow code for the paper "DeepWay: a Deep Learning Waypoint Estimator for Global Path Generation".☆11Jun 24, 2022Updated 3 years ago
- ☆13Mar 18, 2019Updated 7 years ago
- Starter template for python projects☆18Feb 15, 2024Updated 2 years ago
- List of books I have read related to development, user experience design, entrepreneurship, and management☆20Nov 9, 2022Updated 3 years ago
- Dockerfile for audiogrep and pocketsphinx☆12Oct 12, 2016Updated 9 years ago
- Demo repository for running eBPF in GitHub Actions☆23Mar 27, 2025Updated last year
- Repo to host a comprehensive list of all my Public Gists with a short description for each item and a link to the Gist pages in question.…☆16Apr 27, 2021Updated 4 years ago
- Cosine Similarity Search in ElasticSearch + FAISS GPU☆12Mar 24, 2022Updated 4 years ago
- Web Scraping and Knowledge Graphs with Machine Learning [Guide]☆10Jul 1, 2021Updated 4 years ago
- Build languages on Python.☆12May 2, 2021Updated 4 years ago
- Control flow graph and test requirement generation for Java code.☆14Nov 19, 2014Updated 11 years ago
- The repository contains all the work including projects, notes, and articles related to ML Engineering while I am learning.☆10Dec 4, 2022Updated 3 years ago
- Ensemble of ARIMA, Prophet, and LSTM RNN☆35Aug 26, 2017Updated 8 years ago
- ☆16Jul 13, 2022Updated 3 years ago
- ☆10Nov 25, 2022Updated 3 years ago
- Automating some of the work of volunteers of covid19india.org☆15May 22, 2023Updated 2 years ago
- Pyspark in Google Colab: A simple machine learning (Linear Regression) model☆39Apr 15, 2019Updated 7 years ago
- CapsNet implementation in a minimal manner☆11Nov 17, 2017Updated 8 years ago
- The classification goal is to predict if the client will subscribe to a term deposit (variable y).☆18Jan 29, 2018Updated 8 years ago
- Python wrapper for Google Maps JavaScript API V3 and Google Earth API.☆17Sep 13, 2014Updated 11 years ago
- This repository contains approx. 80+ FREE Courses for Data Science☆19Jun 2, 2022Updated 3 years ago
- MetroMaps Release☆16May 8, 2014Updated 11 years ago
- BAD: BiAs Detection for Large Language Models in the context of candidate screening (EECS 692)☆12Feb 14, 2024Updated 2 years ago
- Vector Database Lite (like SQLITE but for vectors)☆13Jul 10, 2022Updated 3 years ago
- This repository contains the codebase mentioned and used in trains' blogs☆11Jul 25, 2025Updated 8 months ago
- Colab "Jukebox: A Generative Model for Music"☆16Jun 14, 2020Updated 5 years ago
- Workshop materials for AI Engineer World's Fair☆16Jun 3, 2025Updated 10 months ago
- ☆18Oct 21, 2021Updated 4 years ago
- This repository includes my code for Udemy [“Data Science, Deep Learning, & Machine Learning with Python”](https://www.udemy.com/dat…☆13Aug 27, 2018Updated 7 years ago
- DocQues answers queries on longer and multiple documents, built on GPT-Index and GPT-3☆13Jan 1, 2023Updated 3 years ago
- Distributed Training of Bayesian Neural Networks at Scale☆11May 26, 2020Updated 5 years ago