A tutorial that helps big data engineers ramp up faster by getting familiar with PySpark DataFrames and functions. It also covers topics like EMR sizing, Google Colaboratory, fine-tuning PySpark jobs, and more.
☆20Nov 12, 2021Updated 4 years ago
Alternatives and similar repositories for intro-to-colab-pyspark-emr
Users that are interested in intro-to-colab-pyspark-emr are comparing it to the libraries listed below.
- ☆18Nov 9, 2025Updated 6 months ago
- ☆14Oct 1, 2022Updated 3 years ago
- Implementation of Spark code in Jupyter notebook. Topics include: RDDs and DataFrame, exploratory data analysis (EDA), handling multiple …☆30Aug 26, 2020Updated 5 years ago
- sql-for-data-engineering-course☆18May 12, 2023Updated 3 years ago
- Reddit Data Science Project Ideas☆11Dec 28, 2019Updated 6 years ago
- ☆10Oct 17, 2021Updated 4 years ago
- This repository contains SQL queries from various popular online learning resources e.g. Vertabelo Academy, SQLZoo etc.☆48Jun 22, 2019Updated 6 years ago
- pdfplot is a Python library for easily managing your matplotlib figures as PDF files.☆13Jul 8, 2020Updated 5 years ago
- https://adventofcode.com/2024☆12Dec 25, 2024Updated last year
- Materials for PyCon 2016 in Portland, Oregon☆10Aug 30, 2015Updated 10 years ago
- A complement to ANTLR to get a model from your AST and transform it☆14Apr 20, 2020Updated 6 years ago
- Docker image that runs a single cron job to sync files with S3 as defined via environment variables☆17Feb 22, 2024Updated 2 years ago
- Official TensorFlow code for the paper "DeepWay: a Deep Learning Waypoint Estimator for Global Path Generation".☆11Jun 24, 2022Updated 3 years ago
- Decorators for logging purposes for all your dataframes☆15Jan 31, 2025Updated last year
- ☆13Mar 18, 2019Updated 7 years ago
- ☆12Nov 12, 2022Updated 3 years ago
- Dockerfile for audiogrep and pocketsphinx☆12Oct 12, 2016Updated 9 years ago
- Demo repository for running eBPF in GitHub Actions☆23Mar 27, 2025Updated last year
- Repo to host a comprehensive list of all my Public Gists with a short description for each item and a link to the Gist pages in question.…☆16Apr 27, 2021Updated 5 years ago
- Cosine Similarity Search in ElasticSearch + FAISS GPU☆12Mar 24, 2022Updated 4 years ago
- CVE database☆21Sep 2, 2020Updated 5 years ago
- Web Scraping and Knowledge Graphs with Machine Learning [Guide]☆10Jul 1, 2021Updated 4 years ago
- ☆13Jun 7, 2024Updated last year
- A github action for detecting a "trigger" in a pull request description or comment☆13May 16, 2026Updated 2 weeks ago
- Build languages on Python.☆12May 2, 2021Updated 5 years ago
- fabric8-analytics API server☆16May 1, 2023Updated 3 years ago
- The repository contains all the work including projects, notes, and articles related to ML Engineering while I am learning.☆10Dec 4, 2022Updated 3 years ago
- Open-source, knowledge-grounded conversational assistant☆14Jun 30, 2025Updated 11 months ago
- Homepage of Software Engineering for Machine Learning☆17Updated this week
- Automating some of the work of volunteers of covid19india.org☆15May 22, 2023Updated 3 years ago
- Pyspark in Google Colab: A simple machine learning (Linear Regression) model☆39Apr 15, 2019Updated 7 years ago
- ☆58Mar 28, 2023Updated 3 years ago
- MetroMaps Release☆16May 8, 2014Updated 12 years ago
- Vector Database Lite (like SQLITE but for vectors)☆13Jul 10, 2022Updated 3 years ago
- ☆23Apr 3, 2018Updated 8 years ago
- ☆18Oct 21, 2021Updated 4 years ago
- This repository includes my codes for Udemy [“Data Science, Deep Learning, & Machine Learning with Python”](https://www.udemy.com/dat…☆13Aug 27, 2018Updated 7 years ago
- Distributed Training of Bayesian Neural Networks at Scale☆11May 26, 2020Updated 6 years ago
- Create a data visualization using Polymer and WebGL☆15Jun 10, 2017Updated 8 years ago