A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics like EMR sizing, Google Colaboratory, fine-tuning PySpark jobs, and much more.
☆20 · Nov 12, 2021 · Updated 4 years ago
Alternatives and similar repositories for intro-to-colab-pyspark-emr
Users that are interested in intro-to-colab-pyspark-emr are comparing it to the libraries listed below.
- ☆18 · Nov 9, 2025 · Updated 7 months ago
- Implementation of Spark code in Jupyter notebook. Topics include: RDDs and DataFrame, exploratory data analysis (EDA), handling multiple … ☆30 · Aug 26, 2020 · Updated 5 years ago
- Example of a Streamlit data app powered by Vaex ☆11 · Jul 7, 2022 · Updated 3 years ago
- Simple GUI to load a PDF/Docx/txt file and have LM Studio Answer based off of it. ☆14 · Jul 31, 2024 · Updated last year
- ☆10 · Oct 17, 2021 · Updated 4 years ago
- ☆34 · Jul 27, 2021 · Updated 4 years ago
- Bank Marketing data classification ☆12 · Oct 2, 2020 · Updated 5 years ago
- The open source version of the Amazon EMR Release Guide. You can submit feedback & requests for changes by submitting issues in this repo… ☆29 · Jun 15, 2023 · Updated 3 years ago
- Peakrs Dataframe is a library and framework that facilitates the extraction, transformation, and loading (ETL) of data. ☆18 · Oct 26, 2023 · Updated 2 years ago
- TuneTables is a tabular classifier that implements prompt tuning for frozen prior-fitted networks. ☆24 · Mar 31, 2025 · Updated last year
- A complement to ANTLR to get a model from your AST and transform it ☆14 · Apr 20, 2020 · Updated 6 years ago
- Docker image that runs a single cron job to sync files with S3 as defined via environment variables ☆17 · Feb 22, 2024 · Updated 2 years ago
- ☆19 · Jan 20, 2024 · Updated 2 years ago
- Decorators for logging purposes for all your dataframes ☆15 · Jan 31, 2025 · Updated last year
- List of books I have read related to development, user experience design, entrepreneurship, and management ☆20 · Nov 9, 2022 · Updated 3 years ago
- Dockerfile for audiogrep and pocketsphinx ☆12 · Oct 12, 2016 · Updated 9 years ago
- Demo repository for running eBPF in GitHub Actions ☆23 · Mar 27, 2025 · Updated last year
- A platform for storing large semantic networks on MongoDB ☆22 · Jun 20, 2011 · Updated 14 years ago
- PyTorch Implementation of A Deep Learning System for Predicting Size and Fit in Fashion E-Commerce (RecSys'19) ☆14 · Aug 23, 2021 · Updated 4 years ago
- Cosine Similarity Search in ElasticSearch + FAISS GPU ☆12 · Mar 24, 2022 · Updated 4 years ago
- CVE database ☆21 · Sep 2, 2020 · Updated 5 years ago
- Web Scraping and Knowledge Graphs with Machine Learning [Guide] ☆10 · Jul 1, 2021 · Updated 4 years ago
- ☆13 · Jun 7, 2024 · Updated 2 years ago
- Control flow graph and test requirement generation for Java code. ☆15 · Nov 19, 2014 · Updated 11 years ago
- fabric8-analytics API server ☆16 · May 1, 2023 · Updated 3 years ago
- ☆16 · Jul 13, 2022 · Updated 3 years ago
- Homepage of Software Engineering for Machine Learning ☆17 · May 25, 2026 · Updated 3 weeks ago
- Course content for Practical AI on the Google Cloud Platform ☆11 · Aug 4, 2020 · Updated 5 years ago
- Neural network sequence labeling model ☆11 · Dec 28, 2019 · Updated 6 years ago
- MetroMaps Release ☆16 · May 8, 2014 · Updated 12 years ago
- BAD: BiAs Detection for Large Language Models in the context of candidate screening (EECS 692) ☆12 · Feb 14, 2024 · Updated 2 years ago
- Vector Database Lite (like SQLITE but for vectors) ☆13 · Jul 10, 2022 · Updated 3 years ago
- This repository contains the codebase mentioned and used in trains' blogs ☆11 · Jun 11, 2026 · Updated last week
- ☆18 · Oct 21, 2021 · Updated 4 years ago
- This repository includes my codes for Udemy [“Data Science, Deep Learning, & Machine Learning with Python”](https://www.udemy.com/dat… ☆13 · Aug 27, 2018 · Updated 7 years ago
- DocQues answers queries on longer and multiple documents, built on GPT-Index and GPT-3 ☆13 · Jan 1, 2023 · Updated 3 years ago
- Metadata browser/editor. QGIS plugin. ☆12 · Dec 25, 2024 · Updated last year
- ☆14 · Feb 9, 2022 · Updated 4 years ago
- Templates for working with CUDA in Nix ☆22 · Nov 1, 2025 · Updated 7 months ago