Apache (Py)Spark type annotations (stub files).
☆118Aug 17, 2022Updated 3 years ago
Alternatives and similar repositories for pyspark-stubs
Users that are interested in pyspark-stubs are comparing it to the libraries listed below.
- Asynchronous actions for PySpark☆47Dec 2, 2021Updated 4 years ago
- Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks☆360Jun 6, 2017Updated 8 years ago
- A pyspark lib to validate data quality☆19Nov 11, 2022Updated 3 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉☆687Mar 6, 2025Updated last year
- Storm Database Explorer - Developing Data Products course project.☆11May 3, 2017Updated 9 years ago
- Mirror of Apache Toree (Incubating)☆750May 15, 2026Updated last week
- Jupyter magics and kernels for working with remote Spark clusters☆1,361Sep 9, 2025Updated 8 months ago
- pytest plugin to run the tests with support of pyspark☆88May 21, 2025Updated last year
- ☆11Aug 22, 2023Updated 2 years ago
- Spark style guide☆270Sep 30, 2024Updated last year
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection☆17Jan 12, 2017Updated 9 years ago
- Spark functions to run popular phonetic and string matching algorithms☆60Feb 22, 2022Updated 4 years ago
- A Hivemall wrapper for Spark☆31Apr 21, 2016Updated 10 years ago
- Base classes to use when writing tests with Spark☆1,554Apr 20, 2026Updated last month
- SparklingGraph provides an easy-to-use set of features for processing large-scale graphs using Spark and GraphX.☆154Jul 31, 2020Updated 5 years ago
- Test suite to document the behavior of Spark☆21Apr 15, 2021Updated 5 years ago
- Real-world Spark pipelines examples☆82Feb 27, 2018Updated 8 years ago
- A toolset to streamline running spark python on EMR☆20Nov 16, 2016Updated 9 years ago
- This repository contains the development code for sparkMeasure, an Apache Spark performance analysis and troubleshooting library. It simp…☆823May 19, 2026Updated last week
- A boilerplate for writing PySpark Jobs☆394Jan 21, 2024Updated 2 years ago
- Filling in the Spark function gaps across APIs☆50Apr 14, 2021Updated 5 years ago
- A tool for running Spark on Google Compute Engine☆16Jan 20, 2017Updated 9 years ago
- Apache Spark on Kubernetes☆19Mar 19, 2017Updated 9 years ago
- ☆14Jan 12, 2017Updated 9 years ago
- Parametrize and run scripts as notebooks with jupytext and papermill☆18Sep 29, 2019Updated 6 years ago
- A curated list of awesome Apache Spark packages and resources.☆1,879Feb 27, 2026Updated 3 months ago
- A Spark Atlas connector to track data lineage in Apache Atlas☆268Nov 16, 2022Updated 3 years ago
- Redis search and indexing in Java☆16Sep 26, 2016Updated 9 years ago
- A simple example for PySpark based project.☆11Jun 3, 2016Updated 9 years ago
- Spark + Jupyter + Hive☆16Sep 22, 2015Updated 10 years ago
- CLI tool to launch Spark jobs on AWS EMR☆67Oct 18, 2023Updated 2 years ago
- Spark data profiling utilities☆23Nov 24, 2018Updated 7 years ago
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)☆455Apr 2, 2026Updated last month
- ☆524Mar 1, 2026Updated 2 months ago
- low-level helpers for Apache Spark libraries and tests☆16Dec 29, 2018Updated 7 years ago
- Utilities to help HBase as a service in HDInsight Azure☆14Aug 30, 2023Updated 2 years ago
- ☆18Aug 28, 2024Updated last year
- This application comes as Spark2.1-as-Service-Provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server☆50Jul 16, 2023Updated 2 years ago
- Demonstrates how to submit a job to Spark on HDP directly via YARN's REST API from any workstation☆23Apr 18, 2016Updated 10 years ago