Apache (Py)Spark type annotations (stub files).
☆118Aug 17, 2022Updated 3 years ago
Alternatives and similar repositories for pyspark-stubs
Users that are interested in pyspark-stubs are comparing it to the libraries listed below.
- Asynchronous actions for PySpark☆48Dec 2, 2021Updated 4 years ago
- Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks☆360Jun 6, 2017Updated 8 years ago
- A pyspark lib to validate data quality☆19Nov 11, 2022Updated 3 years ago
- ☆16May 31, 2017Updated 8 years ago
- Storm Database Explorer - Developing Data Products course project.☆11May 3, 2017Updated 9 years ago
- Schema Registry integration for Apache Spark☆40Nov 16, 2022Updated 3 years ago
- Manage Apache Atlas and Ranger configuration for your Hadoop environment.☆16May 4, 2021Updated 5 years ago
- Mirror of Apache Toree (Incubating)☆750Apr 2, 2026Updated last month
- Jupyter magics and kernels for working with remote Spark clusters☆1,360Sep 9, 2025Updated 7 months ago
- pytest plugin to run the tests with support of pyspark☆88May 21, 2025Updated 11 months ago
- ☆11Aug 22, 2023Updated 2 years ago
- Spark style guide☆271Sep 30, 2024Updated last year
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection☆17Jan 12, 2017Updated 9 years ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark☆1,534Dec 2, 2024Updated last year
- Spark functions to run popular phonetic and string matching algorithms☆59Feb 22, 2022Updated 4 years ago
- A Hivemall wrapper for Spark☆31Apr 21, 2016Updated 10 years ago
- Base classes to use when writing tests with Spark☆1,553Apr 20, 2026Updated 2 weeks ago
- SparklingGraph provides an easy-to-use set of features that give you the ability to process large-scale graphs using Spark and GraphX.☆154Jul 31, 2020Updated 5 years ago
- Test suite to document the behavior of Spark☆21Apr 15, 2021Updated 5 years ago
- Real-world Spark pipelines examples☆82Feb 27, 2018Updated 8 years ago
- CLI Based Browser for S3 Buckets☆14Aug 12, 2016Updated 9 years ago
- A toolset to streamline running spark python on EMR☆20Nov 16, 2016Updated 9 years ago
- This repository contains the development code for sparkMeasure, an Apache Spark performance analysis and troubleshooting library. It simp…☆821Apr 24, 2026Updated last week
- A boilerplate for writing PySpark Jobs☆394Jan 21, 2024Updated 2 years ago
- Filling in the Spark function gaps across APIs☆50Apr 14, 2021Updated 5 years ago
- Apache Spark Website☆136Updated this week
- A tool for running Spark on Google Compute Engine☆16Jan 20, 2017Updated 9 years ago
- ☆14Jan 12, 2017Updated 9 years ago
- A curated list of awesome Apache Spark packages and resources.☆1,876Feb 27, 2026Updated 2 months ago
- Redis search and indexing in Java☆16Sep 26, 2016Updated 9 years ago
- A connector for SingleStore and Spark☆162Apr 17, 2026Updated 2 weeks ago
- CLI tool to launch Spark jobs on AWS EMR☆67Oct 18, 2023Updated 2 years ago
- A client for the Confluent Schema Registry API implemented in Python☆53Mar 18, 2023Updated 3 years ago
- Spark data profiling utilities☆23Nov 24, 2018Updated 7 years ago
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit)☆455Apr 2, 2026Updated last month
- ☆524Mar 1, 2026Updated 2 months ago
- Interactive Audience Analytics with Spark and HyperLogLog☆55Oct 14, 2015Updated 10 years ago
- low-level helpers for Apache Spark libraries and tests☆16Dec 29, 2018Updated 7 years ago
- This application comes as Spark2.1-as-Service-Provider using an embedded, Reactive-Streams-based, fully asynchronous HTTP server☆50Jul 16, 2023Updated 2 years ago