Helpers & syntactic sugar for PySpark.
☆62 · Dec 4, 2025 · Updated 4 months ago
Alternatives and similar repositories for sparkly
Users that are interested in sparkly are comparing it to the libraries listed below.
- Collect and aggregate on spark events for profitz ☆10 · Apr 22, 2022 · Updated 3 years ago
- A low-overhead sampling profiler for PySpark, that outputs Flame Graphs ☆16 · Dec 17, 2020 · Updated 5 years ago
- Scala API for Apache Spark SQL high-order functions ☆14 · Aug 4, 2023 · Updated 2 years ago
- Resilient data pipeline framework running on Apache Spark ☆27 · Updated this week
- CoNLL-U format library for Python ☆15 · Apr 7, 2015 · Updated 11 years ago
- Functional approach to query database using SQLAlchemy ☆22 · May 12, 2020 · Updated 5 years ago
- Apache (Py)Spark type annotations (stub files). ☆118 · Aug 17, 2022 · Updated 3 years ago
- A boilerplate for writing PySpark Jobs ☆395 · Jan 21, 2024 · Updated 2 years ago
- Demonstrates how to submit a job to Spark on HDP directly via YARN's REST API from any workstation ☆23 · Apr 18, 2016 · Updated 10 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆17 · Jan 12, 2017 · Updated 9 years ago
- Asynchronous actions for PySpark ☆48 · Dec 2, 2021 · Updated 4 years ago
- libpostal wrapper python package for windows ☆18 · Aug 12, 2023 · Updated 2 years ago
- A library that provides useful extensions to Apache Spark and PySpark. ☆236 · Mar 18, 2026 · Updated last month
- A python package to create a database on the platform using our moj data warehousing framework ☆21 · Mar 16, 2026 · Updated last month
- An infinite canvas built for brainstorming. ☆13 · Jul 31, 2024 · Updated last year
- Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks ☆360 · Jun 6, 2017 · Updated 8 years ago
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines ☆17 · Jan 21, 2020 · Updated 6 years ago
- An idiomatic Rust wrapper for the V8 Javascript engine ☆12 · Sep 7, 2018 · Updated 7 years ago
- Web app for streamhut ☆15 · Jul 12, 2020 · Updated 5 years ago
- Spark app to merge different schemas ☆23 · Dec 21, 2020 · Updated 5 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆687 · Mar 6, 2025 · Updated last year
- Record matching and entity resolution at scale in Spark ☆36 · Oct 31, 2023 · Updated 2 years ago
- command launcher organised in a tree structure with autocompletion ☆13 · May 4, 2022 · Updated 3 years ago
- Real-time query spark and visualise it as graph. ☆24 · Oct 4, 2017 · Updated 8 years ago
- ☆13 · Jun 14, 2017 · Updated 8 years ago
- An implementation of Kensler's hashed permutation algorithm ☆17 · Jan 31, 2025 · Updated last year
- Sketch and LSH Index library for Java, including OPH methods as well as the Lazo method ☆15 · Dec 24, 2023 · Updated 2 years ago
- some helpers to create swagger output from a pecan app ☆10 · Aug 12, 2016 · Updated 9 years ago
- Universal fuzzy selector for macOS comparable with dmenu ☆12 · Apr 22, 2019 · Updated 6 years ago
- A place to provide Coiled feedback ☆29 · Mar 5, 2025 · Updated last year
- ☆13 · Feb 11, 2019 · Updated 7 years ago
- A simple elasticsearch frontend for serving astrophysical simulation catalog data ☆10 · Mar 14, 2026 · Updated last month
- ☆18 · Updated this week
- [ESEC/FSE'23] Hue: A User-Adaptive Parser for Hybrid Logs ☆10 · Aug 24, 2023 · Updated 2 years ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,536 · Dec 2, 2024 · Updated last year
- A CLI to manage and monitor permissions in AWS Lake Formation ☆25 · Feb 8, 2023 · Updated 3 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆73 · Mar 14, 2021 · Updated 5 years ago
- A curated list of awesome Apache Spark packages and resources. ☆1,872 · Feb 27, 2026 · Updated last month
- A command-line interface for packaging, deploying, and running your EMR Serverless Spark jobs ☆47 · May 10, 2024 · Updated last year