Helpers & syntactic sugar for PySpark.
☆62Dec 4, 2025Updated 5 months ago
Alternatives and similar repositories for sparkly
Users that are interested in sparkly are comparing it to the libraries listed below.
- Spark UDFs to deserialize Avro messages with schemas stored in Schema Registry.☆20Jan 11, 2018Updated 8 years ago
- Collect and aggregate on spark events for profitz☆10Apr 22, 2022Updated 4 years ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark☆30Apr 15, 2026Updated 3 weeks ago
- Load data in BigQuery using Cloud Workflows, Firestore and Cloud Functions.☆11May 12, 2021Updated 4 years ago
- Scala API for Apache Spark SQL high-order functions☆14Aug 4, 2023Updated 2 years ago
- Resilient data pipeline framework running on Apache Spark☆28Apr 28, 2026Updated last week
- CoNLL-U format library for Python☆15Apr 7, 2015Updated 11 years ago
- Apache (Py)Spark type annotations (stub files).☆118Aug 17, 2022Updated 3 years ago
- sparkql: Apache Spark SQL DataFrame schema management for sensible humans☆12Sep 18, 2023Updated 2 years ago
- A boilerplate for writing PySpark Jobs☆394Jan 21, 2024Updated 2 years ago
- Demonstrates how to submit a job to Spark on HDP directly via YARN's REST API from any workstation☆23Apr 18, 2016Updated 10 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection☆17Jan 12, 2017Updated 9 years ago
- Asynchronous actions for PySpark☆48Dec 2, 2021Updated 4 years ago
- HADOOP-CLI is an interactive command line shell that makes interacting with the Hadoop Distributed Filesystem (HDFS) simpler and more intu…☆37Mar 20, 2026Updated last month
- A library that provides useful extensions to Apache Spark and PySpark.☆236Mar 18, 2026Updated last month
- A pyspark lib to validate data quality☆19Nov 11, 2022Updated 3 years ago
- A python package to create a database on the platform using our moj data warehousing framework☆21Mar 16, 2026Updated last month
- Spark Gotchas. A subjective compilation of the Apache Spark tips and tricks☆360Jun 6, 2017Updated 8 years ago
- Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines☆17Jan 21, 2020Updated 6 years ago
- Spark app to merge different schemas☆23Dec 21, 2020Updated 5 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉☆687Mar 6, 2025Updated last year
- Record matching and entity resolution at scale in Spark☆36Oct 31, 2023Updated 2 years ago
- command launcher organised in a tree structure with autocompletion☆13May 4, 2022Updated 4 years ago
- Real-time query spark and visualise it as graph.☆24Oct 4, 2017Updated 8 years ago
- ☆10Feb 23, 2017Updated 9 years ago
- Exercises for the Slithering Into Elasticsearch tutorial at PyCon 2015☆19Apr 9, 2015Updated 11 years ago
- Universal fuzzy selector for macOS comparable with dmenu☆12Apr 22, 2019Updated 7 years ago
- A place to provide Coiled feedback☆29Mar 5, 2025Updated last year
- A simple elasticsearch frontend for serving astrophysical simulation catalog data☆10Mar 14, 2026Updated last month
- A Scalable Data Cleaning Library for PySpark.☆29Apr 4, 2019Updated 7 years ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark☆1,534Dec 2, 2024Updated last year
- A CLI to manage and monitor permissions in AWS Lake Formation☆25Feb 8, 2023Updated 3 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures.☆73Mar 14, 2021Updated 5 years ago
- A curated list of awesome Apache Spark packages and resources.☆1,878Feb 27, 2026Updated 2 months ago
- ACID and BASE transactions explained☆15May 18, 2025Updated 11 months ago
- Run FeatureTools to automate Feature Engineering distributionally on Spark.☆11Oct 11, 2018Updated 7 years ago
- ☆19Oct 29, 2014Updated 11 years ago
- something to help you spark☆65Oct 23, 2018Updated 7 years ago
- An OpenCalais API Interface for Python.☆21Mar 13, 2012Updated 14 years ago