Filling in the Spark function gaps across APIs
☆50 · Apr 14, 2021 · Updated 4 years ago
Alternatives and similar repositories for bebe
Users who are interested in bebe are comparing it to the libraries listed below.
- A library that brings useful functions from various modern database management systems to Apache Spark ☆61 · Sep 4, 2023 · Updated 2 years ago
- Write property based tests easily on spark dataframes ☆20 · Jan 19, 2024 · Updated 2 years ago
- Essential Spark extensions and helper methods ✨😲 ☆766 · Sep 14, 2025 · Updated 5 months ago
- Spark functions to run popular phonetic and string matching algorithms ☆59 · Feb 22, 2022 · Updated 4 years ago
- JumpSpark - A modern cookiecutter template for pyspark projects with batteries included. ☆10 · May 12, 2023 · Updated 2 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive ☆187 · Oct 15, 2025 · Updated 4 months ago
- Spark-Radiant is Apache Spark Performance and Cost Optimizer ☆25 · Dec 31, 2024 · Updated last year
- Magic to help Spark pipelines upgrade ☆34 · Sep 29, 2024 · Updated last year
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit) ☆454 · Feb 8, 2026 · Updated 3 weeks ago
- This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark w… ☆16 · Oct 3, 2025 · Updated 5 months ago
- Spark SQL DBF Library ☆16 · Jan 2, 2015 · Updated 11 years ago
- Task Metrics Explorer ☆14 · Apr 2, 2019 · Updated 6 years ago
- A library for Spark DataFrame using MinIO Select API ☆101 · Sep 27, 2019 · Updated 6 years ago
- Spark style guide ☆272 · Sep 30, 2024 · Updated last year
- Set-oriented Operations in Pandas ☆24 · May 27, 2020 · Updated 5 years ago
- Paper: A Zero-rename committer for object stores ☆20 · Nov 7, 2025 · Updated 3 months ago
- Fuzzy matching function in spark (https://spark-packages.org/package/itspawanbhardwaj/spark-fuzzy-matching) ☆24 · Dec 30, 2019 · Updated 6 years ago
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆683 · Mar 6, 2025 · Updated 11 months ago
- Declarative Data Viz 4 Scala ☆53 · Jan 26, 2026 · Updated last month
- Optics for Spark DataFrames ☆47 · Mar 5, 2021 · Updated 4 years ago
- Spark data profiling utilities ☆23 · Nov 24, 2018 · Updated 7 years ago
- A Clojure SAML 2.0 library for SSO ☆23 · Jan 29, 2020 · Updated 6 years ago
- Scala 3.x wrapper for Apache Flink ☆50 · Mar 5, 2023 · Updated 2 years ago
- ☆106 · Jun 25, 2025 · Updated 8 months ago
- A high performance rule induction algorithm (RIPPERk). ☆29 · Aug 25, 2013 · Updated 12 years ago
- GNU APL native interop for Clojure ☆29 · Mar 18, 2022 · Updated 3 years ago
- spark-sight: Spark performance at a glance ☆10 · Apr 6, 2023 · Updated 2 years ago
- Apache (Py)Spark type annotations (stub files). ☆118 · Aug 17, 2022 · Updated 3 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆73 · Mar 14, 2021 · Updated 4 years ago
- IPython notebook storage on OpenStack clouds ☆58 · Nov 28, 2018 · Updated 7 years ago
- A typing game for toddlers aged 2 to 6 ☆10 · May 29, 2020 · Updated 5 years ago
- ☆10 · Jul 1, 2022 · Updated 3 years ago
- A Minimalistic Rust Implementation of Delta Sharing Server. ☆98 · Mar 17, 2025 · Updated 11 months ago
- PDF to JSON, JSON to PDF and etc. ☆12 · Apr 18, 2018 · Updated 7 years ago
- Spark Structured Streaming State Tools ☆34 · Jul 3, 2020 · Updated 5 years ago
- A library that provides useful extensions to Apache Spark and PySpark. ☆232 · Jan 20, 2026 · Updated last month
- This project has customizations like custom data sources, plugins written for the distributed systems like Apache Spark, Apache Ignite et… ☆34 · Oct 6, 2023 · Updated 2 years ago
- Schema Registry integration for Apache Spark ☆40 · Nov 16, 2022 · Updated 3 years ago
- Time Series library for Scala ☆37 · Jul 31, 2022 · Updated 3 years ago