AmadeusITGroup / spark-perf-hikes
Performance Hikes for Apache Spark
☆30 Updated 2 months ago
Alternatives and similar repositories for spark-perf-hikes
Users who are interested in spark-perf-hikes compare it to the libraries listed below.
- An sbt plugin to automatically update the release notes file. ☆10 Updated this week
- Spark style guide ☆262 Updated 11 months ago
- Code snippets used in demos recorded for the blog. ☆38 Updated 2 weeks ago
- Custom PySpark Data Sources ☆65 Updated last week
- This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spa… ☆785 Updated this week
- Apache Spark Connector for SQL Server and Azure SQL ☆287 Updated 6 months ago
- The official repository for the Rock the JVM Spark Optimization with Scala course ☆58 Updated last year
- An example of SparkConnect extension. ☆14 Updated last year
- Apache Spark testing helpers (dependency free & works with Scalatest, uTest, and MUnit) ☆450 Updated 3 weeks ago
- Delta Lake Website ☆25 Updated last week
- Demo of using the Nutter for testing of Databricks notebooks in the CI/CD pipeline ☆153 Updated last year
- Version 1 of Technical Best Practices of Azure Databricks based on real world Customer and Technical SME inputs ☆461 Updated last year
- Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster. ☆293 Updated last month
- End-to-end Azure Databricks Workspace automation with Azure Pipelines ☆23 Updated last year
- Power BI REST API function wrappers for sending Spark data to Power BI Push Datasets ☆15 Updated 6 years ago
- Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive ☆185 Updated 2 years ago
- Essential Spark extensions and helper methods ✨😲 ☆763 Updated 2 months ago
- Avro SerDe for Apache Spark structured APIs. ☆235 Updated 2 months ago
- Flowchart for debugging Spark applications ☆107 Updated 11 months ago
- A Python Library to support running data quality rules while the spark job is running⚡ ☆189 Updated last week
- Qubole Sparklens tool for performance tuning Apache Spark ☆583 Updated last year
- A Spark plugin for reading and writing Excel files ☆513 Updated last week
- Delta Lake helper methods in PySpark ☆325 Updated last year
- The iterative broadcast join example code. ☆70 Updated 7 years ago
- Code samples, etc. for Databricks ☆65 Updated 3 months ago
- ☆95 Updated 3 years ago
- This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse. ☆45 Updated 7 months ago
- Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs ☆238 Updated 6 months ago
- PySpark test helper methods with beautiful error messages ☆713 Updated last month
- pyspark methods to enhance developer productivity 📣 👯 🎉 ☆675 Updated 6 months ago