BlueGranite / tpc-ds-dataset-generator
Generate big TPC-DS datasets with Databricks
☆18 Updated 3 years ago
Alternatives and similar repositories for tpc-ds-dataset-generator:
Users who are interested in tpc-ds-dataset-generator are comparing it to the repositories listed below.
- TPCDS benchmark for various engines ☆18 Updated 3 years ago
- This repo is a collection of tools to deploy, manage and operate a Databricks based Lakehouse. ☆45 Updated 2 months ago
- An Azure Databricks workshop leveraging the New York Taxi and Limousine Commission Trip Records dataset ☆108 Updated last year
- Databricks Migration Tools ☆43 Updated 3 years ago
- Examples surrounding Databricks. ☆57 Updated 9 months ago
- ☆76 Updated 10 months ago
- dbt adapter for Azure Synapse Dedicated SQL Pools ☆71 Updated last week
- Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs ☆235 Updated 2 months ago
- A Spark connector for the Azure Common Data Model ☆15 Updated last year
- A library that brings useful functions from various modern database management systems to Apache Spark ☆58 Updated last year
- Example code for doing DataOps ☆47 Updated 4 years ago
- Sample processing code using Spark 2.1+ and Scala ☆51 Updated 4 years ago
- Yet Another (Spark) ETL Framework ☆20 Updated last year
- Make your libraries magically appear in Databricks. ☆47 Updated last year
- type-class based data cleansing library for Apache Spark SQL ☆78 Updated 5 years ago
- Azure Deployments using Terraform ☆30 Updated 2 years ago
- My Study guide used to pass the CRT020 Spark Certification exam ☆33 Updated 5 years ago
- Azure SQL and Databricks samples and best practices for loading data quickly and efficiently ☆33 Updated 4 years ago
- Bulletproof Apache Spark jobs with fast root cause analysis of failures. ☆72 Updated 4 years ago
- A bunch of hacks developed around dbt ☆48 Updated 5 years ago
- A tool to validate data, built around Apache Spark. ☆101 Updated 3 weeks ago
- Magic to help Spark pipelines upgrade ☆34 Updated 6 months ago
- Monitoring Azure Databricks jobs ☆223 Updated 6 months ago
- SQL Queries & Alerts for Databricks System Tables access.audit Logs ☆26 Updated 6 months ago
- Nested Data (JSON/AVRO/XML) Parsing and Flattening in Spark ☆16 Updated last year
- Snowflake Data Source for Apache Spark. ☆223 Updated 2 weeks ago
- Building a real-time alert monitoring pipeline that sends email notifications off of Azure Event Hubs, Azure Databricks, and an Azure Logi… ☆13 Updated 5 years ago
- Databricks Platform - Architecture, Security, Automation and much more!! ☆50 Updated last week
- Unity Catalog UI ☆40 Updated 7 months ago
- Custom PySpark Data Sources ☆42 Updated this week