How to unit test your PySpark code
☆29 · Mar 26, 2021 · Updated 4 years ago
Alternatives and similar repositories for unitTestPySpark
Users that are interested in unitTestPySpark are comparing it to the libraries listed below
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆51 · Sep 7, 2023 · Updated 2 years ago
- Sample Project to Learn Data Engineering ☆10 · Aug 1, 2021 · Updated 4 years ago
- Sample code for getting started reverse-terraforming Snowflake ☆17 · May 12, 2023 · Updated 2 years ago
- ☆13 · Mar 5, 2023 · Updated 3 years ago
- Template for Data Engineering and Data Pipeline projects ☆117 · Jan 1, 2023 · Updated 3 years ago
- 🚂 Fine-tune OpenAI models for text classification, question answering, and more ☆17 · May 1, 2023 · Updated 2 years ago
- Some example projects for Data Engineers to build, end-to-end. ☆38 · Nov 8, 2023 · Updated 2 years ago
- Get Crypto data from API, stream it to Kafka with Airflow. Write data to MySQL and visualize with Metabase ☆17 · Oct 2, 2023 · Updated 2 years ago
- Data Engineer Project: An end-to-end Airflow data pipeline with BigQuery, dbt Soda, and more! ☆12 · Dec 14, 2023 · Updated 2 years ago
- Hadoop/Hive/Spark container to perform CI tests ☆10 · Dec 26, 2020 · Updated 5 years ago
- ☆10 · Nov 28, 2022 · Updated 3 years ago
- ☆22 · Nov 30, 2022 · Updated 3 years ago
- Docker Apache Airflow ☆13 · Mar 1, 2023 · Updated 3 years ago
- Welcome to my data engineering projects repository! Here you will find a collection of data engineering projects that I have worked on. ☆24 · Apr 27, 2023 · Updated 2 years ago
- ☆11 · Jan 24, 2023 · Updated 3 years ago
- A simple, customizable, and modern library for displaying alert banners in your Jetpack Compose and Compose Multiplatform applications. ☆42 · Aug 17, 2025 · Updated 7 months ago
- Genie Framework improves Spark Pool utilization by executing multiple Synapse notebooks on the same spark pool instance ☆28 · Dec 19, 2023 · Updated 2 years ago
- A CALDERA plugin ☆18 · Jul 28, 2020 · Updated 5 years ago
- ☆15 · Feb 9, 2026 · Updated last month
- a tool for defining repeatable processes in code ☆13 · Oct 29, 2019 · Updated 6 years ago
- NeurIPS 2020 Spotlight Paper ☆13 · Dec 20, 2021 · Updated 4 years ago
- create issues from pytest-reportlog files ☆13 · Feb 10, 2026 · Updated last month
- Library for fast text representation and classification. ☆10 · Apr 17, 2022 · Updated 3 years ago
- Tag-based masking policies management in Snowflake ❄️ 🏷️ ☆26 · Mar 5, 2026 · Updated 2 weeks ago
- Dbt package for Apache Airflow inspired macros ☆17 · Dec 21, 2025 · Updated 2 months ago
- Design and implementation of FAIR Data Cube ☆11 · Jun 2, 2025 · Updated 9 months ago
- A golang based prometheus metrics exporter for Snowflake. ☆20 · Dec 17, 2025 · Updated 3 months ago
- Cloud-native Trino (prestosql) + Hive + Minio + Superset ☆24 · Nov 29, 2021 · Updated 4 years ago
- Python package for Bayesian & Frequentist A/B Testing ☆12 · Jul 6, 2023 · Updated 2 years ago
- Repository for participants of the "Containers for HPC" training ☆11 · Feb 11, 2026 · Updated last month
- NIU website on common software problems and their troubleshooting ☆10 · Feb 13, 2026 · Updated last month
- ☆13 · Oct 24, 2018 · Updated 7 years ago
- mlmodels: Machine Learning and Deep Learning Model ZOO for Pytorch, Tensorflow, Keras, Gluon models... ☆10 · Oct 23, 2020 · Updated 5 years ago
- Statistical Bootstrap in Python ☆11 · Feb 28, 2026 · Updated 2 weeks ago
- Collection of Snowflake Stored Procedures and UDFs that leverage Python ☆21 · Sep 4, 2023 · Updated 2 years ago
- Triplestore wrapper package for Python. ☆12 · Updated this week
- This project looks at creating a controlled vocabulary for DICOM Pt 6 Data Dictionary with a focus on CS code strings. ☆12 · Jan 9, 2026 · Updated 2 months ago
- Deep Automodulators ☆13 · Apr 2, 2022 · Updated 3 years ago
- An experiment, a playground, a sandbox, a toy: LLMs judging code. ☆10 · Jan 28, 2025 · Updated last year