Nike-Inc / knockoff-factory
A library for generating fake data and populating database tables.
☆34 Updated 8 months ago
Related projects
Alternatives and complementary repositories for knockoff-factory
- ☆83 Updated last year
- Build and deploy a serverless data pipeline on AWS with no effort. ☆110 Updated last year
- ☆18 Updated 4 years ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ☆166 Updated last year
- Demo of a Streamlit application with a Databricks SQL Endpoint ☆33 Updated 2 years ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆80 Updated 6 months ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆111 Updated 7 months ago
- 🐍💨 Airflow tutorial for PyCon 2019 ☆85 Updated last year
- Docker images that replicate the Amazon SageMaker Notebook instance. ☆58 Updated 2 years ago
- Create HTML profiling reports from Apache Spark DataFrames ☆195 Updated 4 years ago
- MLflow Spark Summit 2019 Presentation ☆67 Updated 5 years ago
- Fake Pandas / PySpark DataFrame creator ☆43 Updated 8 months ago
- Converts a Zeppelin notebook written in a single programming language to the corresponding script ☆18 Updated 4 years ago
- PySpark phonetic and string matching algorithms ☆35 Updated 8 months ago
- Great Expectations Airflow operator ☆159 Updated 2 weeks ago
- PySpark data-pipeline testing and CI/CD ☆28 Updated 4 years ago
- Shed light on your data layout in order to monitor the health of your Lakehouse tables and identify when data maintenance operations shou… ☆10 Updated last year
- AWS Big Data Certification ☆25 Updated last year
- Sample configuration to deploy a modern data platform. ☆86 Updated 2 years ago
- CSV and flat-file sniffer built in Rust. ☆42 Updated 9 months ago
- Tough and flexible tools for data analysis, transformation, validation and movement. ☆136 Updated 9 months ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆122 Updated 3 years ago
- Data validation library for PySpark 3.0.0 ☆34 Updated 2 years ago
- Data exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights into your Spark DataFrames. ☆100 Updated 5 years ago
- Data pipeline with dbt, Airflow, Great Expectations ☆158 Updated 3 years ago
- ☆25 Updated 2 years ago
- A Python package to help Databricks Unity Catalog users read and query Delta Lake tables with Polars, DuckDB, or PyArrow. ☆22 Updated 7 months ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆60 Updated last year