MrPowers / python-parquet-examples
Using the Parquet file format with Python
☆15 Updated last year
Alternatives and similar repositories for python-parquet-examples:
Users who are interested in python-parquet-examples are comparing it to the repositories listed below.
- Build and deploy a serverless data pipeline on AWS with no effort. ☆111 Updated 2 years ago
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 Updated 3 years ago
- Building 3D Trusted Data Pipelines With Dagster, dbt, and DuckDB ☆20 Updated last year
- ☆11 Updated 5 months ago
- A collection of Python utility functions ☆11 Updated 10 months ago
- ☆29 Updated last year
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated last year
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated last year
- Instant search for and access to many datasets in PySpark. ☆34 Updated 2 years ago
- Utility functions for dbt projects running on Spark ☆33 Updated 2 months ago
- Dask integration for Snowflake ☆30 Updated 5 months ago
- A small Python module containing quick utility functions for standard ETL processes. ☆35 Updated last week
- Pandas helper functions ☆30 Updated 2 years ago
- Fully unit tested utility functions for data engineering. Python 3 only. ☆16 Updated 8 months ago
- JumpSpark - A modern cookiecutter template for PySpark projects with batteries included. ☆10 Updated last year
- A Python library bakeoff for medium-sized datasets ☆24 Updated last year
- Glue VSCode devcontainer setup ☆14 Updated 2 years ago
- Snowflake Guide: Building a Recommendation Engine Using Snowflake & Amazon SageMaker ☆31 Updated 3 years ago
- A Python package to create a database on the platform using our moj data warehousing framework ☆21 Updated 8 months ago
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 Updated 4 years ago
- dbt / Amazon Redshift Demonstration Project ☆34 Updated 2 years ago
- Python implementation of Age-Partitioned Bloom Filter with S3 periodic backup support. ☆11 Updated 3 months ago
- A serverless DuckDB deployment at GCP ☆39 Updated 2 years ago
- Read Delta tables without any Spark ☆47 Updated last year
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆50 Updated last year
- Demo for GitHub Universe 2022 ☆12 Updated 2 years ago
- AWS Quick Start Team ☆18 Updated 7 months ago
- Data-aware orchestration with Dagster, dbt, and Airbyte ☆31 Updated 2 years ago
- ☆11 Updated 5 months ago