MrPowers / python-parquet-examples
Using the Parquet file format with Python
☆15 · Updated last year
Alternatives and similar repositories for python-parquet-examples
Users that are interested in python-parquet-examples are comparing it to the libraries listed below
- DataHub on AWS demonstration resources ☆10 · Updated 2 years ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 · Updated 3 years ago
- Pandas helper functions ☆31 · Updated 2 years ago
- Building 3D Trusted Data Pipelines With Dagster, dbt, and DuckDB ☆20 · Updated last year
- Dask integration for Snowflake ☆30 · Updated 6 months ago
- A collection of Python utility functions ☆11 · Updated 11 months ago
- The sane way of building a data layer in Airflow ☆24 · Updated 5 years ago
- Server that simplifies connecting pandas to a realtime data feed, testing hypotheses, and visualizing results in a web browser ☆33 · Updated 2 years ago
- A small Python module containing quick utility functions for standard ETL processes. ☆35 · Updated last month
- Code examples for the Introduction to Kubeflow course ☆14 · Updated 4 years ago
- Snowflake Guide: Building a Recommendation Engine Using Snowflake & Amazon SageMaker ☆31 · Updated 3 years ago
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 · Updated last year
- Config files for setting up Multitenant Kubeflow on AWS with spot instances ☆10 · Updated 4 years ago
- Unity Catalog UI ☆40 · Updated 8 months ago
- ☆14 · Updated 4 years ago
- JumpSpark - A modern cookiecutter template for PySpark projects with batteries included. ☆10 · Updated 2 years ago
- A Python package to create a database on the platform using our moj data warehousing framework ☆21 · Updated 8 months ago
- ☆22 · Updated 9 months ago
- Prefect 2 flows ☆11 · Updated 5 months ago
- Data-aware orchestration with Dagster, dbt, and Airbyte ☆31 · Updated 2 years ago
- Instant search for and access to many datasets in PySpark. ☆34 · Updated 2 years ago
- Fully unit tested utility functions for data engineering. Python 3 only. ☆17 · Updated 9 months ago
- ☆30 · Updated 3 years ago
- Fake Pandas / PySpark DataFrame creator ☆47 · Updated last year
- ☆11 · Updated 6 months ago
- ☆12 · Updated last year
- Utility functions for dbt projects running on Spark ☆34 · Updated 3 months ago
- Provide an easy way with Python to protect your data sources by searching their metadata. 🛡️ ☆17 · Updated 2 weeks ago
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 · Updated 4 years ago
- Events about the open source data stack ☆13 · Updated 3 years ago