MrPowers / python-parquet-examples
Using the Parquet file format with Python
☆14 · Updated last year
Related projects
Alternatives and complementary repositories for python-parquet-examples
- Fully unit tested utility functions for data engineering. Python 3 only. ☆14 · Updated 3 months ago
- Events about the open source data stack ☆13 · Updated 2 years ago
- AWS Quick Start Team ☆18 · Updated last month
- Herd-UI is a search and discovery tool for business and technical users. Everyone in your organization can use Herd-UI to browse and unde… ☆16 · Updated 2 years ago
- Example setup for dbt Cloud using GitHub integrations ☆11 · Updated 4 years ago
- Pandas helper functions ☆29 · Updated last year
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 · Updated 3 years ago
- Generate Hive CREATE TABLE statements from JSON data ☆10 · Updated 7 years ago
- Data Catalog for Databases and Data Warehouses ☆31 · Updated 10 months ago
- The sane way of building a data layer in Airflow ☆24 · Updated 4 years ago
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 · Updated 5 years ago
- ☆30 · Updated last year
- Config files for setting up Multitenant Kubeflow on AWS with spot instances ☆10 · Updated 4 years ago
- Build and deploy a serverless data pipeline on AWS with no effort. ☆110 · Updated last year
- Server that simplifies connecting pandas to a realtime data feed, testing hypotheses, and visualizing results in a web browser ☆33 · Updated last year
- Render Jupyter Notebooks With Metaflow Cards ☆24 · Updated last month
- Projects developed by Domino's R&D team ☆76 · Updated 2 years ago
- ☆29 · Updated 11 months ago
- Customizable GitOps template for Kubeflow on AWS EKS ☆10 · Updated 4 years ago
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 · Updated 8 months ago
- A small Python module containing quick utility functions for standard ETL processes. ☆33 · Updated last week
- Build your feature store with macros right within your dbt repository ☆37 · Updated last year
- A collection of Python utility functions ☆12 · Updated 4 months ago
- A tool to learn a JSON schema from a collection of documents and generate a CREATE TABLE statement for Redshift ☆19 · Updated last month
- ☕⛵ WIP PySpark dependency management ☆22 · Updated 6 years ago
- Dask integration for Snowflake ☆30 · Updated last week
- Example templates for the delivery of custom ML solutions to production so you can get started quickly without having to make too many de… ☆66 · Updated 5 months ago
- Make dbt great again! Enables end users to extend dbt to their needs ☆15 · Updated this week
- Operational Data Processing Framework developed using AWS Glue and Apache Hudi. This framework is suitable for Data Lake and Modern Data … ☆21 · Updated last year
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 · Updated 4 years ago