datadudes / json2hive
Generate Hive CREATE TABLE statements from json data
☆10 · Updated 6 years ago
Related projects:
- A Python client library for the Stitch Import API ☆42 · Updated 8 months ago
- A small Python module containing quick utility functions for standard ETL processes. ☆33 · Updated this week
- A curated list of awesome open source tools and commercial products to catalog, version, and manage data 🚀 ☆23 · Updated 2 years ago
- Astronomer Vendor Images ☆12 · Updated this week
- A Singer.io Target for the Stitch Import API ☆26 · Updated 2 weeks ago
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 · Updated 6 months ago
- The sane way of building a data layer in Airflow ☆24 · Updated 4 years ago
- Composable filesystem hooks and operators for Apache Airflow. ☆17 · Updated 3 years ago
- Dask integration for Snowflake ☆29 · Updated 2 months ago
- A Pythonic API for Amazon's States Language for defining AWS Step Functions ☆8 · Updated last year
- Plugin for Intake to read from SQL servers ☆15 · Updated last year
- Data Catalog for Databases and Data Warehouses ☆31 · Updated 8 months ago
- Comparison of Airflow on Celery vs Celery ☆20 · Updated 6 years ago
- A Python package to create a database on the platform using our moj data warehousing framework ☆21 · Updated 2 weeks ago
- A template repository with all the fundamentals needed to develop and deploy a Python data-processing routine for Prefect pipelines. ☆20 · Updated 2 years ago
- Using the Parquet file format with Python ☆14 · Updated 10 months ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆19 · Updated 3 years ago
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 · Updated 5 years ago
- Utilities for creating ETL pipelines with mara ☆36 · Updated 2 years ago
- This repo demonstrates how to load a sample Parquet formatted file from an AWS S3 Bucket. A Python job will then be submitted to a Apach… ☆19 · Updated 8 years ago
- A toolset to streamline running Spark Python on EMR ☆20 · Updated 7 years ago
- Single command serverless ETL orchestration. ☆12 · Updated 3 weeks ago
- Dask on ECS Fargate ☆14 · Updated 4 years ago
- A collection of Python utility functions ☆12 · Updated 2 months ago
- Fully unit tested utility functions for data engineering. Python 3 only. ☆14 · Updated 3 weeks ago