IntegriChain1 / s3parq
Parquet file management in S3 for Athena / Spectrum / Presto partitioning
☆22 Updated 3 months ago
Alternatives and similar repositories for s3parq
Users who are interested in s3parq are comparing it to the libraries listed below:
- Data Catalog for Databases and Data Warehouses ☆35 Updated last year
- A tool to learn JSON schema from collection of documents and generate Create table statement for Redshift ☆20 Updated 7 months ago
- A collection of python utility functions ☆11 Updated 10 months ago
- The sane way of building a data layer in Airflow ☆24 Updated 5 years ago
- A conda-smithy repository for python-duckdb. ☆13 Updated last month
- a pytest plugin for dbt adapter test suites ☆19 Updated last year
- The open source version of the Amazon Redshift Cluster Management Guide. ☆48 Updated last year
- The elegance of Airflow + the power of AWS ☆50 Updated last year
- Amundsen Gremlin ☆21 Updated 2 years ago
- Data Profiler for AWS Glue Data Catalog application as described in the AWS Big Data Blog post "Build an automatic data profiling and rep… ☆20 Updated 5 years ago
- Jupyter Notebook Remote Scheduler for Argo on Kubernetes ☆11 Updated 5 months ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 Updated 3 years ago
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- PySpark for ETL jobs including lineage to Apache Atlas in one script via code inspection ☆18 Updated 8 years ago
- A CLI to manage and monitor permissions in AWS Lake Formation ☆26 Updated 2 years ago
- Examples of various flow deployments for Prefect 1.0 (storage and run configurations) ☆35 Updated 3 years ago
- Build your feature store with macros right within your dbt repository ☆38 Updated 2 years ago
- Example Set up For DBT Cloud using Github Integrations ☆11 Updated 5 years ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆46 Updated last year
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 Updated last year
- CLI tool to launch Spark jobs on AWS EMR ☆67 Updated last year
- 💻 CLI for reporting events to Faros platform ☆14 Updated this week
- A Spark-based data comparison tool at scale which facilitates software development engineers to compare a plethora of pair combinations o… ☆51 Updated last year
- Dask integration for Snowflake ☆30 Updated 6 months ago
- Airflow Executor for both AWS ECS & AWS Fargate ☆52 Updated last year
- CLI for data platform ☆19 Updated last year
- Derivatives models written with the Tributary data flow library ☆23 Updated 3 weeks ago
- An experimental Athena extension for DuckDB 🐤 ☆54 Updated 4 months ago
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated last year
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to… ☆29 Updated 5 months ago