noirello / pyorc
Python module for Apache ORC file format
☆64 Updated last week
Alternatives and similar repositories for pyorc:
Users who are interested in pyorc are comparing it to the libraries listed below:
- A Python client for Apache Livy, enabling use of remote Apache Spark clusters. ☆70 Updated 3 years ago
- A DBAPI and SQLAlchemy dialect for Elasticsearch ☆110 Updated last year
- Minimal example to run Trino, Minio, and Hive standalone metastore on docker ☆49 Updated 2 years ago
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆97 Updated 2 years ago
- A plugin for Apache Airflow that allows you to manage the users who can log in ☆14 Updated 5 years ago
- A process that runs in unison with Apache Airflow to control the Scheduler process to ensure High Availability ☆233 Updated 2 years ago
- A tool and library for easily deploying applications on Apache YARN ☆143 Updated 11 months ago
- A wrapper for libhdfs3 to interact with HDFS from Python ☆136 Updated 4 years ago
- A library for Spark DataFrame using MinIO Select API ☆97 Updated 5 years ago
- Docker image for Apache Hive Metastore ☆71 Updated last year
- ☆68 Updated 2 months ago
- Python DB-API client for Presto ☆238 Updated last year
- Multiple node presto cluster on docker container ☆124 Updated 2 years ago
- Python bindings for sqlparser-rs ☆176 Updated 2 weeks ago
- DataHub Actions is a framework for responding to changes to your DataHub Metadata Graph in real time. ☆44 Updated this week
- Repository of helm charts for deploying DataHub on a Kubernetes cluster ☆174 Updated this week
- A library that provides useful extensions to Apache Spark and PySpark. ☆214 Updated this week
- A tool to install, configure and manage Trino installations ☆27 Updated 2 years ago
- ☆20 Updated last year
- Presto and Minio on Docker Infrastructure ☆41 Updated 6 years ago
- An Integrated and collaborative cloud environment for building and running Spark applications on PKS/Kubernetes ☆82 Updated 4 years ago
- Hive for MR3 ☆38 Updated 3 weeks ago
- A client for connecting and running DDLs on hive metastore. ☆55 Updated 11 months ago
- Convert JSON files to Parquet using PyArrow ☆96 Updated last year
- REST-like API exposing Airflow data and operations ☆61 Updated 6 years ago
- A JupyterLab extension providing a SQL formatter, auto-completion, syntax highlighting, Spark SQL and Trino ☆85 Updated last week
- MongoDB integrations for Apache Arrow. Export MongoDB documents to numpy array, parquet files, and pandas dataframes in one line of code. ☆96 Updated this week
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them ☆135 Updated last year
- ☆12 Updated last year
- A plugin to Apache Airflow to allow you to run Spark Submit Commands as an Operator ☆73 Updated 5 years ago