noirello / pyorc
Python module for Apache ORC file format
☆65 Updated 2 months ago
Alternatives and similar repositories for pyorc
Users who are interested in pyorc are comparing it to the libraries listed below.
- Python DB-API client for Presto ☆238 Updated last year
- Multiple node presto cluster on docker container ☆124 Updated 2 years ago
- A tool and library for easily deploying applications on Apache YARN ☆143 Updated last year
- A DBAPI and SQLAlchemy dialect for Elasticsearch ☆114 Updated last year
- Example for article Running Spark 3 with standalone Hive Metastore 3.0 ☆98 Updated 2 years ago
- Python client for Spark Jobserver Rest API ☆39 Updated 5 years ago
- The Workload Analyzer collects Presto® and Trino workload statistics, and analyzes them ☆135 Updated last year
- Convert JSON files to Parquet using PyArrow ☆97 Updated last year
- Docker image for Apache Hive Metastore ☆71 Updated 2 years ago
- A Spark Connector that reads data from / writes data to Arrow-Flight end-points with Arrow-Flight and Flight-SQL ☆40 Updated 7 months ago
- A plugin for Apache Airflow that allows you to manage the users that can login ☆14 Updated 5 years ago
- A tool to install, configure and manage Trino installations ☆27 Updated 3 years ago
- A Python client for Apache Livy, enabling use of remote Apache Spark clusters. ☆70 Updated 3 years ago
- Spark SQL index for Parquet tables ☆134 Updated 4 years ago
- Apache Calcite Adapter for Apache Kudu ☆28 Updated 7 months ago
- A UDF for Cloudera Impala (hive get_json_object equivalent) ☆31 Updated 3 years ago
- Storage connector for Trino ☆110 Updated last week
- Presto and Minio on Docker Infrastructure ☆42 Updated 6 years ago
- A wrapper for libhdfs3 to interact with HDFS from Python ☆136 Updated 4 years ago
- ☆80 Updated 3 weeks ago
- Dockerized setup for testing code on realistic hadoop clusters ☆27 Updated 4 years ago
- This library is an ongoing effort towards bringing the data exchanging ability between Java/Scala and Python. PyJava introduces Apache A… ☆47 Updated 2 years ago
- Minimal example to run Trino, Minio, and Hive standalone metastore on docker ☆52 Updated 2 years ago
- Python bindings for FarmHash and CityHash ☆38 Updated last month
- Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol ☆34 Updated 2 years ago
- Spark SQL listener to record lineage information ☆28 Updated 4 years ago
- SQL CLI for Apache Flink® via docker-compose ☆48 Updated last year
- Data Pipeline Clientlib provides an interface to tail and publish to data pipeline topics. ☆110 Updated 2 years ago
- A purely experimental DuckDB Deltalake extension ☆95 Updated this week
- A schema store service that tracks and manages all the schemas used in the Data Pipeline ☆87 Updated 4 years ago