chhantyal / parquet-cli
Command line (CLI) tool to inspect Apache Parquet files on the go
☆193 Updated last year
Alternatives and similar repositories for parquet-cli
Users who are interested in parquet-cli are comparing it to the libraries listed below.
- easy install parquet-tools ☆179 Updated 10 months ago
- A library that provides useful extensions to Apache Spark and PySpark. ☆224 Updated 2 months ago
- Apache Avro <-> pandas DataFrame ☆137 Updated 10 months ago
- Pylint plugin for static code analysis on Airflow code ☆95 Updated 4 years ago
- Airflow Backfill UI based plugin for existing / new Airflow environment ☆65 Updated 4 years ago
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆63 Updated 2 years ago
- A Python client for Apache Livy, enabling use of remote Apache Spark clusters. ☆70 Updated 3 years ago
- Astronomer Core Docker Images ☆107 Updated last year
- pytest plugin to run the tests with support of pyspark ☆86 Updated 2 weeks ago
- dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks ☆431 Updated 4 months ago
- Great Expectations Airflow operator ☆165 Updated this week
- Benchmark data warehouses under Fivetran-like conditions ☆169 Updated 2 years ago
- Performant Redshift data source for Apache Spark ☆140 Updated last month
- Fast iterative local development and testing of Apache Airflow workflows ☆201 Updated last month
- Column-wise type annotations for pyspark DataFrames ☆78 Updated 2 weeks ago
- A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source. ☆344 Updated last year
- The athena adapter plugin for dbt (https://getdbt.com) ☆139 Updated 2 years ago
- ☆199 Updated last year
- ☆70 Updated 5 months ago
- PySpark test helper methods with beautiful error messages ☆697 Updated last month
- Snowflake Data Source for Apache Spark. ☆226 Updated this week
- The Internals of Delta Lake ☆184 Updated 4 months ago
- Making DAG construction easier ☆265 Updated 3 weeks ago
- Turning PySpark Into a Universal DataFrame API ☆404 Updated this week
- Schema modelling framework for decentralised domain-driven ownership of data. ☆254 Updated last year
- ☆279 Updated this week
- Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform ☆261 Updated last year
- Apache Airflow integration for dbt ☆404 Updated last year
- ✨ A Pydantic to PySpark schema library ☆93 Updated this week
- Read Delta tables without any Spark ☆47 Updated last year