ktrueda / parquet-tools
easy install parquet-tools
☆180 Updated last year
Alternatives and similar repositories for parquet-tools
Users who are interested in parquet-tools are comparing it to the libraries listed below.
- Command line (CLI) tool to inspect Apache Parquet files on the go ☆193 Updated last year
- Write your dbt models using Ibis ☆68 Updated 3 months ago
- ☆291 Updated this week
- Distributed SQL Engine in Python using Dask ☆406 Updated 10 months ago
- Pylint plugin for static code analysis on Airflow code ☆95 Updated 4 years ago
- ☆70 Updated 6 months ago
- Schema modelling framework for decentralised domain-driven ownership of data. ☆253 Updated last year
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆61 Updated 2 years ago
- Python bindings for sqlparser-rs ☆191 Updated last month
- Run, mock and test fake Snowflake databases locally. ☆144 Updated last week
- ☆33 Updated last year
- Turning PySpark Into a Universal DataFrame API ☆413 Updated this week
- ☆30 Updated 7 months ago
- Pythonic Iceberg REST Catalog ☆2 Updated 3 weeks ago
- ✨ A Pydantic to PySpark schema library ☆98 Updated this week
- Enforce Best Practices for all your Airflow DAGs. ⭐ ☆103 Updated this week
- Airflow Providers containing Deferrable Operators & Sensors from Astronomer ☆149 Updated this week
- Soda Spark is a PySpark library that helps you with testing your data in Spark Dataframes ☆64 Updated 3 years ago
- makes your sql less bad ☆59 Updated 5 years ago
- This is the main repository for SDF documentation found at docs.sdf.com, as well as public schemas, benchmarks, and examples ☆122 Updated 5 months ago
- Fake Pandas / PySpark DataFrame creator ☆47 Updated last year
- Work with your web service, database, and streaming schemas in a single format. ☆345 Updated 3 weeks ago
- CLI for DuckDB ☆45 Updated 3 years ago
- The shared semantic layer definitions that dbt-core and MetricFlow use. ☆80 Updated this week
- ☆142 Updated last month
- ☆55 Updated 2 months ago
- The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark… ☆127 Updated last month
- Palm CLI - the tool-belt for data teams ☆47 Updated last year
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 Updated last year
- MongoDB integrations for Apache Arrow. Export MongoDB documents to numpy array, parquet files, and pandas dataframes in one line of code. ☆108 Updated this week