ynqa / pandavro
Apache Avro <-> pandas DataFrame
☆137 · Updated 9 months ago
Alternatives and similar repositories for pandavro:
Users who are interested in pandavro are comparing it to the libraries listed below.
- Fast iterative local development and testing of Apache Airflow workflows ☆200 · Updated 2 weeks ago
- pytest plugin to run the tests with support of pyspark ☆86 · Updated last month
- Pylint plugin for static code analysis on Airflow code ☆94 · Updated 4 years ago
- ☆199 · Updated last year
- Pandas ExtensionDType/Array backed by Apache Arrow ☆230 · Updated 2 years ago
- Apache (Py)Spark type annotations (stub files). ☆117 · Updated 2 years ago
- Fast Avro for Python ☆668 · Updated last week
- Builds Airflow DAGs from configuration files. Powers all DAGs on the Etsy Data Platform ☆261 · Updated last year
- Ray provider for Apache Airflow ☆48 · Updated last year
- Asynchronous actions for PySpark ☆47 · Updated 3 years ago
- Airflow Backfill UI based plugin for existing / new Airflow environment ☆65 · Updated 4 years ago
- Amazon Redshift SQLAlchemy Dialect ☆223 · Updated 10 months ago
- Deploy dask on YARN clusters ☆69 · Updated 9 months ago
- ☆127 · Updated 4 years ago
- Pythonic file-system interface for Google Cloud Storage ☆362 · Updated this week
- triggering a DAG run multiple times ☆88 · Updated last year
- A Python client for Apache Livy, enabling use of remote Apache Spark clusters. ☆70 · Updated 3 years ago
- Command line (CLI) tool to inspect Apache Parquet files on the go ☆190 · Updated last year
- Coming soon ☆61 · Updated last year
- Data ingestion library for Amundsen to build graph and search index ☆205 · Updated last year
- A pure Python implementation of Apache Spark's RDD and DStream interfaces. ☆269 · Updated 8 months ago
- Airflow declarative DAGs via YAML ☆132 · Updated last year
- [ARCHIVED] The Presto adapter plugin for dbt Core ☆33 · Updated last year
- Metadata service library for Amundsen ☆83 · Updated last month
- Collection of transforms for the Apache beam python SDK. ☆89 · Updated last year
- Airflow Unit Tests and Integration Tests ☆258 · Updated 2 years ago
- Docker images for dask ☆240 · Updated last week
- Monitor Apache Spark from Jupyter Notebook ☆172 · Updated 2 years ago
- A consistent table management library in python ☆159 · Updated last year
- A tool and library for easily deploying applications on Apache YARN ☆143 · Updated last year