ikucan / pykafarr
A high-performance Python Kafka client. Efficiently from Kafka to Pandas and back.
☆40, updated 6 years ago
Alternatives and similar repositories for pykafarr
Users interested in pykafarr are comparing it to the libraries listed below.
- A Python implementation of Apache Kafka Streams (☆310, updated 6 years ago)
- Docker images for dask (☆242, updated last month)
- Derivatives models written with the Tributary data flow library (☆23, updated last week)
- A Cookiecutter template for creating Faust projects quickly. (☆70, updated 2 years ago)
- Fast iterative local development and testing of Apache Airflow workflows (☆201, updated 2 weeks ago)
- Deploy dask on YARN clusters (☆69, updated last year)
- Streaming reactive and dataflow graphs in Python (☆458, updated 3 weeks ago)
- Pylint plugin for static code analysis on Airflow code (☆95, updated 4 years ago)
- Pandas ExtensionDType/Array backed by Apache Arrow (☆231, updated 2 years ago)
- A consistent table management library in python (☆159, updated 2 years ago)
- Python Rest Client to interact against Schema Registry confluent server (☆179, updated this week)
- DBAPI and SQLAlchemy dialect for Databricks Workspace and SQL Analytics clusters (☆22, updated 3 years ago)
- A tool and library for easily deploying applications on Apache YARN (☆144, updated last year)
- Native Kubernetes integration for Dask (☆323, updated last month)
- MongoDB integrations for Apache Arrow. Export MongoDB documents to numpy array, parquet files, and pandas dataframes in one line of code. (☆111, updated this week)
- Read Delta tables without any Spark (☆47, updated last year)
- A Python client for Apache Livy, enabling use of remote Apache Spark clusters. (☆70, updated 3 years ago)
- Distributed SQL Engine in Python using Dask (☆407, updated last year)
- Apache Avro <-> pandas DataFrame (☆138, updated last week)
- A kafka streams client library built on confluent-kafka-python (☆66, updated last year)
- A Python framework for data processing on GCP. (☆119, updated 4 months ago)
- SQLAlchemy for Dremio via the ODBC and Flight interface. (☆30, updated 2 months ago)
- A python wrapper for the KSQL REST API. (☆158, updated 2 years ago)
- Fast, resilient and reproducible data analysis with cached SQL queries (☆30, updated 2 years ago)
- An extension for Jupyter Lab & Jupyter Notebook to monitor Apache Spark (pyspark) from notebooks (☆55, updated 2 months ago)
- A web frontend for scheduling Jupyter notebook reports (☆253, updated 9 months ago)
- Cylon is a fast, scalable, distributed memory, parallel runtime with a Pandas like DataFrame. (☆302, updated last year)
- Stream Processing using Polars (☆31, updated 2 years ago)
- Cloud provider cluster managers for Dask. Supports AWS, Google Cloud, Azure and more... (☆144, updated 3 weeks ago)
- Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with … (☆642, updated this week)