fuyb1992 / es_pandas
Read, write and update large scale pandas DataFrame with Elasticsearch
☆35 Updated 9 months ago
Alternatives and similar repositories for es_pandas
Users that are interested in es_pandas are comparing it to the libraries listed below
- Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch ☆686 Updated last month
- A DBAPI and SQLAlchemy dialect for Elasticsearch ☆117 Updated last year
- An Elasticsearch client exposing DataFrame API ☆283 Updated 2 years ago
- Docker images for dask ☆242 Updated last month
- A high-performance Python Kafka client. Efficiently from Kafka to Pandas and back. ☆40 Updated 6 years ago
- Apache Avro <-> pandas DataFrame ☆138 Updated this week
- Joblib Apache Spark Backend ☆249 Updated 4 months ago
- Python module for Apache ORC file format ☆68 Updated 6 months ago
- Timeseries Anomaly detection and Root Cause Analysis on data in SQL data warehouses and databases ☆232 Updated 3 years ago
- Fast Avro for Python ☆679 Updated last week
- Use dask to fetch data from Elasticsearch in parallel by sending the request to each shard separately. ☆20 Updated 4 years ago
- Quickly ingest messy CSV and XLS files. Export to clean pandas, SQL, parquet ☆197 Updated 2 years ago
- Fast iterative local development and testing of Apache Airflow workflows ☆202 Updated 2 weeks ago
- A Python client for Apache Livy, enabling use of remote Apache Spark clusters. ☆70 Updated 3 years ago
- ☆75 Updated 5 months ago
- Pandas interface for Clickhouse database ☆239 Updated 4 years ago
- Python module for interacting with geohashes ☆168 Updated last month
- Fast, resilient and reproducible data analysis with cached SQL queries ☆30 Updated 2 years ago
- Airflow Backfill UI based plugin for existing / new Airflow environment ☆65 Updated 4 years ago
- Asynchronous actions for PySpark ☆47 Updated 3 years ago
- Presto and Minio on Docker Infrastructure ☆42 Updated 7 years ago
- Distributed SQL Engine in Python using Dask ☆407 Updated last year
- python automatic data quality check toolkit ☆282 Updated 4 years ago
- Jupyter Notebooks in S3 - Jupyter Contents Manager implementation ☆256 Updated last month
- python implementation of the parquet columnar file format. ☆843 Updated 5 months ago
- 🆕 A machine learning plugin which supports an approximate k-NN search algorithm for Open Distro. ☆284 Updated 4 years ago
- A python wrapper for the KSQL REST API. ☆158 Updated 2 years ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆126 Updated 4 years ago
- 🚎 Notebook sharing hub ☆501 Updated last year
- A simple guide to understand Prefect and make it work with your own docker-compose configuration. ☆162 Updated last year