dask / fastparquet
Python implementation of the Parquet columnar file format.
☆787 · Updated last week
Related projects
Alternatives and complementary repositories for fastparquet
- Python implementation of the Parquet columnar file format. ☆341 · Updated 3 years ago
- Intake is a lightweight package for finding, investigating, loading and disseminating data. ☆1,013 · Updated last week
- Turbodbc is a Python module to access relational databases via the Open Database Connectivity (ODBC) interface. The module complies with … ☆623 · Updated last week
- Extended pickling support for Python objects ☆1,661 · Updated last month
- Real-time stream processing for Python ☆1,244 · Updated 5 months ago
- A distributed task scheduler for Dask ☆1,579 · Updated this week
- Distributed SQL Engine in Python using Dask ☆397 · Updated 2 months ago
- Scalable Machine Learning with Dask ☆902 · Updated 3 months ago
- Pandas ExtensionDType/Array backed by Apache Arrow ☆229 · Updated last year
- S3 Filesystem ☆890 · Updated last week
- Fast Avro for Python ☆645 · Updated this week
- PyAthena is a Python DB API 2.0 (PEP 249) client for Amazon Athena. ☆463 · Updated 3 months ago
- Fast NumPy array functions written in C ☆1,073 · Updated last month
- Design documents and code for the pandas 2.0 effort. ☆306 · Updated 6 years ago
- Joblib Apache Spark Backend ☆242 · Updated 3 months ago
- A Python package to manage extremely large amounts of data ☆1,311 · Updated 2 weeks ago
- Easy pipelines for pandas DataFrames. ☆716 · Updated 3 weeks ago
- ☆511 · Updated 2 years ago
- Jupyter magics and kernels for working with remote Spark clusters ☆1,328 · Updated last week
- Robust and reusable Executor for joblib ☆538 · Updated 3 weeks ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ☆1,481 · Updated this week
- sqldf for pandas ☆1,342 · Updated 3 months ago
- A specification that Python filesystems should adhere to. ☆1,037 · Updated this week
- Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler ☆646 · Updated last year
- Data Migration for the Blaze Project ☆1,004 · Updated 2 years ago
- Immutable and statically-typeable DataFrames with runtime type and data validation ☆442 · Updated this week
- Apache Avro <-> pandas DataFrame ☆135 · Updated 3 months ago
- Native Kubernetes integration for Dask ☆312 · Updated 2 weeks ago
- Docker images for Dask ☆232 · Updated last week
- Pythonic file-system interface for Google Cloud Storage ☆345 · Updated this week