jchacks / data_cache
Simple in-memory data cache designed for ML applications. Built using Redis and Apache Arrow's Plasma in-memory store.
☆11Updated 4 years ago
Alternatives and similar repositories for data_cache
Users who are interested in data_cache are comparing it to the libraries listed below
- Fast, resilient and reproducible data analysis with cached SQL queries☆30Updated 2 years ago
- Unified Distributed Execution☆56Updated 11 months ago
- Shared-memory Python object namespace with Apache Plasma. Built because of Plotly Dash, useful anywhere.☆83Updated 8 months ago
- Derivatives models written with the Tributary data flow library☆24Updated this week
- Convenient pyarrow operations following the Pandas API☆45Updated 3 years ago
- Quickly move data from postgres to numpy or pandas.☆65Updated 2 years ago
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients☆37Updated 4 years ago
- Python DataFrame with fast insert and appends☆75Updated last month
- Python binding for Khiva library.☆47Updated last year
- Cross Thread Message Pipe☆18Updated 5 years ago
- High performance, editable, stylable datagrids in jupyter and jupyterlab☆114Updated last month
- A Python package that parses sql and converts it to ibis expressions☆55Updated last year
- A lightweight (serverless) native python parallel processing framework based on simple decorators and call graphs.☆103Updated 3 years ago
- Automation tools for Python benchmarking☆19Updated 6 years ago
- Bidirectional communication for the HoloViz ecosystem☆34Updated 3 months ago
- A Python object protocol for projects to interchange data frame-like data without forcing pandas.DataFrame as the intermediary☆15Updated 5 years ago
- Compare DuckDB, Polars and Pandas for generating an artificial dataset of persons and companies☆33Updated 2 years ago
- Pandas Msgpack☆24Updated 3 years ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs☆20Updated 4 years ago
- Python client for RedisAI☆88Updated 2 years ago
- A resizable numpy array on disk (mmap based)☆13Updated 3 years ago
- Set-oriented Operations in Pandas☆24Updated 5 years ago
- Documentation and resources for deploying JupyterHub on Hadoop☆19Updated 6 years ago
- Distributed persistent Task Queue running on Dask☆38Updated 2 years ago
- Comparisons of big data technologies for cleaning, manipulating, and generally wrangling data for analysis and machine learning.☆65Updated 5 years ago
- A Plotly-Dash component for Chart.js.☆24Updated 5 months ago
- Apache Arrow Flight example☆11Updated 4 years ago
- Streaming API for pandas applied to big datasets☆31Updated last year
- Deploy dask on YARN clusters☆69Updated last year
- Python driver for Timeplus Enterprise or Timeplus Proton☆15Updated 10 months ago