volga-project / volga
Real-time data processing/feature engineering in Python. Tailored for modern AI/ML systems.
☆57 · Updated this week
Alternatives and similar repositories for volga:
Users interested in volga are comparing it to the libraries listed below:
- Apache DataFusion Ray ☆189 · Updated last month
- Open, Multi-modal Catalog for Data & AI, written in Rust ☆79 · Updated 7 months ago
- Self-contained demo using Redpanda, Materialize, River, Redis, and Streamlit to predict taxi trip durations ☆46 · Updated 2 years ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆46 · Updated last year
- Snowflake bring-your-own-cloud option. Run Snowflake as a microservice on your own compute ☆59 · Updated this week
- Simple Workflow Framework - Hamilton + APScheduler = FlowerPower ☆18 · Updated this week
- A lightweight UI for Lakekeeper ☆11 · Updated this week
- DB API 2 interface for Flight SQL with SQLAlchemy extras. ☆39 · Updated last month
- CLI tool to bulk migrate the tables from one catalog to another without a data copy ☆77 · Updated last month
- ☆91 · Updated last week
- Python stream processing for analytics ☆38 · Updated last month
- Deferred, multi-engine computational framework ☆253 · Updated this week
- IbisML is a library for building scalable ML pipelines using Ibis. ☆108 · Updated 4 months ago
- Arrow, pydantic style ☆82 · Updated 2 years ago
- Python library to run ML/data pipelines on stateless compute infrastructure (that may be ephemeral or serverless). Please see the documen… ☆18 · Updated last year
- A platform and cloud-based service for data sharing based on the Delta Sharing protocol. ☆21 · Updated 11 months ago
- Open Benchmarks for Evaluating the Performance of Feature Stores ☆35 · Updated last year
- Apache Arrow Ballista Python bindings ☆37 · Updated last year
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated last year
- The native Rust implementation for Apache Hudi, with Python API bindings. ☆211 · Updated this week
- Journeys between the two worlds of Python 🐍 and Rust 🦀 ☆40 · Updated this week
- Real-time deduplication and temporal joins for streaming data ☆46 · Updated this week
- DataFusion TableProviders for reading data from other systems ☆112 · Updated this week
- A Spark Connector that reads data from / writes data to Arrow-Flight end-points with Arrow-Flight and Flight-SQL ☆39 · Updated 7 months ago
- Apache Spark Connect Client for Rust ☆107 · Updated 2 weeks ago
- Python implementation of Age-Partitioned Bloom Filter with S3 periodic backup support. ☆11 · Updated 3 months ago
- View parquet files online ☆152 · Updated this week
- ☆11 · Updated 2 years ago
- SQL query executor on remote DuckDB instance using Apache Arrow Flight RPC through Streamlit Web interface. ☆14 · Updated 6 months ago
- A high-performance data streaming system using DuckDB and Apache Arrow Flight. ☆77 · Updated 2 months ago