ananthdurai / python-persistent-apbf
Python implementation of Age-Partitioned Bloom Filter with S3 periodic backup support.
☆11 Updated 2 months ago
Alternatives and similar repositories for python-persistent-apbf:
Users who are interested in python-persistent-apbf are comparing it to the libraries listed below.
- A framework for simulating e-commerce data and interactions that can be used to build recommendation systems ☆10 Updated last year
- duckdb-etl-framework ☆10 Updated 3 months ago
- A text-to-SQL prototype on the northwind sqlite dataset ☆12 Updated 6 months ago
- Time series forecasting with DuckDB and Evidence ☆39 Updated 4 months ago
- BoilingData JS client (NodeJS and Browsers) ☆19 Updated 6 months ago
- This repo contains examples of high throughput ingestion using Apache Spark and Apache Iceberg. These examples cover IoT and CDC scenario… ☆22 Updated 4 months ago
- A library to use `modal` as a backend for `joblib`. ☆28 Updated 2 months ago
- Python library to run ML/data pipelines on stateless compute infrastructure (that may be ephemeral or serverless). Please see the documen… ☆18 Updated last year
- ☆47 Updated 2 weeks ago
- Delta reader for the Ray open-source toolkit for building ML applications ☆45 Updated last year
- Next generation compute platform for the post-modern data stack ☆13 Updated 3 weeks ago
- SQL query executor on remote DuckDB instance using Apache Arrow Flight RPC through Streamlit Web interface. ☆11 Updated 4 months ago
- Personal finance project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 Updated last year
- LLM plugin for models hosted by Anyscale Endpoints ☆33 Updated 11 months ago
- Bytewax Helm charts repository ☆12 Updated 10 months ago
- Simple Workflow Framework - Hamilton + APScheduler = FlowerPower ☆15 Updated this week
- ☆28 Updated 6 months ago
- The (B)ig (F)unction (T)axonomy is a detailed reference for common compute functions executed by different libraries, databases, and tool… ☆16 Updated 3 months ago
- Real-time data processing in Python. Tailored for modern AI/ML systems. ☆42 Updated this week
- Orchestrate Modal and OpenAI workloads with Dagster ☆13 Updated 3 months ago
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 Updated 4 years ago
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆36 Updated last year
- DataHub on AWS demonstration resources ☆10 Updated 2 years ago
- ☆13 Updated 2 years ago
- ☆11 Updated 4 months ago
- ☆22 Updated 2 weeks ago
- Using the Parquet file format with Python ☆15 Updated last year
- This repository has a collection of utilities for Glue Crawlers. These utilities come in the form of AWS CloudFormation templates or AWS … ☆19 Updated 3 years ago
- FUSE-based DuckDB file system 🦆 ☆27 Updated this week
- DataForge helps data teams write functional transformation pipelines by leveraging software engineering principles ☆48 Updated this week