A write-audit-publish implementation on a data lake without the JVM
☆45 · Aug 12, 2024 · Updated last year
Alternatives and similar repositories for no-jvm-wap-with-iceberg
Users that are interested in no-jvm-wap-with-iceberg are comparing it to the libraries listed below.
- A playground for running DuckDB as a stateless query engine over a data lake · ☆220 · Jan 10, 2024 · Updated 2 years ago
- Data Engineering Projects using Mage.ai as orchestrator · ☆19 · Jan 20, 2026 · Updated 3 months ago
- ☆23 · Feb 5, 2024 · Updated 2 years ago
- Testing various methods of moving Arrow data between processes · ☆17 · Mar 29, 2023 · Updated 3 years ago
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them · ☆25 · Mar 24, 2026 · Updated last month
- Open-source agentic schema CLI. Optimised for Claude Code, Gemini, Codex and Copilot. Skills included. · ☆48 · Mar 20, 2026 · Updated last month
- How to evaluate the Quality of your Data with Great Expectations and Spark. · ☆32 · Mar 29, 2023 · Updated 3 years ago
- SQL query executor on remote DuckDB instance using Apache Arrow Flight RPC through Streamlit Web interface. · ☆25 · Nov 2, 2024 · Updated last year
- Demo repository to lambda-fy your dbt runs · ☆11 · Sep 7, 2023 · Updated 2 years ago
- A DataFusion-powered Serverless S3 Proxy. · ☆17 · Apr 15, 2024 · Updated 2 years ago
- A "modern" Strava data pipeline fueled by dlt, duckdb, dbt, and evidence.dev · ☆40 · May 11, 2025 · Updated 11 months ago
- Transporter for integrating OpenLineage with OpenMetadata · ☆18 · Sep 10, 2025 · Updated 8 months ago
- DuckDB API Server with Arrow Flight SQL Airport support and concurrent writes/reads (quackpipe) · ☆123 · Mar 5, 2025 · Updated last year
- ☆191 · May 21, 2025 · Updated 11 months ago
- 🏟 · ☆28 · Nov 11, 2020 · Updated 5 years ago
- Malloy model examples and associated datasets · ☆22 · Apr 29, 2026 · Updated last week
- Demo repository for running eBPF in GitHub Actions · ☆23 · Mar 27, 2025 · Updated last year
- A library for parsing images in Mojo · ☆20 · Apr 14, 2025 · Updated last year
- Template-based generation of DAG cards from Metaflow classes, inspired by Google cards for machine learning models. · ☆29 · Dec 7, 2021 · Updated 4 years ago
- End to end data engineering project · ☆59 · Oct 27, 2022 · Updated 3 years ago
- GAPandas4 is a Python package for querying the Google Analytics Data API for GA4 and displaying the results in a Pandas dataframe. · ☆34 · Jul 6, 2022 · Updated 3 years ago
- ☆12 · Oct 25, 2023 · Updated 2 years ago
- Apache Spark Connect Client for Rust · ☆116 · Jun 10, 2025 · Updated 11 months ago
- Trainable embedding transformation for confidence estimation, feature extraction, explainability and conversion from dense to sparse. · ☆28 · Jun 9, 2025 · Updated 11 months ago
- ☆13 · Oct 4, 2023 · Updated 2 years ago
- A high-performance data streaming system using DuckDB and Apache Arrow Flight. · ☆96 · Feb 22, 2025 · Updated last year
- A work-in-progress book on Dask · ☆12 · Jul 15, 2023 · Updated 2 years ago
- Datalog implementation in Scala. · ☆12 · Jun 17, 2014 · Updated 11 years ago
- A flake8 plugin that detects usage of withColumn in a loop or inside reduce · ☆28 · Jun 20, 2025 · Updated 10 months ago
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients · ☆37 · Mar 9, 2021 · Updated 5 years ago
- Composable expressions for data · ☆507 · Updated this week
- A demo instance of mage for pulling sample data from a public Google pub/sub topic and transforming with dbt. · ☆12 · Jan 5, 2024 · Updated 2 years ago
- Markify is an open source command line application written in Python which scrapes data from your social media accounts and utilises mark… · ☆13 · Aug 8, 2024 · Updated last year
- Reference implementations and use cases done with bauplan · ☆62 · Mar 30, 2026 · Updated last month
- An implementation of Defeasible Logic in Python · ☆15 · Sep 2, 2018 · Updated 7 years ago
- A data generator for Apache Druid · ☆12 · Mar 26, 2025 · Updated last year
- Content published on social channels · ☆17 · Apr 5, 2025 · Updated last year
- Repo that will help you explore how to build a hybrid workflow using Apache Airflow and Amazon ECS Anywhere · ☆11 · Jul 12, 2022 · Updated 3 years ago
- Build a data pipeline with Apache Airflow · ☆11 · May 7, 2021 · Updated 5 years ago