A write-audit-publish implementation on a data lake without the JVM
☆45 · Aug 12, 2024 · Updated last year
Alternatives and similar repositories for no-jvm-wap-with-iceberg
Users that are interested in no-jvm-wap-with-iceberg are comparing it to the libraries listed below.
- A playground for running duckdb as a stateless query engine over a data lake ☆219 · Jan 10, 2024 · Updated 2 years ago
- ☆22 · Feb 5, 2024 · Updated 2 years ago
- Open-source agentic schema CLI. Optimised for claude code, gemini, codex and co-pilot. Skills included. ☆46 · Mar 20, 2026 · Updated 3 weeks ago
- How to evaluate the Quality of your Data with Great Expectations and Spark. ☆31 · Mar 29, 2023 · Updated 3 years ago
- SQL query executor on remote DuckDB instance using Apache Arrow Flight RPC through Streamlit Web interface. ☆25 · Nov 2, 2024 · Updated last year
- Demo repository to lambda-fy your dbt runs ☆11 · Sep 7, 2023 · Updated 2 years ago
- A DataFusion-powered Serverless S3 Proxy. ☆17 · Apr 15, 2024 · Updated 2 years ago
- A "modern" Strava data pipeline fueled by dlt, duckdb, dbt, and evidence.dev ☆40 · May 11, 2025 · Updated 11 months ago
- Transporter for integrating OpenLineage with OpenMetadata ☆18 · Sep 10, 2025 · Updated 7 months ago
- DuckDB API Server with Arrow Flight SQL Airport support and concurrent writes/reads (quackpipe) ☆122 · Mar 5, 2025 · Updated last year
- ☆189 · May 21, 2025 · Updated 10 months ago
- 🏟 ☆28 · Nov 11, 2020 · Updated 5 years ago
- Demo repository for running eBPF in GitHub Actions ☆23 · Mar 27, 2025 · Updated last year
- Unleash the performance potential of your Parquet files. ☆49 · Feb 24, 2026 · Updated last month
- A library for parsing images in Mojo ☆20 · Apr 14, 2025 · Updated last year
- Helm chart for Lakekeeper - a Rust Native Iceberg REST Catalog ☆24 · Apr 10, 2026 · Updated last week
- Template-based generation of DAG cards from Metaflow classes, inspired by Google cards for machine learning models. ☆29 · Dec 7, 2021 · Updated 4 years ago
- Anki Overdrive API for Python ☆12 · Oct 21, 2017 · Updated 8 years ago
- End to end data engineering project ☆58 · Oct 27, 2022 · Updated 3 years ago
- GAPandas4 is a Python package for querying the Google Analytics Data API for GA4 and displaying the results in a Pandas dataframe. ☆34 · Jul 6, 2022 · Updated 3 years ago
- Building a poor man's data lake: Exploring the Power of Polars and Delta Lake ☆11 · Dec 6, 2025 · Updated 4 months ago
- Apache Spark Connect Client for Rust ☆117 · Jun 10, 2025 · Updated 10 months ago
- ☆13 · Oct 4, 2023 · Updated 2 years ago
- repo with resources from Understanding Data with Alex Merced videos ☆14 · Jan 20, 2024 · Updated 2 years ago
- A high-performance data streaming system using DuckDB and Apache Arrow Flight. ☆96 · Feb 22, 2025 · Updated last year
- A work-in-progress book on Dask ☆12 · Jul 15, 2023 · Updated 2 years ago
- Datalog implementation in Scala. ☆12 · Jun 17, 2014 · Updated 11 years ago
- A flake8 plugin that detects usage of withColumn in a loop or inside reduce ☆28 · Jun 20, 2025 · Updated 9 months ago
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆37 · Mar 9, 2021 · Updated 5 years ago
- This project implements a Lakehouse Medallion Architecture using modern Data Stack tools such as Fivetran, Snowflake and dbt. The fictici… ☆15 · Sep 30, 2024 · Updated last year
- Composable expressions for data pipelines ☆504 · Updated this week
- Personal project for setting up an open source data warehouse. ☆32 · Jul 11, 2025 · Updated 9 months ago
- Markify is an open source command line application written in python which scrapes data from your social media accounts and utilises mark… ☆13 · Aug 8, 2024 · Updated last year
- reference implementations and use cases done with bauplan ☆62 · Mar 30, 2026 · Updated 2 weeks ago
- scraping and querying documents for LLMs ☆24 · Oct 6, 2025 · Updated 6 months ago
- An implementation of Defeasible Logic in Python ☆15 · Sep 2, 2018 · Updated 7 years ago
- ☆21 · Updated this week
- Content published on social channels ☆17 · Apr 5, 2025 · Updated last year
- Data Agents are intelligent assistants built by data engineers to help non-data professionals navigate the organization’s data infrastruc… ☆21 · Apr 14, 2025 · Updated last year