☆23Feb 5, 2024Updated 2 years ago
Alternatives and similar repositories for apache-iceberg-data-exploration
Users that are interested in apache-iceberg-data-exploration are comparing it to the libraries listed below.
- ☆15Oct 10, 2025Updated 6 months ago
- A minimal docker compose setup for experimenting with cloud agnostic Lakehouse Architectures Apache Spark with Hive Metastore + Delta Lak…☆34Apr 17, 2024Updated 2 years ago
- A write-audit-publish implementation on a data lake without the JVM☆45Aug 12, 2024Updated last year
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them☆25Mar 24, 2026Updated last month
- all-in-one-docker-bigdataops is a comprehensive Docker Compose environment that simplifies Big Data operations by bundling Hadoop, Spark,…☆21Feb 9, 2025Updated last year
- An end-to-end, containerized data pipeline for near-real-time user event analytics using Kafka, ClickHouse, Airflow, and PySpark. Made to…☆78Sep 12, 2025Updated 7 months ago
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin…☆78Sep 2, 2023Updated 2 years ago
- A data generator for Apache Druid☆12Mar 26, 2025Updated last year
- A Python package that creates fine-grained dbt tasks on Apache Airflow☆84Mar 9, 2026Updated last month
- Repo for learning DBT with Snowflake, featuring projects and models for data transformation and automation☆26Mar 31, 2025Updated last year
- Amazon EMR Serverless and Amazon MSK Serverless Demo☆13Jul 31, 2022Updated 3 years ago
- Tools for Microsoft Fabric☆25Jul 17, 2025Updated 9 months ago
- ☆10Jul 21, 2022Updated 3 years ago
- ☆41Jul 4, 2022Updated 3 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆31Updated this week
- Flink image for Kubernetes that fixes JobManager connection issue☆26Jul 31, 2018Updated 7 years ago
- pip installable duckdb extensions published to pypi☆40Mar 29, 2026Updated last month
- ☆16May 29, 2023Updated 2 years ago
- ☆11Mar 7, 2021Updated 5 years ago
- Code to convert static datasets into simulated data streams☆15Apr 6, 2023Updated 3 years ago
- Serverless costs calculator for AWS Lambda☆12Oct 21, 2020Updated 5 years ago
- Data Engineering Projects using Mage.ai as orchestrator☆19Jan 20, 2026Updated 3 months ago
- On-premises ELT Pipeline☆32Jul 10, 2025Updated 9 months ago
- Postgresql configured to work as metastore for Hive.☆32Dec 16, 2022Updated 3 years ago
- Demo of orchestrating Airbyte connections with Prefect☆11Mar 3, 2022Updated 4 years ago
- ☆11Jan 17, 2024Updated 2 years ago
- ☆12Aug 13, 2024Updated last year
- Delta-Lake, ETL, Spark, Airflow☆49Oct 9, 2022Updated 3 years ago
- Terraform module for deploying the Prefect Agent on AWS EC2☆13Aug 20, 2025Updated 8 months ago
- Toy Hadoop cluster combining various SQL-on-Hadoop variants☆13Nov 16, 2017Updated 8 years ago
- Generates a tree of an S3 bucket contents☆11Sep 18, 2020Updated 5 years ago
- Scalable Batch and Stream Data Processing☆30Aug 21, 2024Updated last year
- git push your data stack with Airbyte, Airflow, and dbt - 2022 Airflow Summit☆53May 12, 2023Updated 2 years ago
- Visual Studio Code Server on Azure Web App for Containers☆10Apr 12, 2019Updated 7 years ago
- Automated ML pipeline with Python, Docker, Luigi, SciKit-Learn and Pandas to predict wine quality ratings☆18May 30, 2020Updated 5 years ago
- Flink Example☆17Nov 19, 2023Updated 2 years ago
- Pipeline, warehouse, and visualization tools for investigating the impact of Airbnb short-term rentals on world cities.☆14Jun 9, 2023Updated 2 years ago
- Python manager for spark-submit jobs☆10Jan 6, 2024Updated 2 years ago
- Online Simultaneous Localization and Mapping in ROS☆11Jan 31, 2019Updated 7 years ago