A Docker Compose template that builds an interactive development environment for PySpark with Jupyter Lab, MinIO as object storage, Hive Metastore, Trino and Kafka
☆47 · Dec 19, 2024 · Updated last year
Alternatives and similar repositories for lasagna
Users that are interested in lasagna are comparing it to the libraries listed below.
- Trino + Hive + MinIO with Postgres in Docker Compose ☆27 · Aug 18, 2023 · Updated 2 years ago
- Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, MinIO, Trino and a Hive Metastore. Can be used for local testin… ☆78 · Sep 2, 2023 · Updated 2 years ago
- Arxiv + Notion Sync ☆20 · May 12, 2025 · Updated last year
- Forensic Reconstruction of Severely Degraded License Plates, Electronic Imaging, 2019. ☆18 · Apr 27, 2022 · Updated 4 years ago
- Open source stack lakehouse ☆25 · Mar 2, 2024 · Updated 2 years ago
- Repository for the Stack Academy Data Engineering Bootcamp. ☆44 · Feb 10, 2023 · Updated 3 years ago
- Generate DBT tests based on sample data ☆39 · Feb 28, 2024 · Updated 2 years ago
- Uses Airflow, Postgres, Kafka, Spark, Cassandra, and GitHub Actions to establish an end-to-end data pipeline ☆31 · Oct 25, 2023 · Updated 2 years ago
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC) ☆66 · Sep 23, 2023 · Updated 2 years ago
- Simple demo using "behave" and "pyspark" libraries to test data transformations in a human-readable way ☆10 · Apr 5, 2019 · Updated 7 years ago
- Overall data governance architecture ☆10 · Nov 11, 2019 · Updated 6 years ago
- Docker compose and Google Colab demo to build a CDC with Delta Lake ☆15 · Sep 7, 2022 · Updated 3 years ago
- An intelligent agent system based on FastAPI and React, supporting multi-agent management, MCP management, knowledge bases, chat conversations, and other features. ☆24 · Jan 25, 2026 · Updated 4 months ago
- Cloud-native Trino (prestosql) + Hive + Minio + Superset ☆23 · Nov 29, 2021 · Updated 4 years ago
- Learning to use Git, from basic to advanced. ☆13 · Nov 3, 2019 · Updated 6 years ago
- Community Eventing and Scripting examples ☆19 · Aug 11, 2025 · Updated 9 months ago
- ☆18 · Jun 16, 2024 · Updated last year
- This is a sample project that demonstrates a data ETL (Extract, Transform, Load) process using Python, Polars and AWS Loca… ☆15 · Sep 25, 2023 · Updated 2 years ago
- Winning solution and its further development for MICCAI 2017 Endoscopic Vision Challenge Angiodysplasia Detection and Localization ☆16 · Jul 3, 2019 · Updated 6 years ago
- Ingest data from a Kafka topic into a ClickHouse table (JSON format) ☆24 · Apr 12, 2018 · Updated 8 years ago
- LobotoMl is a set of scripts and tools to assess production deployments of ML services ☆10 · May 16, 2022 · Updated 4 years ago
- A minimal Python wrapper around the App Center REST API ☆24 · May 20, 2026 · Updated last week
- Pair Trading Analysis & Exercises Toolkit [Jupyter Notebook] ☆12 · Nov 3, 2023 · Updated 2 years ago
- Standalone Apache Spark installer for Linux systems (Debian- and RHEL-based) ☆13 · Dec 10, 2024 · Updated last year
- Code for the paper: Kernel Distributionally Robust Optimization ☆13 · Feb 21, 2021 · Updated 5 years ago
- PetitPotam fork with Kerberos support in the impacket script ☆17 · Aug 3, 2021 · Updated 4 years ago
- Instructions and code for the workshop "From Big Data to NLP Insights: Unlocking the Power of PySpark and Spark NLP" ☆12 · May 9, 2023 · Updated 3 years ago
- Test data management tool for any data source, batch or real-time. Generate, validate and clean up data all in one tool. ☆81 · Feb 14, 2026 · Updated 3 months ago
- ☆23 · May 18, 2026 · Updated last week
- Real-time Data Warehouse with Apache Flink & Apache Kafka & Apache Hudi ☆120 · Dec 15, 2023 · Updated 2 years ago
- An HTTP proxy that naively injects NTLM data for the current user into outgoing requests ☆14 · Nov 14, 2018 · Updated 7 years ago
- Writeups & Walkthroughs of various CTF challenges and boxes ☆14 · Aug 11, 2021 · Updated 4 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment. ☆38 · Mar 29, 2021 · Updated 5 years ago
- Rope collision in C++ ☆12 · Jun 2, 2025 · Updated 11 months ago
- A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in … ☆25 · Aug 30, 2022 · Updated 3 years ago
- PyTorch directly integrated to the cloud all through Bench AI! ☆10 · Dec 10, 2023 · Updated 2 years ago
- ☆26 · Aug 19, 2021 · Updated 4 years ago
- Particle Syntax Website ☆16 · Apr 12, 2026 · Updated last month
- Classify images of different kitchenware items ☆11 · Apr 17, 2023 · Updated 3 years ago