sibytes/yetl

Yet Another (Spark) ETL Framework

☆21

Alternatives and similar repositories for yetl

Users interested in yetl compare it to the libraries listed below.

databrickslabs / sandbox
Experimental labs projects
☆76 · Updated this week
delta-io / website
Delta Lake Website
☆25 · Jul 8, 2026 · Updated 2 weeks ago
gbrueckl / Azure.DataFactory.PowerBIMonitor
This PowerBI template connects to the Azure Data Factory API to get information about the current status of your Datasets and Slices
☆22 · Apr 20, 2018 · Updated 8 years ago
okube-ai / laktory
A DataOps framework for building a lakehouse.
☆57 · Jul 14, 2026 · Updated last week
MartijnVisser / flink-only-sql
Traditionally, engineers were needed to implement business logic via data pipelines before business users can start using it. Using this …
☆12 · Jul 16, 2026 · Updated last week
dan1elt0m / unitycatalog-pydantic
Manage Unity Catalog tables with Pydantic Models
☆10 · Mar 5, 2025 · Updated last year
christophermschmidt / monitor
Open Log Analytics queries and samples on querying different Azure resources and services. Includes sample Power BI reports
☆12 · Mar 31, 2022 · Updated 4 years ago
confluentinc / learn-apache-flink-101-exercises
☆13 · Dec 5, 2025 · Updated 7 months ago
holdenk / sparklingml
Machine Learning Pipeline Stages for Spark (exposed in Scala/Java + Python)
☆16 · Oct 14, 2019 · Updated 6 years ago
xdanny / pyspark_types
Map your python dataclasses to pyspark types
☆10 · Feb 11, 2024 · Updated 2 years ago
yugabyte / spring-data-yugabytedb
Spring Data Module for YugabyteDB.
☆18 · Aug 30, 2021 · Updated 4 years ago
EladLeev / schema-registry-statistics
Schema Registry Statistics Tool
☆24 · Updated this week
algattik / databricks_test
A unit test framework for Databricks notebooks
☆12 · Dec 8, 2020 · Updated 5 years ago
markprycemaher / Synapse
Scripts for Azure Synapse SQL Pools (Provisioned) and Query-on-Demand (Serverless)
☆11 · Nov 2, 2021 · Updated 4 years ago
jamesshocking / collapse-spark-dataframe
Python code that will collapse structured columns separating out the attributes into new columns
☆10 · Mar 15, 2022 · Updated 4 years ago
nteract / coffee_boat
☕⛵WIP PySpark dependency management
☆22 · Jul 8, 2018 · Updated 8 years ago
databrickslabs / dbldatagen
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used …
☆484 · Updated this week
spetlr-org / spetlr
A python SPark ETL libRary (SPETLR) for Databricks. https://discord.gg/p9bzqGybVW
☆24 · Mar 3, 2026 · Updated 4 months ago
TaoYang-Cloud / AzureKeyVaultPasswordRepo-PSModule
☆15 · Apr 19, 2018 · Updated 8 years ago
yokawasa / code-server-azure-webapp
Visual Studio Code Server on Azure Web App for Containers
☆10 · Apr 12, 2019 · Updated 7 years ago
DrJohnT / DeployCube
Publish / Deploy a Tabular or Multidimensional Cube to SSAS or AAS
☆11 · Jul 14, 2025 · Updated last year
edwardcapriolo / hive-protobuf
Protobuf input format and Serde support
☆18 · Mar 2, 2013 · Updated 13 years ago
asuiu / SparkORM
ORM for Apache Spark and DataFrames schema manager
☆16 · Jun 24, 2024 · Updated 2 years ago
delta-io / delta-docs
Delta Lake Documentation
☆54 · Jun 19, 2024 · Updated 2 years ago
jwills / nba_monte_carlo
The Modern Data Stack in a (Smaller) Box
☆12 · Jan 28, 2023 · Updated 3 years ago
cvxgrp / robust_bond_portfolio
Robust Bond Portfolio Construction via Convex-Concave Saddle Point Optimization
☆14 · May 13, 2024 · Updated 2 years ago
dominikhei / Local-Data-LakeHouse
Sample Data Lakehouse deployed in Docker containers using Apache Iceberg, Minio, Trino and a Hive Metastore. Can be used for local testin…
☆82 · Sep 2, 2023 · Updated 2 years ago
apicrafter / datacrafter
NoSQL extract, transform, load (ETL) toolkit with Python
☆16 · Jul 17, 2026 · Updated last week
mrpowers-io / levi
Delta Lake helper methods. No Spark dependency.
☆22 · Jan 19, 2026 · Updated 6 months ago
dotlas / databricks_helpers
🧱 A collection of supplementary utilities and helper notebooks to perform admin tasks on Databricks
☆57 · Jul 4, 2025 · Updated last year
djouallah / Testing_BI_Engine
TPC-H_SF10
☆53 · Jan 20, 2025 · Updated last year
iqmo-org / magic_duckdb
Jupyter Cell / Line Magics for DuckDB
☆59 · Apr 10, 2026 · Updated 3 months ago
gatsby-contrib / gatsby-transformer-ipynb
Gatsby transformer plugin for jupyter notebooks
☆10 · Jan 7, 2019 · Updated 7 years ago
passbolt / passbolt_install_scripts
Passbolt CE installation scripts
☆19 · Mar 16, 2021 · Updated 5 years ago
Klopfe / LSVR
Python packages for Support Vector Regression with Linear Constraints
☆10 · Jul 9, 2020 · Updated 6 years ago
Query-farm / copilot-extension-duckdb
DuckDB Copilot Extension
☆10 · Jan 12, 2026 · Updated 6 months ago
hicknhack-software / KeeShare
A password sharing plugin for KeePass.
☆17 · Aug 31, 2019 · Updated 6 years ago
andrewrosemberg / PortfolioOpt.jl
Portfolio optimization
☆16 · Updated this week
ZoinerTejada / mastering-azure-analytics
Repository for code samples from the book Mastering Azure Analytics
☆25 · Apr 10, 2017 · Updated 9 years ago