DAGWorks-Inc / hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

☆2,123

Alternatives and similar repositories for hamilton

Users who are interested in hamilton compare it to the libraries listed below.

stitchfix / hamilton
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
☆861 Updated last year
DAGWorks-Inc / burr
Build applications that make decisions (chatbots, agents, simulations, etc...). Monitor, trace, persist, and execute on your own infrastr…
☆1,595 Updated 3 weeks ago
fugue-project / fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew…
☆2,078 Updated last month
Eventual-Inc / Daft
Distributed data engine for Python/SQL designed for the cloud, powered by Rust
☆2,808 Updated this week
unionai-oss / pandera
A light-weight, flexible, and expressive statistical data testing library
☆3,788 Updated this week
TobikoData / sqlmesh
Scalable and efficient data transformation framework - backwards compatible with dbt.
☆2,293 Updated this week
bytewax / bytewax
Python Stream Processing
☆1,730 Updated last month
dlt-hub / dlt
data load tool (dlt) is an open source Python library that makes data loading easy 🛠️
☆3,567 Updated this week
airbnb / chronon
Chronon is a data platform for serving for AI/ML applications.
☆794 Updated this week
ipyflow / ipyflow
A reactive Python kernel for Jupyter notebooks.
☆1,223 Updated 3 weeks ago
narwhals-dev / narwhals
Lightweight and extensible compatibility layer between dataframe libraries!
☆972 Updated this week
sodadata / soda-core
Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
☆2,083 Updated this week
featureform / featureform
The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
☆1,891 Updated this week
ploomber / ploomber
The fastest ⚡️ way to build data pipelines. Develop iteratively, deploy anywhere. ☁️
☆3,569 Updated 7 months ago
meltano / meltano
Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to wr…
☆2,053 Updated this week
duckdb / dbt-duckdb
dbt (http://getdbt.com) adapter for DuckDB (http://duckdb.org)
☆1,070 Updated last week
ibis-project / ibis
the portable Python dataframe library
☆5,731 Updated this week
sfu-db / connector-x
Fastest library to load data from DB to DataFrames in Rust and Python
☆2,245 Updated this week
fal-ai / dbt-fal
do more with dbt. dbt-fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning m…
☆850 Updated last year
JakobGM / patito
A data modelling layer built on top of polars and pydantic
☆429 Updated 4 months ago
ploomber / jupysql
Better SQL in Jupyter. 📊
☆775 Updated last month
elementary-data / elementary
The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-host…
☆2,059 Updated this week
widgetti / solara
A Pure Python, React-style Framework for Scaling Your Jupyter and Web Apps
☆2,032 Updated this week
dbt-labs / metricflow
MetricFlow allows you to define, build, and maintain metrics in code.
☆1,207 Updated this week
marsupialtail / quokka
Making data lake work for time series
☆1,167 Updated 8 months ago
re-data / re-data
re_data - fix data issues before your users & CEO would discover them 😊
☆1,563 Updated last year
PrefectHQ / marvin
✨ AI agents that spark joy
☆5,689 Updated this week
eakmanrq / sqlframe
Turning PySpark Into a Universal DataFrame API
☆391 Updated this week
quixio / quix-streams
Python Streaming DataFrames for Kafka
☆1,369 Updated this week
dagster-io / dagster-open-platform
Dagster Labs' open-source data platform, built with Dagster.
☆350 Updated last week