rsyi / whale
🐳 The stupidly simple CLI workspace for your data warehouse.
★726 Updated 2 years ago
Alternatives and similar repositories for whale:
Users who are interested in whale are comparing it to the libraries listed below.
- re_data - fix data issues before your users & CEO would discover them ★1,565 Updated 9 months ago
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ★2,016 Updated this week
- Writes the Singer format from Python ★548 Updated 4 months ago
- This repository is a getting started guide to Singer. ★1,287 Updated 5 months ago
- Data ingestion library for Amundsen to build graph and search index ★205 Updated 11 months ago
- Python API for Deequ ★743 Updated 4 months ago
- Notebook sharing hub ★496 Updated last year
- Tool to automate data quality checks on data pipelines ★254 Updated 2 years ago
- Apache Airflow integration for dbt ★401 Updated 9 months ago
- Repository for the ActivitySchema spec and supporting materials ★410 Updated 2 years ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ★2,040 Updated 4 months ago
- Data Pipeline Framework using the singer.io spec ★648 Updated 2 weeks ago
- A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton ★861 Updated last year
- Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tupl… ★809 Updated 10 months ago
- dbt + Metabase integration ★493 Updated this week
- Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to b… ★794 Updated 2 years ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow. ★166 Updated last year
- Guides and docs to help you get up and running with Apache Airflow. ★805 Updated 2 years ago
- The metrics layer for your data. Join us at https://metriql.com/slack ★304 Updated last year
- Generate and Visualize Data Lineage from query history ★319 Updated last year
- A Machine Learning System for Data Enrichment. ★519 Updated last year
- Macros that generate dbt code ★523 Updated 3 weeks ago
- A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex) ★140 Updated last year
- MetricFlow allows you to define, build, and maintain metrics in code. ★1,180 Updated this week
- Build and share data reports in 100% Python ★1,388 Updated last year
- Schema modelling framework for decentralised domain-driven ownership of data. ★250 Updated last year
- do more with dbt. dbt-fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning m… ★854 Updated 10 months ago
- Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark ★1,494 Updated 2 months ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ★123 Updated 3 years ago
- dbt macros to stage external sources ★321 Updated last month