baligoyem / dataqtor
Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it
☆16 · Updated 2 years ago
Alternatives and similar repositories for dataqtor
Users that are interested in dataqtor are comparing it to the libraries listed below
- DataHub on AWS demonstration resources ☆10 · Updated 2 years ago
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆57 · Updated 3 years ago
- The sane way of building a data layer in Airflow ☆24 · Updated 5 years ago
- Using the Parquet file format with Python ☆15 · Updated last year
- A collection of python utility functions ☆11 · Updated 11 months ago
- ☆15 · Updated last year
- This repository contains code to build an MVP search engine with a Google-like interface. ☆15 · Updated 2 weeks ago
- dlt-dagster-demo ☆11 · Updated last year
- A few end to end examples that use data-describe ☆16 · Updated 2 years ago
- Skeleton project for Apache Airflow training participants to work on. ☆16 · Updated 4 years ago
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them ☆26 · Updated last year
- Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients ☆36 · Updated last year
- Code to help generate SQL for stakeholders. Code at https://www.startdataengineering.com/post/data-democratize-llm/ ☆11 · Updated last year
- ☆29 · Updated last year
- ☆11 · Updated 6 months ago
- Demo converting streamlit uber nyc rides to use duckdb ☆29 · Updated 2 years ago
- Simple samples for writing ETL transform scripts in Python ☆23 · Updated 3 years ago
- This repo demonstrates how to load a sample Parquet formatted file from an AWS S3 Bucket. A python job will then be submitted to a Apach… ☆19 · Updated 9 years ago
- Flask based UI for displaying & segmenting a single database table ☆15 · Updated 3 years ago
- Building 3D Trusted Data Pipelines With Dagster, Dbt, and Duckdb ☆21 · Updated last year
- Documentation and resources for deploying JupyterHub on Hadoop ☆19 · Updated 5 years ago
- A template for an AWS Lambda function that triggers Prefect Flow Runs ☆20 · Updated 3 years ago
- This repository auto-configures an Apache Pinot and Superset cluster for analyzing IRA tweets from FiveThirtyEight. ☆11 · Updated 4 years ago
- NoETL (Not Only ETL) is a workflow management system designed to enable AI and machine learning functionality. ☆11 · Updated last week
- Example Set up For DBT Cloud using Github Integrations ☆11 · Updated 5 years ago
- Full stack data engineering tools and infrastructure set-up ☆53 · Updated 4 years ago
- Run, schedule, and manage your dbt jobs using Kubernetes. ☆24 · Updated 6 years ago
- Cookiecutter template for testing Python scikit-learn classifiers. ☆36 · Updated last year
- bamboolib - template for creating your own binder notebook ☆21 · Updated 3 years ago
- Demo of Hydra ☆18 · Updated 3 years ago