vincentclaes / datajob
Build and deploy a serverless data pipeline on AWS with no effort.
☆111 · Updated 2 years ago
Alternatives and similar repositories for datajob
Users who are interested in datajob are comparing it to the libraries listed below.
- This repo will teach you how to deploy an ML-powered web app to AWS Fargate from start to finish using Streamlit and AWS CDK ☆108 · Updated 4 years ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆80 · Updated last year
- You're one command away from deploying your Streamlit app on AWS Fargate! ☆47 · Updated 4 years ago
- Example templates for the delivery of custom ML solutions to production so you can get started quickly without having to make too many de… ☆74 · Updated last year
- Tools to run Jupyter notebooks as jobs in Amazon SageMaker - ad hoc, on a schedule, or in response to events ☆144 · Updated last year
- Step Functions Data Science SDK for building machine learning (ML) workflows and pipelines on AWS ☆292 · Updated 4 months ago
- Streamlit EDA Dashboard Powered by AWS Cloud ☆82 · Updated 2 months ago
- ☆73 · Updated last year
- A VS Code Extension to make it easier to manage and develop Spark jobs on EMR ☆38 · Updated 6 months ago
- Spark runtime on AWS Lambda ☆109 · Updated 3 weeks ago
- Fake Pandas / PySpark DataFrame creator ☆48 · Updated last year
- This sample demonstrates how to setup an Amazon SageMaker MLOps end-to-end pipeline for Drift detection ☆62 · Updated last year
- Deploy production-grade Metaflow cloud infrastructure on AWS ☆66 · Updated 3 months ago
- Ingesting data with Pulumi, AWS lambdas and Snowflake in a scalable, fully replayable manner ☆71 · Updated 3 years ago
- This repository contains the dbt-glue adapter ☆131 · Updated this week
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated last year
- Write python locally, execute SQL in your data warehouse ☆270 · Updated 3 years ago
- A guide for leading a data (engineering) team ☆64 · Updated last year
- Open innovation with 60 minute cloud experiments on AWS ☆88 · Updated last year
- A simple and easy to use Data Quality (DQ) tool built with Python. ☆50 · Updated last year
- A CLI to manage and monitor permissions in AWS Lake Formation ☆26 · Updated 2 years ago
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio. ☆52 · Updated last year
- Playground for using large language models into the Modern Data Stack for entity matching ☆108 · Updated 2 years ago
- A Data Platform built for AWS, powered by Kubernetes. ☆148 · Updated 2 years ago
- Template for a modular, Python-based data science project. ☆39 · Updated last year
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆126 · Updated 4 years ago
- Docker images that replicate the Amazon SageMaker Notebook instance. ☆58 · Updated 3 years ago
- Composable filesystem hooks and operators for Apache Airflow. ☆17 · Updated 4 years ago
- Dask integration for Snowflake ☆30 · Updated 3 weeks ago
- ☆30 · Updated last year