Automated data quality suggestions and analysis with Deequ on AWS Glue
☆91Dec 29, 2022Updated 3 years ago
Alternatives and similar repositories for amazon-deequ-glue
Users that are interested in amazon-deequ-glue are comparing it to the libraries listed below.
- Python API for Deequ☆41Nov 10, 2020Updated 5 years ago
- Python API for Deequ☆815Mar 9, 2026Updated last month
- ☆12Oct 16, 2023Updated 2 years ago
- ☆23Oct 3, 2024Updated last year
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.☆3,605Apr 1, 2026Updated last week
- Replication utility for AWS Glue Data Catalog☆79Aug 8, 2024Updated last year
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark☆14Apr 14, 2023Updated 3 years ago
- Jenga is an experimentation library that allows data science practitioners and researchers to study the effect of common data corruptio…☆43Jun 21, 2023Updated 2 years ago
- Streaming ETL with Apache Flink and Amazon Kinesis Data Analytics☆65Oct 17, 2023Updated 2 years ago
- ☆11Oct 11, 2022Updated 3 years ago
- ☆157Feb 29, 2024Updated 2 years ago
- A tool to automate analytic platform evaluations. Barometer helps customers to get data points needed for service selection/service confi…☆19Jun 3, 2024Updated last year
- An open source development framework to help you build data workflows and modern data architecture on AWS.☆271Feb 9, 2026Updated 2 months ago
- ☆12Aug 9, 2024Updated last year
- A Singer.io Target for Snowflake☆11Jun 9, 2023Updated 2 years ago
- This repository contains ready-to-use notebook examples for a wide variety of use cases in Amazon EMR Studio.☆53Oct 31, 2023Updated 2 years ago
- CLI tool for searching CloudTrail events using fuzzy search☆18Feb 21, 2023Updated 3 years ago
- Operational Data Processing Framework developed using AWS Glue and Apache Hudi. This framework is suitable for Data Lake and Modern Data …☆24Sep 6, 2023Updated 2 years ago
- A code-free AutoML pipeline with AutoGluon, Amazon SageMaker, and AWS Lambda.☆11Aug 5, 2021Updated 4 years ago
- Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the…☆245Mar 25, 2026Updated 2 weeks ago
- ☆17Jul 21, 2025Updated 8 months ago
- Amazon Kinesis Data Analytics Flink Starter Kit helps you with the development of Flink Application with Kinesis Stream as a source and A…☆47Aug 30, 2023Updated 2 years ago
- Enterprise-grade, production-hardened, serverless data lake on AWS☆479Oct 1, 2025Updated 6 months ago
- Amazon SageMaker MLOps deployment pipeline for A/B Testing of machine learning models.