A PySpark library to validate data quality
☆19Nov 11, 2022Updated 3 years ago
Alternatives and similar repositories for owl-data-sanitizer
Users that are interested in owl-data-sanitizer are comparing it to the libraries listed below.
- This is a fork of the Apache Flink Kinesis connector adding Enhanced Fanout support for Flink 1.8/1.11 on KDA.☆24Mar 1, 2026Updated last month
- Access Amazon's AWS Athena API via reticulate and AWS official Python boto3 module☆10Sep 24, 2018Updated 7 years ago
- ☆18Updated this week
- Multi-stage, config driven, SQL based ETL framework using PySpark☆26Sep 16, 2019Updated 6 years ago
- A python package to create a database on the platform using our moj data warehousing framework☆21Mar 16, 2026Updated last month
- Creating Debian Packages from CRAN Sources☆12Jul 1, 2020Updated 5 years ago
- ☆11Oct 11, 2022Updated 3 years ago
- Utilities for Asyncpg☆15Jan 24, 2019Updated 7 years ago
- ☆10Jun 29, 2021Updated 4 years ago
- ☆18Jun 5, 2023Updated 2 years ago
- ☆12Oct 16, 2023Updated 2 years ago
- A Scalable Data Cleaning Library for PySpark.☆29Apr 4, 2019Updated 7 years ago
- ☆15Dec 10, 2015Updated 10 years ago
- Data validation library for PySpark 3.0.0☆33Nov 11, 2022Updated 3 years ago
- A CLI to manage and monitor permissions in AWS Lake Formation☆25Feb 8, 2023Updated 3 years ago
- Clojure library to explore inversion of control technique - in several senses.☆10May 14, 2024Updated last year
- Optimizing downstream data processing with Amazon Kinesis Data Firehose and Amazon EMR running Apache Spark☆14Apr 14, 2023Updated 3 years ago
- Basic Spark utilities☆13Feb 20, 2025Updated last year
- Guide on how to setup Apache Airflow containers using Docker and IBM Bluemix☆11Feb 19, 2018Updated 8 years ago
- Yopass CLI☆15Mar 30, 2026Updated 2 weeks ago
- This is a mirror of https://github.com/LucaCanali/sparkMeasure - sparkMeasure is a tool for performance troubleshooting of Apache Spark w…☆16Oct 3, 2025Updated 6 months ago
- API REST boilerplate using Spring Boot and Redis as database☆13Dec 26, 2018Updated 7 years ago
- Inline insight into the history of your code☆11Jul 2, 2019Updated 6 years ago
- Lua grammar for tree-sitter.☆11Dec 9, 2021Updated 4 years ago
- Due to lack of resources on how to deploy kafka with simple SASL authentication (just username and password) and how to write producer an…☆12Dec 29, 2021Updated 4 years ago
- ☆10Dec 13, 2014Updated 11 years ago
- An LLM-powered chatbot with the added context of the dbt knowledge base.☆39Dec 4, 2024Updated last year
- R package for formatting ggplot2 charts and applying MoJ corporate colours.☆17Nov 7, 2024Updated last year
- Example to create lineage in Atlas with sqoop and spark☆14Apr 5, 2017Updated 9 years ago
- A Vim-based editor for Nim☆15Aug 31, 2023Updated 2 years ago
- Time series forecasting for common inflators and economic indices using the forecast package in R.☆10Feb 28, 2017Updated 9 years ago
- A dynamic data completeness and accuracy library at enterprise scale for Apache Spark☆30Updated this week
- An open-source synthetic population of individuals and households at a fine geographical level (DA) for Canada for the years 2021, 2023 a…☆10Jan 26, 2023Updated 3 years ago
- A collection of python utility functions☆11Mar 30, 2026Updated 2 weeks ago
- IntelliJ IDEA plugin which converts your Scala code to a Kotlin one☆13Sep 25, 2018Updated 7 years ago
- A Helm Chart for Apache Airflow☆14Oct 17, 2018Updated 7 years ago
- Spark Structured Streaming JDBC Sink☆16Apr 26, 2021Updated 4 years ago
- The privacy-preserving record linkage toolkit: a proof-of-concept public demo of next-gen data linkage techniques.☆16May 22, 2024Updated last year
- ☆17Mar 7, 2023Updated 3 years ago