How to evaluate the Quality of your Data with Great Expectations and Spark.
☆31 · Mar 29, 2023 · Updated 3 years ago
Alternatives and similar repositories for hands-on-great-expectations-with-spark
Users who are interested in hands-on-great-expectations-with-spark are comparing it to the libraries listed below.
- In-Session Personalization Workshop for eCommerce, April 2021, and the MICES Workshop in June 2021. ☆24 · Jun 29, 2021 · Updated 4 years ago
- Demo repository to lambda-fy your dbt runs ☆11 · Sep 7, 2023 · Updated 2 years ago
- Serve a 1x1 GIF pixel from an AWS lambda-powered endpoint ☆13 · Sep 7, 2017 · Updated 8 years ago
- A dbt package to run natural language queries ☆10 · Jan 13, 2023 · Updated 3 years ago
- Unleash the performance potential of your Parquet files. ☆52 · Feb 24, 2026 · Updated 2 months ago
- ☆11 · Oct 5, 2022 · Updated 3 years ago
- A write-audit-publish implementation on a data lake without the JVM ☆45 · Aug 12, 2024 · Updated last year
- Anki Overdrive API for Python ☆12 · Oct 21, 2017 · Updated 8 years ago
- ☆22 · Mar 31, 2022 · Updated 4 years ago
- ☆12 · Oct 25, 2023 · Updated 2 years ago
- Testing various methods of moving Arrow data between processes ☆16 · Mar 29, 2023 · Updated 3 years ago
- A Node.js tool to examine the correctness of Open Data Metadata and build custom dataset profiles ☆12 · Sep 26, 2023 · Updated 2 years ago
- An implementation of Defeasible Logic in Python ☆15 · Sep 2, 2018 · Updated 7 years ago
- Reference implementations and use cases done with bauplan ☆62 · Mar 30, 2026 · Updated last month
- A fast and simple JavaScript library specifically targeted at collecting search and search related browser events. ☆43 · Nov 20, 2025 · Updated 5 months ago
- A Python library for anomaly detection ☆13 · Aug 28, 2017 · Updated 8 years ago
- Slides and notebooks for my tutorial at PyData London 2018 ☆21 · Jul 2, 2018 · Updated 7 years ago
- ☆23 · Jun 28, 2022 · Updated 3 years ago
- ☆14 · Mar 11, 2023 · Updated 3 years ago
- Matching between unstructured and structured data sets ☆14 · Jul 20, 2018 · Updated 7 years ago
- ☆13 · Feb 18, 2022 · Updated 4 years ago
- ETL processing toolset with SQL-like language and GIS capabilities, built on core Spark. Extensible and modular. REPL included ☆16 · Jan 26, 2026 · Updated 3 months ago
- Demonstration of how dedupe might be used as geocoder ☆17 · Jun 21, 2022 · Updated 3 years ago
- This example shows how to run Anychart library with the Scala programming language using Akka Http and MySQL. ☆11 · Dec 21, 2017 · Updated 8 years ago
- This R package provides functions to download survey response and email campaign data from Survey Gizmo, saving the returned data as an R… ☆11 · Jan 6, 2021 · Updated 5 years ago
- Nyancat in your terminal! ☆14 · May 29, 2018 · Updated 7 years ago
- Materials for the Airflow 101 course ☆15 · Jun 15, 2020 · Updated 5 years ago
- Official Repository for EvalRS @ KDD 2023: a Rounded Evaluation of Recommender Systems ☆30 · Feb 16, 2024 · Updated 2 years ago
- This is a simple ipynb file that strips a VTT (transcript file) that is usually produced together with video recordings from tools such a… ☆11 · Oct 23, 2022 · Updated 3 years ago
- A practical introduction to artificial neural networks. ☆28 · Jun 25, 2017 · Updated 8 years ago
- PROVED (PRocess mining OVer uncErtain Data) is a library of functionalities to perform process mining on uncertain event data. ☆12 · Apr 27, 2026 · Updated last week
- ☆29 · Jan 10, 2024 · Updated 2 years ago
- Spark (PySpark) script that applies dynamic time warping to energy usage data (using the Python fastdtw package) ☆15 · Oct 22, 2016 · Updated 9 years ago
- Template-based generation of DAG cards from Metaflow classes, inspired by Google cards for machine learning models. ☆29 · Dec 7, 2021 · Updated 4 years ago
- ☆17 · Sep 12, 2020 · Updated 5 years ago
- ☆16 · Feb 12, 2025 · Updated last year
- htmlwidgets for rCharts + dimple ☆27 · Oct 14, 2015 · Updated 10 years ago
- ☆17 · Jun 16, 2020 · Updated 5 years ago
- A PaaS End-to-End ML Setup with Metaflow, Serverless and SageMaker. ☆37 · Feb 10, 2021 · Updated 5 years ago