A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in Airflow.
☆25 · Aug 30, 2022 · Updated 3 years ago
Alternatives and similar repositories for GreatEx
Users that are interested in GreatEx are comparing it to the libraries listed below.
- A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from loc… ☆24 · May 14, 2022 · Updated 4 years ago
- Demo on how to use Prefect with Docker ☆27 · Sep 8, 2022 · Updated 3 years ago
- ☆18 · Jan 23, 2026 · Updated 4 months ago
- End to End Sales Streaming Pipeline (FastAPI, Kafka, Spark, Cassandra, MySQL, Superset) ☆10 · May 26, 2023 · Updated 3 years ago
- ✨🎨 Dark theme for Visual Studio Code based on Aura theme with the "spirit" of Dracula ☆20 · Aug 28, 2022 · Updated 3 years ago
- ☆12 · Mar 6, 2021 · Updated 5 years ago
- Data Vault 2.0: Code generation, Vertica, Airflow ☆13 · Nov 20, 2019 · Updated 6 years ago
- ☆25 · Jul 9, 2023 · Updated 2 years ago
- This is a capstone project associated with MLOps Zoomcamp. The end goal of the project is to build an end-to-end machine learning projec… ☆13 · Sep 8, 2022 · Updated 3 years ago
- Open Source Data Contracts In JSON to UNIFY understanding and efforts efficiently ☆16 · Dec 16, 2022 · Updated 3 years ago
- Where the Meltano team runs Meltano! Get it??? ☆31 · Apr 9, 2025 · Updated last year
- dbtVault + Greenplum demo ☆11 · Feb 19, 2024 · Updated 2 years ago
- python package for performing deduplication using flexible text matching and cleaning in pandas dataframe ☆24 · Nov 30, 2020 · Updated 5 years ago
- Simple finite-state machines in Python ☆38 · Apr 26, 2012 · Updated 14 years ago
- Rust parser for Clickhouse SQL dialect. ☆24 · Feb 16, 2022 · Updated 4 years ago
- Airflow 101 course materials ☆15 · Jun 15, 2020 · Updated 5 years ago
- ☆16 · Dec 14, 2021 · Updated 4 years ago
- With everything I learned from DEZoomcamp from datatalks.club, this project performs a batch processing on AWS for the cycling dataset wh… ☆15 · Jan 4, 2026 · Updated 5 months ago
- ☆56 · Jul 30, 2025 · Updated 10 months ago
- Code to be contributed to the Apache Airflow (incubating) project for ETL workflow management for integrating with the Snowflake Data War… ☆26 · Jul 19, 2017 · Updated 8 years ago
- How to evaluate the Quality of your Data with Great Expectations and Spark. ☆32 · Mar 29, 2023 · Updated 3 years ago
- 🦀 Complete roadmap for learning Rust, in Russian, plus a large list of resources. Telegram: t.me/rust_code ☆166 · May 17, 2026 · Updated 3 weeks ago
- Supercharged pandas indexing ☆11 · Mar 28, 2021 · Updated 5 years ago
- Telegram bot for automatic trading on the Tinkoff stock market ☆21 · Apr 26, 2023 · Updated 3 years ago
- ETL jobs that DoltHub maintained that load public data into DoltHub. ☆20 · Mar 7, 2023 · Updated 3 years ago
- Open source stack lakehouse ☆25 · Mar 2, 2024 · Updated 2 years ago
- Netrics - Active Measurements of Internet Performance ☆12 · Sep 14, 2023 · Updated 2 years ago
- Deployment example for a scikit-learn/lightgbm pipeline ☆10 · Feb 28, 2021 · Updated 5 years ago
- ☆14 · Mar 7, 2015 · Updated 11 years ago
- ☆12 · Oct 31, 2023 · Updated 2 years ago
- ☆12 · Jul 27, 2015 · Updated 10 years ago
- A small data lake meant for solitary use ☆16 · Jan 28, 2025 · Updated last year
- Toolkit for Agile-driven data modeling and data loading using highly Normalized hybrid Model ☆23 · Dec 24, 2024 · Updated last year
- Explore Chicago ticket data. ☆10 · Dec 8, 2022 · Updated 3 years ago
- Simple web code editor built with web components libraries ☆15 · Oct 12, 2023 · Updated 2 years ago
- Code and Word2Vec embeddings of LOINC codes for KDD 2019 DSHealth paper "Evaluation of Embeddings of Laboratory Test Codes for Patients a… ☆11 · Jun 13, 2024 · Updated 2 years ago
- Remark plugin for selecting and storing code blocks from markdown. ☆18 · Dec 7, 2022 · Updated 3 years ago
- End-to-end data engineer project ☆24 · Aug 17, 2023 · Updated 2 years ago
- 📝 A reference for the workshop material ☆13 · Jun 29, 2017 · Updated 8 years ago