whylabs / whylogs-proto
Protobuf definition for WhyLogs format
☆14 Updated 4 years ago
Alternatives and similar repositories for whylogs-proto
Users who are interested in whylogs-proto are comparing it to the libraries listed below.
- Profile and monitor your ML data pipeline end-to-end ☆178 Updated 3 years ago
- A collection of WhyLogs examples in various languages ☆48 Updated last year
- A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton ☆859 Updated 2 years ago
- Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io ☆2,150 Updated this week
- Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets. ☆3,479 Updated last month
- Project files for the post: Running PySpark Applications on Amazon EMR: Methods for Interacting with PySpark on Amazon Elastic MapReduce. ☆38 Updated 2 years ago
- Step-by-step guide to using MLflow in a SageMaker MLOps project ☆13 Updated 3 years ago
- LLMs and Machine Learning done easily ☆439 Updated 2 weeks ago
- A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rew… ☆2,100 Updated 4 months ago
- Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more! ☆588 Updated last week
- Python script to clone SQL dashboard from one workspace to another ☆16 Updated last year
- do more with dbt. dbt-fal helps you run Python alongside dbt, so you can send Slack alerts, detect anomalies and build machine learning m… ☆855 Updated last year
- ELT Code for your Data Warehouse ☆26 Updated last year
- Code for the Data Engineering Zoomcamp ☆47 Updated 2 years ago
- Python API for Deequ ☆788 Updated 4 months ago
- PySpark data-pipeline testing and CICD ☆28 Updated 4 years ago
- Repository with sample code and instructions for "Continuous Intelligence" and "Continuous Delivery for Machine Learning: CD4ML" workshop… ☆320 Updated last year
- Automated data quality suggestions and analysis with Deequ on AWS Glue ☆87 Updated 2 years ago
- Demo for GitHub Universe 2022 ☆12 Updated 2 years ago
- SQLAlchemy dialect for Databricks ☆20 Updated 2 years ago
- Ingesting data with Pulumi, AWS lambdas and Snowflake in a scalable, fully replayable manner ☆71 Updated 3 years ago
- re_data - fix data issues before your users & CEO would discover them 😊 ☆1,568 Updated last year
- Using OpenAI with Databricks SQL for queries in natural language ☆22 Updated 2 years ago
- Load data from redshift into a pandas DataFrame and vice versa. ☆139 Updated 2 years ago
- Data ingestion library for Amundsen to build graph and search index ☆204 Updated last year
- Example task UIs for Amazon SageMaker Ground Truth ☆110 Updated last month
- The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-host… ☆2,124 Updated this week
- Move fast from data science prototype to pipeline. Capture, analyze, and transform messy notebooks into data pipelines with just two line… ☆668 Updated 5 months ago
- This is a guide to PySpark code style presenting common situations and the associated best practices based on the most frequent recurring… ☆1,168 Updated 10 months ago
- VSCode Dev Container template for AWS Glue jobs development ☆19 Updated last year