getindata / quickstart-ml-blueprints
Data science project development best practices and state-of-the-art open-source tooling forged into a set of solved ML use cases to serve as blueprints for efficient prototyping.
☆17Updated last year
Alternatives and similar repositories for quickstart-ml-blueprints:
Users that are interested in quickstart-ml-blueprints are comparing it to the libraries listed below
- Recipes of publicly-available Jupyter images☆8Updated 4 months ago
- Workshop "From zero to MLOps: An open source stack to fight spaghetti ML"☆24Updated 7 months ago
- Render Jupyter Notebooks With Metaflow Cards☆25Updated 4 months ago
- Kedro Plugin to support running pipelines on Kubernetes using Airflow.☆28Updated last year
- Delta reader for the Ray open-source toolkit for building ML applications☆45Updated last year
- Kedro Plugin to support running workflows on Kubeflow Pipelines☆53Updated 5 months ago
- This repository contains code to build an MVP search engine with a Google-like interface.☆15Updated 4 years ago
- Demos for Nessie. Nessie provides Git-like capabilities for your Data Lake.☆28Updated this week
- Events about the open source data stack☆13Updated 2 years ago
- A few end to end examples that use data-describe☆16Updated last year
- Personal Finance Project to automatically collect Swiss banking transactions into a DWH and visualise them☆26Updated 11 months ago
- Playground site for creating/validating data contracts☆9Updated 5 months ago
- hooqu is a library built on top of Pandas-like Dataframes for defining "unit tests for data". This is a spiritual port of Apache Deequ to…☆26Updated 2 months ago
- Test data management tool for any data source, batch or real-time. Generate, validate and clean up data all in one tool.☆47Updated 3 weeks ago
- Code for data quality with Great Expectations blog☆12Updated 6 months ago
- A Flink application that demonstrates reading and writing to/from Apache Kafka with Apache Flink☆20Updated last year
- A write-audit-publish implementation on a data lake without the JVM☆46Updated 6 months ago
- Hands-on workshop with Iceberg, Redpanda, Debezium and Kafka-Connect☆14Updated 4 months ago
- A monorepo of many Rill example projects☆34Updated 2 weeks ago
- FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...☆19Updated this week
- This is a basic Apache Pinot example for ingesting real-time MySQL change logs using Debezium☆27Updated 4 years ago
- A serverless DuckDB deployment on GCP☆38Updated 2 years ago
- Examples of user defined functions for Apache Drill☆19Updated 7 years ago
- Sample projects using Ploomber.☆86Updated last year
- A series of Jupyter notebooks that walk you through Machine Learning with Apache Spark ecosystem using Spark MLlib, PyTorch and TensorFlo …☆81Updated last year
- Kedro plugin to support running workflows on Microsoft Azure ML Pipelines☆36Updated 6 months ago
- CLI to create an ER Diagram from DuckDB database files☆82Updated 5 months ago
- A CLI tool to reduce the friction between data scientists by reducing git conflicts removing notebook metadata and gracefully resolving g…☆112Updated last year
- A platform to manage the data product life cycle☆15Updated last week
- Demos of Materialize, the operational data warehouse.☆51Updated 5 months ago