weslleylc / Feature-Store
A containerized approach using Apache Kafka, Spark, Cassandra, Hive, Jupyter, and Docker-compose.
☆14Updated 3 years ago
Alternatives and similar repositories for Feature-Store:
Users who are interested in Feature-Store are comparing it to the libraries listed below:
- Trino dbt demo project that mixes BigQuery data with data in a local PostgreSQL database and loads it there☆71Updated 3 years ago
- Creates simple data models on Snowflake to report dbt source freshness and tests☆23Updated last year
- Repo that relates to the Medium blog 'Keeping your ML model in shape with Kafka, Airflow and MLFlow'☆119Updated last year
- Example for article Running Spark 3 with standalone Hive Metastore 3.0☆97Updated 2 years ago
- A workspace to experiment with Apache Spark, Livy, and Airflow in a Docker environment.☆39Updated 3 years ago
- A repository of sample code to show data quality checking best practices using Airflow.☆74Updated last year
- One click deploy docker-compose with Kafka, Spark Streaming, Zeppelin UI and Monitoring (Grafana + Kafka Manager)☆119Updated 3 years ago
- JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook☆92Updated 2 years ago
- PyConDE & PyData Berlin 2019 Airflow Workshop: Airflow for machine learning pipelines.☆46Updated last year
- (project & tutorial) dag pipeline tests + ci/cd setup☆86Updated 3 years ago
- To provide a deeper understanding of how the modern, open-source data stack consisting of Iceberg, dbt, Trino, and Hive operates within a…☆30Updated 10 months ago
- Orchestrate Spark Jobs from Kubeflow Pipelines and poll for the status.☆52Updated 2 years ago
- Great Expectations Airflow operator☆160Updated 3 months ago
- Learn how to add data validation and documentation to a data pipeline built with dbt and Airflow.☆166Updated last year
- The Internals of Spark on Kubernetes☆70Updated 2 years ago
- Spark on Kubernetes infrastructure Helm charts repo☆200Updated 2 years ago
- End to End example integrating MLFlow and Seldon Core☆51Updated 4 years ago
- Asynchronous actions for PySpark☆47Updated 3 years ago
- Makes it simple to store test results and visualise them in a BI dashboard☆40Updated last month
- Playground for Lakehouse (Iceberg, Hudi, Spark, Flink, Trino, DBT, Airflow, Kafka, Debezium CDC)☆49Updated last year
- A real-time streaming ETL pipeline for streaming and performing sentiment analysis on Twitter data using Apache Kafka, Apache Spark and D…☆30Updated 4 years ago
- Python API for Deequ☆41Updated 4 years ago
- A Python package to submit and manage Apache Spark applications on Kubernetes.☆41Updated last week
- Feast AWS guide using Redshift / Spectrum / DynamoDB to build a credit scoring model☆61Updated 3 years ago
- Delta Lake helper methods. No Spark dependency.☆22Updated 4 months ago
- Pylint plugin for static code analysis on Airflow code☆91Updated 4 years ago
- Airflow training for the crunch conf☆104Updated 6 years ago
- Design and implementation of a machine learning platform☆12Updated last year