BBVA / data-refinery
Data transformation
☆23 Updated 4 years ago
Alternatives and similar repositories for data-refinery
Users that are interested in data-refinery are comparing it to the libraries listed below
- A tool for anomaly detection over streaming data based on sentiment analysis ☆30 Updated 6 years ago
- Time series based anomaly detector ☆82 Updated 4 years ago
- real-time data + ML pipeline ☆54 Updated last week
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆57 Updated 3 years ago
- ☆35 Updated last month
- 🚨 Simple, self-contained fraud detection system built with Apache Kafka and Python ☆87 Updated 6 years ago
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 Updated 5 years ago
- Utilities to showcase OpenMetadata ☆27 Updated 3 weeks ago
- KnowledgeRepo + JupyterLab ☆48 Updated 6 months ago
- A small Python module containing quick utility functions for standard ETL processes. ☆35 Updated last month
- This is a collection of MLflow examples that you can directly run with mlflow command ☆31 Updated 5 years ago
- Instant search for and access to many datasets in Pyspark. ☆34 Updated 2 years ago
- A series of workshop modules introducing Feast feature store. ☆19 Updated 3 years ago
- MLflow App Library ☆78 Updated 6 years ago
- ☕⛵WIP PySpark dependency management ☆22 Updated 6 years ago
- Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames. ☆103 Updated 5 years ago
- Simple samples for writing ETL transform scripts in Python ☆22 Updated 3 years ago
- curated list of awesome tools and libraries for specific domains ☆47 Updated this week
- Some class materials for a data processing course using PySpark ☆52 Updated 2 years ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆126 Updated 3 years ago
- The demo of using Kafka, Spark, Hive, Cassandra, etc by using Docker. It produces the production ready environment for any kinds of big d… ☆33 Updated 5 years ago
- Elasticsearch implementation of MLflow tracking store ☆18 Updated 4 years ago
- Kedro Plugin to support running pipelines on Kubernetes using Airflow. ☆28 Updated 2 months ago
- Utility Library for Hopsworks. Issues can be posted at https://community.hopsworks.ai ☆27 Updated 11 months ago
- Data Lineage Tracing Library ☆22 Updated 3 years ago
- A Scalable Data Cleaning Library for PySpark. ☆27 Updated 6 years ago
- Code snippets and tools published on the blog at lifearounddata.com ☆12 Updated 5 years ago
- Dockerized setup for testing code on realistic hadoop clusters ☆27 Updated 4 years ago
- ☆110 Updated 5 months ago
- Ansible roles to deploy Kubernetes, JupyterHub, Jupyter Enterprise Gateway and Spark on Kubernetes cluster ☆39 Updated 4 years ago