chen1649chenli / dataOpsResource
Awesome List for Data Operations
☆24 Updated 4 years ago
Alternatives and similar repositories for dataOpsResource:
Users who are interested in dataOpsResource are comparing it to the repositories listed below.
- Full stack data engineering tools and infrastructure set-up ☆51 Updated 4 years ago
- Instant search for and access to many datasets in Pyspark. ☆34 Updated 2 years ago
- Big Data Demystified meetup and blog examples ☆31 Updated 8 months ago
- Awesome list for datapipeline ☆34 Updated 2 years ago
- Data-aware orchestration with dagster, dbt, and airbyte ☆31 Updated 2 years ago
- Awesome list of dataops products, open source and resources ☆24 Updated 2 years ago
- Sample configuration to deploy a modern data platform. ☆88 Updated 3 years ago
- MLflow-tracking server example with Minio and H2O ☆18 Updated 5 years ago
- Docker compose and Google Colab demo to build a CDC with Delta Lake ☆15 Updated 2 years ago
- A tutorial that helps Big Data Engineers ramp up faster by getting familiar with PySpark dataframes and functions. It also covers topics … ☆20 Updated 3 years ago
- Docker template for basic data science packages to interface with Neo4j ☆14 Updated 3 years ago
- Template for data pipelines, ML workflows, API dev and monitoring ☆45 Updated last year
- ☆29 Updated last year
- A Prefect collection for working with GitLab repositories. ☆13 Updated 11 months ago
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆123 Updated 3 years ago
- Delux Airflow deployment with Minikube ☆10 Updated 4 years ago
- Blog post on ETL pipelines with Airflow ☆23 Updated 4 years ago
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊 ☆78 Updated 7 months ago
- ☆12 Updated 3 years ago
- CLI for data platform ☆19 Updated last year
- PyCon Talks 2022 by Antoine Toubhans ☆23 Updated 2 years ago
- Curated list of resources about Apache Airflow ☆19 Updated 4 years ago
- Awesome Orchest projects, both official and submitted by the community. ☆25 Updated last year
- Best practices for engineering ML pipelines. ☆35 Updated 2 years ago
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ☆21 Updated 3 years ago
- ☆21 Updated 3 years ago
- A very simple "hello world" project for deploying Prefect 2 to a docker container on Google Compute Engine. ☆11 Updated 2 years ago
- Pandas helper functions ☆30 Updated 2 years ago
- datascienv is a package that helps you set up your environment in a single line of code with all dependencies, and it also includes pyforest… ☆58 Updated 3 years ago
- Productivity Utilities for Data Science with Python Notebooks ☆6 Updated 5 years ago