datarootsio / databooks
A CLI tool to reduce the friction between data scientists by reducing git conflicts, removing notebook metadata, and gracefully resolving git conflicts.
☆112 · Updated last year
Alternatives and similar repositories for databooks
Users that are interested in databooks are comparing it to the libraries listed below
- A toolbox 🧰 for Jupyter notebooks: testing, experiment tracking, debugging, profiling, and more! ☆67 · Updated 9 months ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated last year
- Typed wrappers over pandas DataFrames with schema validation ☆101 · Updated last year
- Make your Kedro experience snazzy ☆35 · Updated 3 years ago
- Write Python locally, execute SQL in your data warehouse ☆270 · Updated 3 years ago
- Black for Databricks notebooks ☆45 · Updated last month
- fsspec-compatible Azure Datalake and Azure Blob Storage access ☆195 · Updated last week
- Automated Jupyter notebook testing. ☆40 · Updated last year
- Terraform plugin for machine learning workloads: spot instance recovery & auto-termination | AWS, GCP, Azure, Kubernetes ☆294 · Updated 7 months ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆110 · Updated 6 months ago
- A Kedro plugin to use pandera in your Kedro projects ☆35 · Updated 8 months ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆80 · Updated last year
- UnionML: the easiest way to build and deploy machine learning microservices ☆335 · Updated last year
- 🪴 Nebari - your open source data science platform ☆299 · Updated this week
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆78 · Updated last year
- 🧪 Unit test your Jupyter Notebooks the right way ☆428 · Updated 10 months ago
- Kedro plugin to support running workflows on Microsoft Azure ML Pipelines ☆37 · Updated 2 weeks ago
- First-party plugins maintained by the Kedro team. ☆104 · Updated this week
- Cloud provider cluster managers for Dask. Supports AWS, Google Cloud, Azure, and more... ☆143 · Updated this week
- PyScaffold extension for data-science projects ☆160 · Updated 2 weeks ago
- Convert monolithic Jupyter notebooks into maintainable Ploomber pipelines. ☆79 · Updated 9 months ago
- A data modelling layer built on top of polars and pydantic ☆196 · Updated last year
- Templates for your Kedro projects. ☆76 · Updated this week
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆195 · Updated this week
- Monitor the stability of a Pandas or Spark dataframe ☆503 · Updated 5 months ago
- Move fast from data science prototype to pipeline. Capture, analyze, and transform messy notebooks into data pipelines with just two line… ☆667 · Updated 4 months ago
- Kedro Plugin to support running workflows on Kubeflow Pipelines ☆54 · Updated last week
- The DBT of ML: Aligned describes data dependencies in ML systems and reduces technical data debt ☆60 · Updated 3 weeks ago
- Sample projects using Ploomber. ☆86 · Updated last year
- Pandas helper functions ☆31 · Updated 2 years ago