datarootsio / databooks
A CLI tool to reduce friction between data scientists by removing notebook metadata and gracefully resolving git conflicts.
☆111 · Updated last year
Alternatives and similar repositories for databooks
Users who are interested in databooks are comparing it to the libraries listed below.
- Write python locally, execute SQL in your data warehouse ☆270 · Updated 3 years ago
- A kedro plugin to use pandera in your kedro projects ☆36 · Updated 10 months ago
- Make your Kedro experience snazzy ☆35 · Updated 3 years ago
- UnionML: the easiest way to build and deploy machine learning microservices ☆335 · Updated last year
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated last year
- A toolbox 🧰 for Jupyter notebooks 📙: testing, experiment tracking, debugging, profiling, and more! ☆67 · Updated 11 months ago
- IbisML is a library for building scalable ML pipelines using Ibis. ☆115 · Updated last month
- 🪴 Nebari - your open source data science platform ☆307 · Updated 2 weeks ago
- Black for Databricks notebooks ☆47 · Updated 2 months ago
- fsspec-compatible Azure Datalake and Azure Blob Storage access ☆197 · Updated this week
- A data modelling layer built on top of polars and pydantic ☆198 · Updated 2 years ago
- Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc ☆389 · Updated last year
- Write your dbt models using Ibis ☆70 · Updated 5 months ago
- Typed wrappers over pandas DataFrames with schema validation ☆102 · Updated last year
- 🧪 📗 Unit test your Jupyter Notebooks the right way ☆428 · Updated last year
- Automated Jupyter notebook testing. 📙 ☆41 · Updated last year
- First-party plugins maintained by the Kedro team. ☆104 · Updated this week
- ☁️ Terraform plugin for machine learning workloads: spot instance recovery & auto-termination | AWS, GCP, Azure, Kubernetes ☆295 · Updated 8 months ago
- 💫 PyScaffold extension for data-science projects ☆159 · Updated 2 weeks ago
- Plugins, extensions, case studies, articles, and video tutorials for Kedro ☆85 · Updated 8 months ago
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows. ☆81 · Updated last year
- Cloud provider cluster managers for Dask. Supports AWS, Google Cloud, Azure, and more... ☆144 · Updated 3 weeks ago
- Possibly the fastest DataFrame-agnostic quality check library in town. ☆202 · Updated this week
- Automatically export Jupyter notebooks to various file formats (.py, .html, and more) on save. ☆82 · Updated last year
- Fake Pandas / PySpark DataFrame creator ☆48 · Updated last year
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊 ☆79 · Updated 11 months ago
- A kedro-plugin for integration of mlflow capabilities inside kedro projects (especially machine learning model versioning and packaging) ☆223 · Updated 2 weeks ago
- Assessing whether data from a database complies with reference information. ☆43 · Updated this week
- Dask integration for Snowflake ☆30 · Updated 3 weeks ago
- Templates for your Kedro projects. ☆77 · Updated last month