datarootsio / databooks
A CLI tool that reduces friction between data scientists by removing notebook metadata and gracefully resolving git conflicts.
☆113Updated 2 years ago
Alternatives and similar repositories for databooks
Users that are interested in databooks are comparing it to the libraries listed below
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou…☆114Updated 2 months ago
- Write python locally, execute SQL in your data warehouse☆269Updated 3 years ago
- Typed wrappers over pandas DataFrames with schema validation☆102Updated 2 years ago
- UnionML: the easiest way to build and deploy machine learning microservices☆336Updated 2 years ago
- IbisML is a library for building scalable ML pipelines using Ibis.☆120Updated 6 months ago
- Pandas helper functions☆31Updated 2 years ago
- fsspec-compatible Azure Blob and Data Lake Storage (Gen2) access☆204Updated this week
- Black for Databricks notebooks☆48Updated 7 months ago
- Dask integration for Snowflake☆30Updated 5 months ago
- ☁️ Terraform plugin for machine learning workloads: spot instance recovery & auto-termination | AWS, GCP, Azure, Kubernetes☆294Updated last year
- Fast Data Science, AKA fds, is a CLI for Data Scientists to version control data and code at once, by conveniently wrapping git and dvc☆393Updated last year
- A kedro plugin to use pandera in your kedro projects☆36Updated 4 months ago
- Cloud provider cluster managers for Dask. Supports AWS, Google Cloud, Azure and more...☆145Updated 3 months ago
- A toolbox 🧰 for Jupyter notebooks 📙: testing, experiment tracking, debugging, profiling, and more!☆68Updated last year
- 🪴 Nebari - your open source data science platform☆319Updated last week
- Monitor the stability of a Pandas or Spark dataframe ⚙︎☆510Updated 2 weeks ago
- Make your Kedro experience snazzy☆35Updated 3 years ago
- 🧪 📗 Unit test your Jupyter Notebooks the right way☆430Updated last year
- A GitHub Action that makes it easy to use Great Expectations to validate your data pipelines in your CI workflows.☆82Updated last year
- ML pipeline orchestration and model deployments on Kubernetes.☆434Updated 2 years ago
- Machine learning experiment tracking and data versioning with DVC extension for VS Code☆215Updated last week
- Automated Jupyter notebook testing. 📙☆41Updated 2 years ago
- First-party plugins maintained by the Kedro team.☆112Updated last week
- Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊☆79Updated last year
- 🏷️ Git Tag Ops. Turn your Git repository into Artifact Registry or Model Registry.☆158Updated last month
- Fake Pandas / PySpark DataFrame creator☆48Updated last year
- A kedro-plugin for integration of mlflow capabilities inside kedro projects (especially machine learning model versioning and packaging)☆230Updated last week
- ☁️ Export Ploomber pipelines to Kubernetes (Argo), Airflow, AWS Batch, SLURM, and Kubeflow.☆45Updated 10 months ago
- 🏬 modelstore is a Python library that allows you to version, export, and save a machine learning model to your filesystem or a cloud sto…☆398Updated last year
- Assessing whether data from a database complies with reference information.☆44Updated last week