hi-primus / bumblebee
A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)
☆141 · Updated last year
Alternatives and similar repositories for bumblebee:
Users who are interested in bumblebee are comparing it to the libraries listed below.
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆123 · Updated 3 years ago
- Data pipelines from re-usable components ☆108 · Updated 2 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated last year
- Notebook storage and publishing workflows for the masses ☆204 · Updated 3 years ago
- A web frontend for scheduling Jupyter notebook reports ☆252 · Updated 4 months ago
- Type System for Data Analysis in Python ☆211 · Updated 2 months ago
- Build your feature store with macros right within your dbt repository ☆38 · Updated 2 years ago
- Notebook sharing hub ☆499 · Updated last year
- Python library for automated dataset normalization ☆114 · Updated last year
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data. ☆79 · Updated last week
- A bit of extra usability for SQLAlchemy v2. ☆77 · Updated 10 months ago
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆122 · Updated 10 months ago
- A library for recording and reading data in notebooks. ☆288 · Updated 2 years ago
- Beneath is a serverless real-time data platform ⚡️ ☆84 · Updated 3 years ago
- Primrose modeling framework for simple production models ☆32 · Updated last year
- Server that simplifies connecting pandas to a realtime data feed, testing hypotheses and visualizing results in a web browser ☆33 · Updated last year
- Woodwork is a Python library that provides robust methods for managing and communicating data typing information. ☆151 · Updated last month
- Deployment tool for online machine learning models ☆97 · Updated 2 years ago
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ☆28 · Updated 2 years ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆34 · Updated 4 years ago
- Dagster scikit-learn pipeline example. ☆44 · Updated 2 years ago
- MLOps simplified. One-stop AI delivery platform, all the features you need. ☆98 · Updated this week
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ☆21 · Updated 3 years ago
- Playground for using large language models in the Modern Data Stack for entity matching ☆107 · Updated 2 years ago
- Tool to automate data quality checks on data pipelines ☆254 · Updated 2 years ago
- Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications,… ☆285 · Updated last month
- A JupyterLab extension providing a SQL formatter, auto-completion, syntax highlighting, Spark SQL and Trino ☆87 · Updated last week
- Fast iterative local development and testing of Apache Airflow workflows ☆198 · Updated 3 months ago
- Convert monolithic Jupyter notebooks into maintainable Ploomber pipelines. ☆78 · Updated 6 months ago
- The goal of pandas-log is to provide feedback about basic pandas operations. It provides simple wrapper functions for the most common fun… ☆215 · Updated 3 years ago