hi-primus / bumblebee
A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)
☆141 · Updated last year
Alternatives and similar repositories for bumblebee:
Users who are interested in bumblebee are comparing it to the libraries listed below.
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆123 · Updated 3 years ago
- Data pipelines from re-usable components ☆108 · Updated last year
- Type System for Data Analysis in Python ☆211 · Updated last month
- Notebook storage and publishing workflows for the masses ☆203 · Updated 3 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆113 · Updated 11 months ago
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆123 · Updated 10 months ago
- T4 is now in production as Quilt 3 ☆64 · Updated 5 years ago
- Notebook sharing hub ☆498 · Updated last year
- Function dependencies resolution and execution ☆71 · Updated 4 years ago
- Primrose modeling framework for simple production models ☆33 · Updated last year
- Magniv Core - A Python-decorator based job orchestration platform. Avoid responsibility handoffs by abstracting infra and DevOps. ☆78 · Updated 8 months ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆34 · Updated 4 years ago
- A web frontend for scheduling Jupyter notebook reports ☆252 · Updated 3 months ago
- Beneath is a serverless real-time data platform ☆84 · Updated 3 years ago
- Server that simplifies connecting pandas to a realtime data feed, testing hypothesis and visualizing results in a web browser ☆33 · Updated last year
- Lossless in-memory compression of pandas DataFrames and Series powered by the visions type system. Up to 10x less RAM needed for the same… ☆28 · Updated 2 years ago
- A frictionless integrated platform for notebook ☆85 · Updated 2 years ago
- Build your feature store with macros right within your dbt repository ☆38 · Updated 2 years ago
- ☆35 · Updated last month
- Build and deploy a serverless data pipeline on AWS with no effort. ☆111 · Updated 2 years ago
- Python stream processing for humans ☆185 · Updated last month
- A small Python module containing quick utility functions for standard ETL processes. ☆34 · Updated this week
- A browser user interface for manual labeling of record pairs. ☆45 · Updated last year
- manipulate pandas dataframes from the comfort of your browser ☆171 · Updated 3 years ago
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data. ☆79 · Updated last week
- plait.py - a fake data modeler ☆434 · Updated 6 years ago
- A toolkit providing a uniform interface for connecting to and extracting data from a wide variety of (potentially remote) data stores (in… ☆255 · Updated 9 months ago
- dagster scikit-learn pipeline example. ☆45 · Updated 2 years ago
- Repo demonstrating a Dagster pipeline to generate Neo4j Graph ☆21 · Updated 3 years ago
- Embedded MonetDB with a Python frontend and fast Numpy/Pandas support ☆61 · Updated 5 months ago