hi-primus / bumblebee
A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)
☆140 · Updated last year
Alternatives and similar repositories for bumblebee:
Users that are interested in bumblebee are comparing it to the libraries listed below
- Data pipelines from re-usable components ☆108 · Updated last year
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆123 · Updated 3 years ago
- Type System for Data Analysis in Python ☆210 · Updated 2 weeks ago
- Notebook storage and publishing workflows for the masses ☆202 · Updated 3 years ago
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data. ☆79 · Updated this week
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆112 · Updated 10 months ago
- Primrose modeling framework for simple production models ☆33 · Updated 11 months ago
- Beneath is a serverless real-time data platform ⚡️ ☆84 · Updated 3 years ago
- A frictionless integrated platform for notebook ☆85 · Updated 2 years ago
- A web frontend for scheduling Jupyter notebook reports ☆252 · Updated 2 months ago
- ☆27 · Updated 2 weeks ago
- Convert a CSV to a parquet file. ☆64 · Updated 2 years ago
- Write python locally, execute SQL in your data warehouse ☆270 · Updated 2 years ago
- Jupyter kernel for SQL databases ☆169 · Updated 2 months ago
- Build and deploy a serverless data pipeline on AWS with no effort. ☆111 · Updated 2 years ago
- Automated Exploratory Data Analysis. Simplifying Data Exploration ☆34 · Updated 4 years ago
- A small Python module containing quick utility functions for standard ETL processes. ☆34 · Updated this week
- Fast iterative local development and testing of Apache Airflow workflows ☆196 · Updated 2 months ago
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆122 · Updated 8 months ago
- A JupyterLab extension providing SQL formatter, auto-completion, syntax highlighting, Spark SQL and Trino ☆86 · Updated 2 weeks ago
- Open Source Self-service Business Intelligence with Version Control ☆322 · Updated 2 years ago
- KNOTS is an intuitive desktop application built to simplify the configuration of Singer pipelines ☆67 · Updated 2 years ago
- Woodwork is a Python library that provides robust methods for managing and communicating data typing information. ☆151 · Updated last month
- python library for automated dataset normalization ☆113 · Updated last year
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆57 · Updated 3 years ago
- manipulate pandas dataframes from the comfort of your browser ☆171 · Updated 3 years ago
- The Data Explorer is nteract's automatic visualization tool. ☆105 · Updated 2 years ago
- real-time data + ML pipeline ☆54 · Updated 3 weeks ago
- locopy: Loading/Unloading to Redshift and Snowflake using Python. ☆106 · Updated this week
- Server that simplifies connecting pandas to a realtime data feed, testing hypothesis and visualizing results in a web browser ☆33 · Updated last year