hi-primus / bumblebee
A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)
☆141 · Updated 2 years ago
Alternatives and similar repositories for bumblebee
Users who are interested in bumblebee are comparing it to the libraries listed below.
- Viewflow is an Airflow-based framework that allows data scientists to create data models without writing Airflow code. ☆127 · Updated 4 years ago
- Data pipelines from re-usable components ☆107 · Updated 2 months ago
- Notebook storage and publishing workflows for the masses ☆201 · Updated 4 years ago
- Data processing and modelling framework for automating tasks (incl. Python & SQL transformations). ☆120 · Updated 4 months ago
- DataFlows is a simple, intuitive lightweight framework for building data processing flows in python. ☆222 · Updated 9 months ago
- Beneath is a serverless real-time data platform ⚡️ ☆84 · Updated 3 years ago
- Convert monolithic Jupyter notebooks into maintainable Ploomber pipelines. ☆80 · Updated last year
- plait.py - a fake data modeler ☆436 · Updated 7 years ago
- Tutorials for Fugue - A unified interface for distributed computing. Fugue executes SQL, Python, and Pandas code on Spark and Dask withou… ☆114 · Updated 3 months ago
- Curated list of awesome software and resources for Senzing, The First Real-Time AI for Entity Resolution. ☆66 · Updated last week
- Type System for Data Analysis in Python ☆215 · Updated last year
- manipulate pandas dataframes from the comfort of your browser ☆174 · Updated 4 years ago
- Magniv Core - A Python-decorator based job orchestration platform. Avoid responsibility handoffs by abstracting infra and DevOps. ☆81 · Updated last year
- A small Python module containing quick utility functions for standard ETL processes. ☆37 · Updated this week
- Python ELT Studio, an application for building ELT (and ETL) data flows. ☆58 · Updated 4 years ago
- A JupyterLab extension providing SQL formatter, auto-completion, syntax highlighting, Spark SQL and Trino ☆93 · Updated this week
- A web frontend for scheduling Jupyter notebook reports ☆255 · Updated last year
- A bit of extra usability for sqlalchemy v2. ☆78 · Updated last year
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data. ☆81 · Updated this week
- MLOps simplified. One-stop AI delivery platform, all the features you need. ☆106 · Updated this week
- Codebase for DIVE backend (server, worker, and ORM) ☆158 · Updated 3 years ago
- The Data Explorer is nteract's automatic visualization tool. ☆107 · Updated 3 years ago
- Helper code to interact with Rasgo via our SDK, PyRasgo ☆40 · Updated 3 years ago
- ☆27 · Updated 2 weeks ago
- A browser user interface for manual labeling of record pairs. ☆48 · Updated 2 years ago
- Tool to automate data quality checks on data pipelines ☆256 · Updated 3 years ago
- Woodwork is a Python library that provides robust methods for managing and communicating data typing information. ☆155 · Updated 4 months ago
- KNOTS is an intuitive desktop application built to simplify the configuration of Singer pipelines ☆67 · Updated 3 years ago
- Server that simplifies connecting pandas to a realtime data feed, testing hypotheses and visualizing results in a web browser ☆33 · Updated 2 years ago
- dagster scikit-learn pipeline example. ☆46 · Updated 2 years ago