open-datastudio / datastudio
Data science, machine learning tools on the cloud
☆15 · Updated 4 years ago
Alternatives and similar repositories for datastudio:
Users who are interested in datastudio are comparing it to the libraries listed below.
- Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple… ☆26 · Updated 3 years ago
- ☆39 · Updated 6 years ago
- Instant access to the Spark cluster from anywhere ☆16 · Updated 4 years ago
- A plugin for Airflow that creates and manages your DAGs with a web UI. ☆20 · Updated 7 years ago
- Egeria's Guidance on Governance as well as large media files such as presentations and movies ☆104 · Updated 2 years ago
- A Spark datasource for the HadoopOffice library ☆38 · Updated 2 years ago
- ☆12 · Updated 2 years ago
- A cloud native data mesh implementation ☆12 · Updated 4 years ago
- DataQuality for BigData ☆144 · Updated last year
- ☆49 · Updated 5 years ago
- Jupyter extensions for SWAN ☆58 · Updated last week
- Apache DataLab (incubating) ☆153 · Updated last year
- This repository is to help with the Partner Demonstration of the Apache Atlas project. ☆30 · Updated 9 years ago
- Star Schema Benchmark using the Hive / Druid Integration ☆30 · Updated 7 years ago
- Documentation and resources for deploying JupyterHub on Hadoop ☆18 · Updated 5 years ago
- Smart Automation Tool for building modern Data Lakes and Data Pipelines ☆121 · Updated this week
- Rocksdb state storage implementation for Structured Streaming. ☆17 · Updated 4 years ago
- Magic to help Spark pipelines upgrade ☆34 · Updated 6 months ago
- ☆19 · Updated last year
- A simple Spark-powered ETL framework that just works 🍺 ☆181 · Updated 3 weeks ago
- Utility for benchmarking changes in Spark using TPC-DS workloads ☆16 · Updated 3 years ago
- ☆106 · Updated 2 years ago
- A tool to install, configure and manage Trino installations ☆27 · Updated 3 years ago
- Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data. ☆79 · Updated last week
- Example for experimenting with how JupyterHub can be configured to work with Kerberos ☆33 · Updated 7 years ago
- Demo application for GRADOOP operators ☆23 · Updated 5 years ago
- Dremio Flight connector. Access Dremio using Arrow flight ☆40 · Updated 4 years ago
- Example for simple Apache Arrow Flight service with Apache Spark and TensorFlow clients ☆36 · Updated 4 years ago
- Soda SQL and Soda Spark have been deprecated and replaced by Soda Core. docs.soda.io/soda-core/overview.html ☆61 · Updated 2 years ago
- The Internals of Spark on Kubernetes ☆71 · Updated 2 years ago