apache/submarine

Submarine is a Cloud Native Machine Learning Platform.

☆706

Alternatives and similar repositories for submarine

Users that are interested in submarine are comparing it to the libraries listed below.

apache / kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆2,352 · Updated this week
yaooqinn / spark-ranger
Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.
☆59 · Nov 11, 2021 · Updated 4 years ago
apache / yunikorn-core
Apache YuniKorn Core
☆1,018 · Updated this week
yaooqinn / spark-authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo is contributed to Apache Kyuubi | The project has been migrated to Apa…
☆183 · Apr 6, 2022 · Updated 4 years ago
apache / ozone
Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.
☆1,235 · Updated this week
tony-framework / TonY
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
☆708 · Oct 14, 2023 · Updated 2 years ago
apache / ranger
Apache Ranger - To enable, monitor and manage comprehensive data security across the Hadoop platform and beyond
☆1,065 · Updated this week
hortonworks-spark / spark-atlas-connector
A Spark Atlas connector to track data lineage in Apache Atlas
☆268 · Nov 16, 2022 · Updated 3 years ago
byzer-org / byzer-lang
Byzer (formerly MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.
☆1,835 · May 29, 2024 · Updated 2 years ago
insightlake / Ranger-Metastore-Plugin
Ranger Hive Metastore Plugin
☆18 · Jul 21, 2023 · Updated 2 years ago
netease-bigdata / ne-spark-courseware
NetEase Spark Courses
☆15 · Sep 4, 2018 · Updated 7 years ago
apache / hudi
Upserts, Deletes And Incremental Processing on Big Data.
☆6,193 · Updated this week
apache / linkis
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications…
☆3,406 · Updated this week
NetEase / spark-alarm
Alerting and monitoring tool for Apache Spark
☆23 · May 20, 2022 · Updated 4 years ago
apache / iceberg
Apache Iceberg
☆9,062 · Updated this week
maropu / spark-sql-server
Yet Another Spark SQL JDBC/ODBC server based on the PostgreSQL V3 protocol
☆34 · Sep 8, 2022 · Updated 3 years ago
apache / uniffle
Uniffle is a high performance, general purpose Remote Shuffle Service.
☆451 · Updated this week
apache / celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
☆1,056 · Updated this week
apache / atlas
Apache Atlas - Open Metadata Management and Governance capabilities across the Hadoop platform and beyond
☆2,125 · Updated this week
ververica / flink-sql-gateway
☆489 · Oct 21, 2022 · Updated 3 years ago
apache / carbondata
High performance data store solution
☆1,448 · Jul 4, 2026 · Updated 2 weeks ago
NetEase / spark-ranger
ACL Management for Apache Spark SQL with Apache Ranger
☆17 · Jun 18, 2020 · Updated 6 years ago
apache / ratis
Open source Java implementation for Raft consensus protocol.
☆1,464 · Updated this week
linkedin / dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
☆1,370 · Aug 22, 2023 · Updated 2 years ago
linkedin / dynamometer
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
☆135 · Jan 11, 2024 · Updated 2 years ago
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated 3 weeks ago
yaooqinn / multi-tenancy-spark
A Fully HiveServer2-like Multi-tenancy Spark Thrift Server Supporting Impersonation and Multi-SparkContext with Ranger Authorization (GO …
☆10 · Jul 7, 2022 · Updated 4 years ago
delta-io / delta
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Tr…
☆8,917 · Updated this week
apache / zeppelin
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
☆6,644 · Updated this week
apache / gravitino
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
☆3,087 · Updated this week
WeBankFinTech / Scriptis
Scriptis is for interactive data analysis with script development (SQL, Pyspark, HiveQL), task submission (Spark, Hive), UDF, function, res…
☆813 · Dec 11, 2024 · Updated last year
apache / amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
☆1,148 · Updated this week
WeBankFinTech / DataSphereStudio
DataSphereStudio is a one stop data application development & management portal, covering scenarios including data exchange, desensitizati…
☆3,265 · Nov 4, 2025 · Updated 8 months ago
apache / streampark
Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
☆4,322 · Updated this week
WeBankFinTech / Prophecis
Prophecis is a one-stop cloud native machine learning platform.
☆513 · Mar 28, 2025 · Updated last year
apache / dolphinscheduler
Apache DolphinScheduler is the modern data orchestration platform. Agile to create high performance workflow with low-code
☆14,381 · Updated this week
apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week
apache / seatunnel
SeaTunnel is a multimodal, high-performance, distributed, massive data integration tool.
☆9,491 · Updated this week
Tencent / Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shu…
☆256 · Apr 7, 2023 · Updated 3 years ago