Intel-bigdata/SSM

Smart Storage Management for Big Data, a comprehensive hot/cold data optimized solution

☆139

Alternatives and similar repositories for SSM

Users who are interested in SSM are comparing it to the repositories listed below.

linkedin / dynamometer
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
☆135 · Jan 11, 2024 · Updated 2 years ago
linyiqun / open-source-patch
This project keeps the patches that were submitted to the open-source community.
☆16 · Oct 22, 2017 · Updated 8 years ago
apache / gluten
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
☆1,576 · Updated this week
apache / celeborn
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
☆1,056 · Updated this week
Tencent / Firestorm
Firestorm is a Remote Shuffle Service, and provides the capability for Apache Spark and Apache Hadoop MapReduce applications to store shu…
☆256 · Apr 7, 2023 · Updated 3 years ago
ExpediaGroup / waggle-dance
Hive federation service. Enables disparate tables to be concurrently accessed across multiple Hive deployments.
☆288 · Jun 25, 2026 · Updated 3 weeks ago
linkedin / dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark.
☆1,370 · Aug 22, 2023 · Updated 2 years ago
yaooqinn / spark-ranger
Already merged into apache/incubator-kyuubi. ACL Management for Apache Spark SQL with Apache Ranger.
☆59 · Nov 11, 2021 · Updated 4 years ago
apache / kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
☆2,352 · Updated this week
yaooqinn / spark-authorizer
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark | This repo is contributed to Apache Kyuubi | The project has been migrated to Apa…
☆183 · Apr 6, 2022 · Updated 4 years ago
CoxAutomotiveDataSolutions / spark-distcp
A re-implementation of Hadoop DistCP in Apache Spark
☆47 · Dec 20, 2023 · Updated 2 years ago
apache / ozone
Scalable, reliable, distributed storage system optimized for data analytics and object store workloads.
☆1,235 · Updated this week
paypal / NNAnalytics
NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.
☆121 · Nov 25, 2025 · Updated 7 months ago
hortonworks / hive-testbench
☆392 · Jan 25, 2024 · Updated 2 years ago
opendataio / hcfsfuse
A hadoop compatible FUSE use for all.
☆29 · Jun 18, 2026 · Updated last month
cubefs / compass
Compass is a task diagnosis platform for bigdata
☆405 · Nov 23, 2024 · Updated last year
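tencentyun / hadoop-cos
hadoop-cos (the CosN file system) provides integration support for big data computing frameworks such as Apache Hadoop, Spark, and Tez, so that data stored on Tencent Cloud COS can be read and written the same way as data on HDFS. It can also serve as Deep Storage for query and analytics engines such as Druid.
☆95 · May 6, 2026 · Updated 2 months ago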
sriramsrao / kfs
Kosmos Distributed Filesystem
☆29 · May 3, 2012 · Updated 14 years ago
asonje / PAT
Performance Analysis Tool
☆78 · Nov 25, 2025 · Updated 7 months ago
alibaba-archive / aliyun-oss-hadoop-fs
Hadoop filesystem implementation for Aliyun OSS
☆13 · Feb 14, 2016 · Updated 10 years ago
yaooqinn / spark-history-cli
CLI tool for querying Apache Spark History Server REST API
☆28 · Mar 22, 2026 · Updated 3 months ago
bytedance / nnproxy
Scalable NameNode RPC Proxy for HDFS Federation
☆89 · Apr 19, 2016 · Updated 10 years ago
hopshadoop / hops
Hops Hadoop is a distribution of Apache Hadoop with distributed metadata.
☆324 · Jan 22, 2026 · Updated 5 months ago
mtyiu / memec
MemEC: An Erasure-Coding-Based Distributed In-Memory Key-Value Store
☆11 · Mar 30, 2017 · Updated 9 years ago
MemVerge / splash
Splash, a flexible Spark shuffle manager that supports user-defined storage backends for shuffle data storage and exchange
☆131 · Dec 19, 2024 · Updated last year
Intel-bigdata / HiBench
HiBench is a big data benchmark suite.
☆1,485 · Dec 15, 2025 · Updated 7 months ago
zrlio / crail
[Archived] A Fast Multi-tiered Distributed Storage System based on User-Level I/O
☆75 · Mar 2, 2018 · Updated 8 years ago
Mellanox / SparkRDMA
This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvid…
☆257 · May 13, 2019 · Updated 7 years ago
Aaaaaaron / OLAP-dig-and-dig
Dig Spark's source code.
☆17 · Feb 1, 2024 · Updated 2 years ago
apache / uniffle
Uniffle is a high performance, general purpose Remote Shuffle Service.
☆451 · Updated this week
linkedin / transport
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Ap…
☆306 · Jun 29, 2026 · Updated 3 weeks ago
ceph / cephfs-hadoop
cephfs-hadoop
☆57 · Dec 10, 2020 · Updated 5 years ago
apache / carbondata
High performance data store solution
☆1,448 · Jul 4, 2026 · Updated 2 weeks ago
Alluxio / alluxio
Alluxio, data orchestration for analytics and machine learning in the cloud
☆7,212 · Apr 29, 2025 · Updated last year
apache / amoro
Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
☆1,148 · Updated this week
uber / RemoteShuffleService
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
☆335 · Sep 29, 2023 · Updated 2 years ago
cerndb / hdfs-metadata
Tool for gathering blocks and replicas meta data from HDFS. It also builds a heat map showing how replicas are distributed along disks an…
☆55 · May 9, 2017 · Updated 9 years ago
EdurtIO / gcm
Google Guice component management System!
☆10 · Sep 24, 2021 · Updated 4 years ago
senlinzhan / envoy-examples
Examples of Envoy proxy
☆15 · Dec 27, 2017 · Updated 8 years ago