cerndb/hdfs-metadata

Tool for gathering block and replica metadata from HDFS. It also builds a heat map showing how replicas are distributed across disks and nodes.

☆55

Alternatives and similar repositories for hdfs-metadata

Users that are interested in hdfs-metadata are comparing it to the libraries listed below.

cerndb / Hadoop-Profiler
Hadoop Profiler, or hprofiler, is a tool which is able to analyze on- and off-CPU workloads on distributed computing environments.
☆24 · Jul 7, 2016 · Updated 10 years ago
seanorama / workshop-hadoop-ops
Workshop for Hadoop Operations Best Practices
☆10 · Feb 24, 2015 · Updated 11 years ago
alexholmes / hadoop-utils
A set of Hadoop utilities to make working with Hadoop a little easier.
☆26 · Feb 11, 2020 · Updated 6 years ago
hortonworks / ambari-rest-client
Groovy client library for Apache Ambari's REST API
☆20 · Jun 25, 2021 · Updated 5 years ago
seanorama / masterclass
Materials for various Hadoop & Nifi related workshops
☆51 · Mar 20, 2019 · Updated 7 years ago
rombert / Maven-Recipe--RPM-Package
Maven Recipe: RPM Package
☆24 · May 20, 2010 · Updated 16 years ago
hortonworks-gallery / iotdemo-service
Ambari service to deploy/manage Hortonworks IoT demo
☆22 · Apr 27, 2017 · Updated 9 years ago
poemp / metadata-gather
Metadata collection: fetches information on all tables in a specified target database.
☆12 · Sep 8, 2022 · Updated 3 years ago
edwardcapriolo / filecrush
Remedy small files by combining them into larger ones.
☆196 · Jul 1, 2022 · Updated 4 years ago
sentric / hannibal
Hannibal is a tool to help monitor and maintain HBase clusters that are configured for manual splitting.
☆172 · Dec 22, 2017 · Updated 8 years ago
harelba / hadoop-job-analyzer
☆29 · Nov 17, 2014 · Updated 11 years ago
avast / hdfs-shell
HDFS Shell is an HDFS manipulation tool to work with functions integrated in Hadoop DFS
☆153 · Sep 11, 2023 · Updated 2 years ago
alteryx / sparkGLM
An R-like GLM package for Apache Spark
☆10 · Aug 6, 2015 · Updated 10 years ago
GiraffaFS / giraffa
Giraffa FileSystem (Slack: giraffa-fs.slack.com)
☆18 · Mar 8, 2017 · Updated 9 years ago
linyiqun / open-source-patch
This project keeps the patches that were submitted to open-source communities.
☆16 · Oct 22, 2017 · Updated 8 years ago
hazelcast / hazelcast-spark
Spark Connector for Hazelcast
☆22 · Jun 9, 2021 · Updated 5 years ago
jpplayer / amstore-view
Ambari View for the Ambari Store
☆15 · Sep 21, 2015 · Updated 10 years ago
onefoursix / kill-long-running-impala-queries
☆16 · Nov 8, 2015 · Updated 10 years ago
twitter-archive / hdfs-du
Visualize your HDFS cluster usage
☆228 · Oct 13, 2020 · Updated 5 years ago
jshmain / cloudera-search
☆18 · Mar 14, 2016 · Updated 10 years ago
phatak-dev / introduction_to_ml_with_spark
Code and setup information for Introduction to Machine Learning with Spark
☆12 · Sep 4, 2015 · Updated 10 years ago
eBay / oink
REST based interface for PIG execution
☆25 · Dec 13, 2021 · Updated 4 years ago
seanorama / ambari-bootstrap
Collection of tools for bootstrapping Apache Ambari & deploying clusters
☆83 · Apr 17, 2019 · Updated 7 years ago
sforteln / HdfsBlockFinder
Shows which datanodes contain a file in HDFS
☆17 · Mar 16, 2013 · Updated 13 years ago
lensesio / lenses-topology-example
An example of streaming microservices with Apache Kafka and Data Flow Topology integration with Lenses Ⓡ DataOps Platform. You can see it…
☆15 · Nov 29, 2023 · Updated 2 years ago
cerndb / SparkDLTrigger
Code and links to the data for the article "Machine Learning Pipelines with Modern Big Data Tools for High Energy Physics"
☆31 · Jun 11, 2024 · Updated 2 years ago
wilkenstein / redis-mock-java
An in-memory implementation of redis in Java
☆33 · Sep 22, 2015 · Updated 10 years ago
zrlio / albis
Albis: High-Performance File Format for Big Data Systems
☆21 · Jul 12, 2018 · Updated 8 years ago
tresata / spark-columnar
☆15 · Mar 4, 2015 · Updated 11 years ago
devonfw-forge / keywi
master-data-management system
☆12 · Jan 7, 2023 · Updated 3 years ago
ExpediaGroup / shunting-yard
Shunting Yard is a real-time data replication tool that copies data between Hive Metastores.
☆20 · Oct 11, 2021 · Updated 4 years ago
CERT-W / hadoop-attack-library
A collection of pentest tools and resources targeting Hadoop environments
☆35 · Mar 2, 2017 · Updated 9 years ago
sahilbhange / hive-sql-slowly-changing-dimension
Slowly Changing Dimension type 2 using Hive query language using exclusive join technique with ORC Hive tables, partitioned and clustered…
☆16 · May 11, 2019 · Updated 7 years ago
yaravind / kafka-connect-jenkins
Kafka Connect Connector for Jenkins Open Source Continuous Integration Tool
☆31 · Nov 15, 2022 · Updated 3 years ago
ParallelAI / SpyGlass
Cascading and Scalding wrapper for HBase with advanced read features
☆54 · Updated this week
sunileman / MapReduce-Performance_Testing
MapReduce performance testing using teragen and terasort
☆19 · Aug 26, 2021 · Updated 4 years ago
GerritCodeReview / gerrit-installer
Gerrit native installation packages for Windows, Linux and Mac OSX - (mirror of https://gerrit.googlesource.com/gerrit-installer)
☆14 · Jul 15, 2026 · Updated last week
StumbleUponArchive / asynchbase
A fully asynchronous, non-blocking, thread-safe, high-performance HBase client.
☆76 · Jun 7, 2013 · Updated 13 years ago
asdaraujo / filecrush
Remedy small files by combining them into larger ones.
☆23 · Oct 31, 2018 · Updated 7 years ago