Sunil S. Ranka's Weblog

Superior Data Analytics is the antidote to Business Failure

How to read HDFS fsImage file

Posted by sranka on April 2, 2015

During one of the sizing exercise the  ask for server capacity  was more than the actual usage of cluster . Knowing the data and usage, I was not convinced that we should be asking for more memory space. That triggered the thought of

Conceptually FSIMG file is the balancesheet of all the file and their existence and location. If somehow we could read the metadata withing the file and make sence out of it, than it could help us as follow :

  • how to keep the cluster clean.
  • how to manage the space on server by means of knowing file duplication, last access time
  • To Know which are longest running jobs

To more about the files and attributes :

STEP 1: Download the latest fsimage copy.

$ hdfs dfsadmin -fetchImage /tmp

$ ls -ltr /tmp | grep -i fsimage
-rw-r–r– 1 root root 22164 Aug 15 17:27 fsimage_0000000000000004389

$ hdfs oiv -i /tmp/fsimage_0000000000000001386 -o /tmp/fsimage.txt

This would launche a HTTP server which exposes read-only WebHDFS API by default at port “5978”.

For more detail on oiv, you can visit :


Hope This Helps

Sunil S Ranka

“Superior BI is the antidote to Business Failure”


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: