Sunil S. Ranka's Weblog

Superior Data Analytics is the antidote to Business Failure


How to read HDFS fsImage file

Posted by sranka on April 2, 2015

During a sizing exercise, the ask for server capacity was more than the actual usage of the cluster. Knowing the data and its usage, I was not convinced that we should be asking for more memory space. That triggered the thought of

Conceptually, the fsimage file is the balance sheet of all the files, their existence and their location. If we could read the metadata within the file and make sense of it, then it could help us as follows:

  • Keep the cluster clean.
  • Manage space on the servers by knowing about file duplication and last access times.
  • Know which are the longest-running jobs.

To learn more about the files and their attributes:

STEP 1: Download the latest fsimage copy.

$ hdfs dfsadmin -fetchImage /tmp

$ ls -ltr /tmp | grep -i fsimage
-rw-r--r-- 1 root root 22164 Aug 15 17:27 fsimage_0000000000000004389
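The file name embeds a transaction ID, so it changes from one fetch to the next. A minimal sketch for picking up the newest image in a directory follows; `latest_fsimage` is a hypothetical helper name, and the `.md5` filter is only a precaution in case checksum files sit next to the images.

```shell
# Hypothetical helper: print the newest fsimage file found in a directory.
# fsimage names end in a zero-padded transaction ID, so a plain lexical
# sort also orders them from oldest to newest.
latest_fsimage() {
  dir="${1:-/tmp}"
  # Ignore .md5 checksum files, then keep the last (highest) name
  ls "$dir"/fsimage_* 2>/dev/null | grep -v '\.md5$' | sort | tail -n 1
}

# Example: feed the newest image to the offline image viewer
# hdfs oiv -i "$(latest_fsimage /tmp)" -o /tmp/fsimage.txt
```

If the directory holds no image, the function prints nothing, so it is worth checking for an empty result before calling oiv.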

$ hdfs oiv -i /tmp/fsimage_0000000000000004389 -o /tmp/fsimage.txt

With the default Web processor, this launches an HTTP server that exposes a read-only WebHDFS API, by default on port 5978.
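Once the server is up, the image can be browsed like a read-only file system. The sketch below assumes the server listens on 127.0.0.1:5978 and that `curl` is installed; `oiv_url`, `OIV_HOST` and `OIV_PORT` are hypothetical names used only for illustration.

```shell
# Hypothetical helper: build a WebHDFS REST URL for the oiv web server.
# $1 = HDFS path, $2 = read-only operation such as LISTSTATUS or GETFILESTATUS
oiv_url() {
  printf 'http://%s:%s/webhdfs/v1%s?op=%s\n' \
    "${OIV_HOST:-127.0.0.1}" "${OIV_PORT:-5978}" "$1" "$2"
}

# List the root directory as JSON (needs the oiv server to be running)
# curl -s "$(oiv_url / LISTSTATUS)"

# Or browse the image with the regular HDFS client
# hdfs dfs -ls webhdfs://127.0.0.1:5978/
```

Only read operations are served, so nothing done here can change the real cluster.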

For more detail on oiv, see:

http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html
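As a rough sketch of the cleanup ideas listed earlier, the image can also be dumped as tab-separated text and summarized with standard tools. This assumes a Hadoop release whose oiv ships the Delimited processor (`-p Delimited`) and the column order noted in the comments, which should be checked against your version. `FSIMAGE`, `largest_files` and `stale_files` are hypothetical names.

```shell
# Dump the image as tab-separated text (FSIMAGE = path of the fetched image):
# hdfs oiv -p Delimited -i "$FSIMAGE" -o /tmp/fsimage.tsv

# Assumed column layout: 1=Path, 2=Replication, 3=ModificationTime,
# 4=AccessTime, 5=PreferredBlockSize, 6=BlocksCount, 7=FileSize, ...

# Print the N largest files (default 10) as "size<TAB>path".
# Rows with a non-numeric or zero size (header, directories, empty files)
# are skipped.
largest_files() {
  awk -F '\t' '$7 ~ /^[0-9]+$/ && $7 > 0 { print $7 "\t" $1 }' "$1" |
    sort -t "$(printf '\t')" -k1,1nr | head -n "${2:-10}"
}

# Print the N least recently accessed files (default 10) as
# "access time<TAB>path". Timestamps of the form "yyyy-MM-dd HH:mm"
# sort correctly as plain text.
stale_files() {
  awk -F '\t' '$7 ~ /^[0-9]+$/ && $7 > 0 { print $4 "\t" $1 }' "$1" |
    LC_ALL=C sort | head -n "${2:-10}"
}
```

For example, `largest_files /tmp/fsimage.tsv 20` lists the twenty biggest files, and `stale_files /tmp/fsimage.tsv 20` lists the twenty files that have gone longest without being read. Access times are only meaningful if the cluster records them.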

 

Hope this helps.

Sunil S Ranka

“Superior BI is the antidote to Business Failure”
