Sunil S. Ranka's Weblog

Superior BI is the antidote to Business Failure

How To Run Hadoop Benchmarking TestDFSIO on Cloudera Clusters

Posted by sranka on October 9, 2013

Hi All

Out of the box, Hadoop provides a benchmarking mechanism for your cluster. Running it on a Cloudera cluster was a fun ride, so I thought I would share the steps to reduce the pain and increase the fun.

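Before you begin, set HADOOP_HOME. The command below works for a parcel-based CDH install on RHEL. Export the variable so that the hadoop scripts started from your shell can see it.

export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop/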

On CDH, “TestDFSIO” lives in hadoop-mapreduce-client-jobclient-<version>-cdh<version>-tests.jar, in “lib/hadoop-mapreduce/” under the Cloudera home directory. In my case:

/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.3.0-tests.jar
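Since the version numbers in the jar name change from release to release, it can be handy to look the jar up with a glob instead of typing the full name. Below is a minimal sketch of that idea. The find_tests_jar helper and the MR_DIR and TESTS_JAR variables are hypothetical names used only for illustration, and the default directory is the parcel location from this post, which may differ on your install.

```shell
# Hypothetical helper: print the first jobclient tests jar found in a directory.
find_tests_jar() {
  dir="$1"
  for jar in "$dir"/hadoop-mapreduce-client-jobclient-*-tests.jar; do
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  echo "no jobclient tests jar under $dir" >&2
  return 1
}

# Assumed parcel location taken from this post; override MR_DIR for your install.
MR_DIR="${MR_DIR:-/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce}"
TESTS_JAR="$(find_tests_jar "$MR_DIR" 2>/dev/null)" || true
echo "tests jar: ${TESTS_JAR:-not found}"
```

If a jar is found, "$TESTS_JAR" can be passed to hadoop jar in place of the hard-coded path.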

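Run the write benchmark first and then the read benchmark, as below. The read test reads back the files created by the write test, and -fileSize is given in MB, so each run here processes 10 files of 1000 MB: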

hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.3.0-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 1000

hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.3.0-tests.jar TestDFSIO -read -nrFiles 10 -fileSize 1000
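For repeat runs, the two commands above can be wrapped in a small script. This is only a sketch under some assumptions: dfsio_cmd, JAR, NR_FILES, FILE_SIZE_MB and RUN are hypothetical names, and the jar path is the one from this post. It also adds a TestDFSIO -clean step, which removes the benchmark files from HDFS once you are done. By default it only prints the commands; set RUN=1 to actually execute them.

```shell
# Assumed jar path from this post; override JAR for your CDH version.
JAR="${JAR:-/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.3.0-tests.jar}"
NR_FILES="${NR_FILES:-10}"
FILE_SIZE_MB="${FILE_SIZE_MB:-1000}"

# Build one TestDFSIO command line for the given mode (write, read or clean).
dfsio_cmd() {
  case "$1" in
    write|read)
      echo "hadoop jar $JAR TestDFSIO -$1 -nrFiles $NR_FILES -fileSize $FILE_SIZE_MB" ;;
    clean)
      echo "hadoop jar $JAR TestDFSIO -clean" ;;
    *)
      echo "unknown mode: $1" >&2
      return 1 ;;
  esac
}

# Write must come before read, because read uses the files that write created.
for mode in write read clean; do
  cmd="$(dfsio_cmd "$mode")"
  if [ "${RUN:-0}" = "1" ]; then
    $cmd || exit 1   # stop at the first failing step
  else
    echo "$cmd"      # dry run: only show what would be executed
  fi
done
```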

Once the tests finish, you will find a “TestDFSIO_results.log” file in the local directory you ran the commands from. Each run appends its results to this file, so after both tests it looks like this:

----- TestDFSIO ----- : write
           Date & time: Wed Oct 09 14:56:14 PDT 2013
       Number of files: 10
Total MBytes processed: 10000.0
     Throughput mb/sec: 5.382930941302368
Average IO rate mb/sec: 5.390388488769531
 IO rate std deviation: 0.20763769922620628
    Test exec time sec: 211.457

----- TestDFSIO ----- : read
           Date & time: Wed Oct 09 14:57:47 PDT 2013
       Number of files: 10
Total MBytes processed: 10000.0
     Throughput mb/sec: 48.88230607167124
Average IO rate mb/sec: 49.50707244873047
 IO rate std deviation: 5.8465670196729596
    Test exec time sec: 39.954
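If you run the benchmark often, pulling the interesting numbers out of the log by hand gets tedious. Here is a small awk sketch that prints the test type, the number of files and the throughput for each run in the log. The function name dfsio_throughput is made up for this example, and it assumes the log format shown above.

```shell
# Hypothetical helper: print "<test> <number of files> <throughput mb/sec>" per run.
dfsio_throughput() {
  awk -F': ' '
    /----- TestDFSIO -----/ { mode = $2 }            # "write" or "read"
    /Number of files/       { files = $2 }
    /Throughput mb\/sec/    { print mode, files, $2 }
  ' "$1"
}

# Only parse the log if it exists in the current directory.
if [ -f TestDFSIO_results.log ]; then
  dfsio_throughput TestDFSIO_results.log
fi
```

For the log above, this would print one line for the write run and one for the read run.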

Based on the numbers above, the approximate read and write throughput across the cluster is shown below. TestDFSIO runs one map task per file, so this estimate assumes all 10 map tasks ran at the same time.

Total read throughput across the cluster (Number of files * Throughput mb/sec) = 488.8 MB/sec
Total write throughput across the cluster (Number of files * Throughput mb/sec) = 53.83 MB/sec
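The same arithmetic can be done from the shell. A minimal sketch, where aggregate_throughput is a hypothetical helper name and the inputs are the number of files and the "Throughput mb/sec" values from the log above, rounded to two decimals:

```shell
# Estimate: aggregate throughput = number of files * per-file throughput.
aggregate_throughput() {
  awk -v n="$1" -v t="$2" 'BEGIN { printf "%.2f\n", n * t }'
}

aggregate_throughput 10 48.88230607167124   # read:  prints 488.82
aggregate_throughput 10 5.382930941302368   # write: prints 53.83
```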

Hope this helps.

Happy Benchmarking !!!

Sunil S Ranka
“Superior BI is the antidote to Business Failure”
