IJIEE 2014 Vol.4(6): 480-484 ISSN: 2010-3719
DOI: 10.7763/IJIEE.2014.V4.487

The Comprehensive Performance Rating for Hadoop Clusters on Cloud Computing Platform

Fong-Hao Liu, Ya-Ruei Liou, Hsiang-Fu Lo, Ko-Chin Chang, and Wei-Tsong Lee
Abstract— Virtualization platforms deployed throughout the IT infrastructure are an important type of Green IT service in cloud data centers. Hadoop clusters built on on-demand virtual infrastructure are an effective implementation of MapReduce for developing data-intensive applications in cloud computing. Deploying Hadoop clusters on large numbers of data center virtual machines (VMs) can significantly increase productivity and reduce both energy and resource consumption. However, interference between VMs is complicated and changes as the data size grows, and running Hadoop clusters on virtual machines also degrades the performance of Map and Reduce tasks. In this paper, a Comprehensive Performance Rating (CPR) scheme is presented to probe the root causes of these problems, and solutions for VM interference are introduced using data locality and excessive configuration parameters. Unlike previous solutions that customize the native Hadoop job scheduler, the proposed CPR scheme uses Hadoop configuration metrics revealed through Principal Component Analysis (PCA) to guide performance tuning. The experimental results were obtained on a 20-node virtual cluster. The performance given by the proposed CPR scheme is close to the measured execution time across different data sizes, cluster sizes, and map task ratios.
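The abstract says PCA is used to reveal which Hadoop configuration metrics matter for tuning, but does not give the procedure. The following is only a minimal sketch of the general technique, not the authors' CPR scheme: it standardizes a small table of metrics, builds the correlation matrix, and extracts the first principal component by power iteration. The metric names and sample values are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical metric names and toy samples (NOT data from the paper).
METRICS = ["input_size_gb", "hdfs_blocks", "spill_indicator"]
SAMPLES = [(1, 2, 1), (2, 4, -1), (3, 6, -1), (4, 8, 1)]


def standardize(rows):
    """Scale each column to zero mean and unit (population) variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n)
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)]
            for row in rows]


def covariance(rows):
    """Covariance matrix of zero-mean rows (the correlation matrix
    when the rows have been standardized first)."""
    n = len(rows)
    d = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / n for j in range(d)]
            for i in range(d)]


def first_component(matrix, iterations=200):
    """Return (eigenvalue, unit eigenvector) of the dominant eigenpair,
    found by power iteration on a symmetric matrix."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for a unit vector v.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v


corr = covariance(standardize(SAMPLES))
eigenvalue, loadings = first_component(corr)
# Share of total variance captured by the first component.
explained = eigenvalue / sum(corr[i][i] for i in range(len(corr)))

if __name__ == "__main__":
    for name, weight in zip(METRICS, loadings):
        print(f"{name}: loading {weight:.3f}")
    print(f"explained variance ratio: {explained:.3f}")
```

In this toy table the first two metrics are perfectly correlated and the third is uncorrelated with them, so the first component loads equally on the first two and close to zero on the third. In a tuning workflow, metrics with large loadings on the leading components would be the first candidates to examine.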

Index Terms— Hadoop, MapReduce, data locality, principal component analysis.

Fong-Hao Liu, Ya-Ruei Liou, Hsiang-Fu Lo, and Ko-Chin Chang are with National Defense University, Taiwan (e-mail: superalf@gmail.com).


Cite: Fong-Hao Liu, Ya-Ruei Liou, Hsiang-Fu Lo, Ko-Chin Chang, and Wei-Tsong Lee, "The Comprehensive Performance Rating for Hadoop Clusters on Cloud Computing Platform," International Journal of Information and Electronics Engineering, vol. 4, no. 6, pp. 480-484, 2014.
