EECluster: An Energy-Efficient Tool for Managing HPC Clusters

Luciano Sánchez, Jose Ranilla, Alberto Cocaña-Fernández

Abstract


High Performance Computing (HPC) clusters have become a key resource in the research, academic and industrial communities because they are an excellent platform for solving a wide range of problems through parallel and distributed applications. Nevertheless, this high performance comes at the price of consuming large amounts of energy, which, combined with notably increasing electricity prices, has an important economic impact, driving up power and cooling costs and forcing IT companies to reduce operating costs. To reduce the high energy consumption of HPC clusters, we propose a tool named EECluster that manages the energy-efficient allocation of cluster resources. It works with both the OGE/SGE and PBS/TORQUE Resource Management Systems (RMS), and its decision-making mechanism is tuned automatically using a machine learning approach. To evaluate the results obtained by adopting this tool, experimental studies were carried out using actual workloads from the Scientific Modelling Cluster at Oviedo University and from the academic cluster that Oviedo University uses for teaching high performance computing subjects.

Keywords


Energy efficiency of HPC clusters; Multi-criteria decision making; Evolutionary algorithms

