Energy efficient technique for Hadoop MapReduce cluster management
dc.contributor.author | Alalawi, Manal Tawalai | |
dc.date.accessioned | 2023-05-18T09:40:49Z | |
dc.date.available | 2023-05-18T09:40:49Z | |
dc.date.issued | 2020-03 | |
dc.identifier.citation | Alalawi, M.T. (2020) 'Energy Efficient Technique for Hadoop MapReduce Cluster Management'. PhD thesis. University of Bedfordshire. | en_US |
dc.identifier.uri | http://hdl.handle.net/10547/625869 | |
dc.description | A thesis submitted to the University of Bedfordshire, in partial fulfilment of the requirements for the degree of Doctor of Philosophy. | en_US |
dc.description.abstract | Big data analytics, with datasets of terabyte and petabyte size, is now a reality for businesses. A widely used solution for data centres is the MapReduce model on open‐source Hadoop. Many organisations processing real‐time data of this magnitude rely on the Hadoop MapReduce model, and the massive increase in data generation means that even small to medium enterprises (SMEs) have a requirement for big data analysis. The business insights gained from this real‐time data analysis are vital in the modern world, and although this can be outsourced to data centres, SMEs will be more sustainable if they can do this for themselves. However, the increase in the amount of data has resulted in a corresponding increase in the amount of energy used for processing. The need to minimise the use of energy, both in terms of cost and ecology, is the main rationale behind this research, and energy‐efficiency will be the key to sustainability in the twenty‐first century. The initial categorisation of energy‐efficient methods for Hadoop components has been the starting point for a comparative evaluation in this research. The research has used Hadoop MapReduce performance modelling in a series of mathematical analyses and experimental tests, and these have led to the identification and design of an energy‐efficient model. This proposed model uses a novel method of data partitioning using virtual chunks. The idea is that, rather than accessing the entire data file, virtually linked blocks, or chunks, of data are accessed. The accuracy and efficiency of the proposed design have been evaluated mathematically and the results presented graphically, and the method has been shown to minimise the processing time and complete the different data operations. This reduction of processing time has resulted in minimising the I/O bottleneck of workload applications, thus reducing the amount of energy needed for processing big data.
This improved energy‐efficiency can be maintained for datasets of all sizes and in multiple applications. The results of this research are transferable and can be used by SMEs of any kind in any area of business. | en_US |
dc.language.iso | en | en_US |
dc.publisher | University of Bedfordshire | en_US |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | energy efficient | en_US |
dc.subject | Hadoop MapReduce clusters | en_US |
dc.subject | MapReduce performance model | en_US |
dc.subject | data intensive applications | en_US |
dc.subject | virtual chunks | en_US |
dc.subject | Subject Categories::G490 Computing Science not elsewhere classified | en_US |
dc.title | Energy efficient technique for Hadoop MapReduce cluster management | en_US |
dc.type | Thesis or dissertation | en_US |
dc.type.qualificationname | PhD | en_GB |
dc.type.qualificationlevel | PhD | en_US |
dc.publisher.institution | University of Bedfordshire | en_US |
refterms.dateFOA | 2023-05-18T09:40:50Z |