Cloud computing is an increasingly popular paradigm that delivers high-performance computing resources over the Internet to solve large-scale scientific problems, but several challenges must still be addressed before it can execute scientific workflows effectively.
Existing research has mainly focused on minimizing the finishing time (makespan) or the cost while meeting quality-of-service requirements. However, most of this work does not consider the essential characteristics of the cloud or major issues such as virtual machine (VM) performance variation and acquisition delay.
In this paper, we propose a meta-heuristic Cost Effective Genetic Algorithm that minimizes the execution cost of the workflow while meeting the deadline in a cloud computing environment. We develop novel schemes for the encoding, population initialization, crossover, and mutation operators of the genetic algorithm.
Our proposal considers all the essential characteristics of the cloud as well as VM performance variation and acquisition delay. Performance evaluation on well-known scientific workflows of different sizes, such as Montage, LIGO, CyberShake, and Epigenomics, shows that the proposed algorithm performs better than current state-of-the-art algorithms.
Architecture for Workflow Scheduling in Cloud:
In this section, we introduce the workflow scheduling models: the Application Model and the Cloud Resource Model.
A. Application Model:
A deadline-constrained scientific workflow is represented by a DAG W = (T, E), where T = {t1, t2, ..., tm} is the set of tasks and E is the set of edges.
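The DAG representation above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the four-task workflow, the parent-to-child reading of each edge, and the function name `topological_order` are assumptions made for the example.

```python
from collections import deque

# Hypothetical four-task workflow; each edge is read as (parent, child).
tasks = ["t1", "t2", "t3", "t4"]
edges = [("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")]

def topological_order(tasks, edges):
    """Return tasks in an order that respects every edge (Kahn's algorithm)."""
    children = {t: [] for t in tasks}
    indegree = {t: 0 for t in tasks}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
    # Tasks with no unfinished parent are ready to run.
    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in children[task]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)
    if len(order) != len(tasks):
        raise ValueError("graph has a cycle, so it is not a DAG")
    return order

order = topological_order(tasks, edges)
```

Any valid schedule has to run tasks in an order of this kind, since a task cannot start before the tasks it depends on have finished.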
B. Cloud Resource Model:
Our cloud model consists of an IaaS service provider that delivers high-performance computing resources in the form of virtual machines (VMs) over the Internet to execute large-scale scientific workflows.
Workflow scheduling in the cloud may have several objectives. In this paper, we propose a meta-heuristic optimization approach, the Cost Effective Genetic Algorithm (CEGA) workflow scheduling algorithm. The goal of this proposal is to find a feasible schedule for executing a workflow in a cloud computing environment such that the overall execution cost is minimized while the deadline constraint is met.
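The objective stated above (minimum cost subject to a deadline) could be evaluated for a candidate schedule roughly as follows. This is a sketch under stated assumptions: the hourly billing interval, the rule of charging every started interval, and the names `lease_cost` and `evaluate` are illustrative and not taken from the paper.

```python
import math

def lease_cost(lease_start, lease_end, price_per_interval, interval=3600.0):
    """Cost of one VM lease, charging every started billing interval (assumed rule)."""
    intervals = math.ceil((lease_end - lease_start) / interval)
    return max(intervals, 1) * price_per_interval

def evaluate(leases, task_finish_times, deadline):
    """Return (total cost, makespan, feasible) for one candidate schedule.

    leases: iterable of (lease_start, lease_end, price_per_interval), in seconds.
    task_finish_times: finish time of every task, in seconds.
    """
    total_cost = sum(lease_cost(s, e, p) for s, e, p in leases)
    makespan = max(task_finish_times)
    return total_cost, makespan, makespan <= deadline

# Hypothetical schedule: two VMs and two tasks, deadline of 4000 seconds.
cost, makespan, feasible = evaluate(
    leases=[(0, 4000, 0.10), (500, 3000, 0.20)],
    task_finish_times=[1200, 3900],
    deadline=4000,
)
```

A scheduler built on this objective would prefer, among the feasible schedules, the one with the lowest total cost.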
THE PROPOSED COST EFFECTIVE GENETIC ALGORITHM:
Genetic Algorithms (GAs) are meta-heuristic algorithms inspired by the evolutionary ideas of natural selection and genetics. GAs are applied to optimization problems through initialization, selection, and genetic operators, in many science and engineering domains such as pattern recognition, image processing, and data mining, among others.
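The general GA cycle of initialization, selection, crossover, and mutation can be sketched as below. This is a generic textbook-style loop and not the CEGA operators themselves: binary tournament selection, single-point crossover, per-gene mutation, and all parameter names and defaults are assumptions chosen for illustration.

```python
import random

def genetic_algorithm(fitness, gene_count, gene_values, pop_size=20,
                      generations=50, mutation_rate=0.1, seed=0):
    """Generic GA that minimizes `fitness`; requires gene_count >= 2."""
    rng = random.Random(seed)
    # Initialization: a population of random chromosomes.
    population = [[rng.choice(gene_values) for _ in range(gene_count)]
                  for _ in range(pop_size)]
    best = min(population, key=fitness)

    def select():
        # Binary tournament: keep the fitter of two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = select(), select()
            # Single-point crossover between the two parents.
            cut = rng.randint(1, gene_count - 1)
            child = p1[:cut] + p2[cut:]
            # Mutation: resample each gene with a small probability.
            child = [rng.choice(gene_values) if rng.random() < mutation_rate else g
                     for g in child]
            offspring.append(child)
        population = offspring
        # Remember the best chromosome seen so far.
        best = min(population + [best], key=fitness)
    return best

# Toy usage: minimize the sum of eight genes drawn from {0, 1, 2}.
best = genetic_algorithm(sum, gene_count=8, gene_values=[0, 1, 2])
```

In a scheduling setting, the fitness function would be replaced by one that scores a decoded schedule, for example by its execution cost with a penalty for missing the deadline.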
A. Basic Definitions:
A DAG-based workflow application is represented as a DAG W = (T, E) with a user-specified deadline D, where T = {t1, t2, ..., tm} is the set of tasks and E is the set of edges.
There are three main groups for representing a chromosome of the workflow scheduling problem: TasktoIndex, TasktoVM mapping, and VMtoType.
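One possible reading of the three groups is sketched below. The text here names the groups but does not define them, so the interpretation is an assumption: TasktoIndex is taken as the execution order of the tasks, TasktoVM as the VM instance assigned to each task, and VMtoType as the type of each VM instance. The class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Chromosome:
    task_order: list   # TasktoIndex (assumed): position -> task index
    task_to_vm: list   # TasktoVM (assumed): task index -> VM instance index
    vm_to_type: list   # VMtoType (assumed): VM instance index -> VM type index

def decode(ch):
    """Return (task, vm, vm_type) triples in execution order."""
    return [(t, ch.task_to_vm[t], ch.vm_to_type[ch.task_to_vm[t]])
            for t in ch.task_order]

# Hypothetical chromosome: four tasks on two VM instances of types 2 and 0.
ch = Chromosome(task_order=[0, 2, 1, 3],
                task_to_vm=[0, 1, 0, 1],
                vm_to_type=[2, 0])
plan = decode(ch)
```

Under this reading, crossover and mutation can act on each of the three lists separately, as long as the task order remains consistent with the workflow's dependencies.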
C. Initial Population:
In this section, we present the details of the experiments conducted to evaluate the performance of the proposed Cost Effective Genetic Algorithm (CEGA).
CEGA’s performance is assessed on four scientific workflows: Montage, LIGO, Epigenomics, and CyberShake.
Cloud computing delivers high-performance computing resources over the Internet to execute large-scale scientific workflows. Executing these large-scale scientific applications requires provisioning and scheduling decisions that minimize the total execution cost while meeting the deadline constraint.
To this end, a meta-heuristic Cost Effective Genetic Algorithm (CEGA) has been proposed. CEGA considers all the characteristics of the cloud, such as heterogeneity, on-demand resource provisioning, and the pay-as-you-go pricing model, as well as major issues such as VM performance variation and booting time.
Further, to achieve this, we develop novel schemes for the encoding, population initialization, crossover, and mutation operators of the genetic algorithm. The simulation experiments conducted on four scientific workflows show that, in comparison to state-of-the-art algorithms such as IC-PCP, RCT, RTC, and PSO, the proposed CEGA algorithm exhibits the highest hit rate for the deadline constraint.
The algorithm also has a lower execution time than IC-PCP, RCT, and PSO, and a lower execution cost than RTC, RCT, and PSO, under the deadline constraint.
In the future, we would like to consider another issue, the shut-down time (termination delay) of VMs, because it affects the overall execution cost of the workflow. Further, we will consider executing the tasks of a workflow on VMs deployed in different regions, together with the data transfer costs between the different data centers.
Finally, our goal is to implement our algorithm in a real cloud computing environment with a running workflow engine, so that it can be used for deployed applications.
Source: National Institute of Technology
Authors: Jasraj Meena | Malay Kumar | Manu Vardhan