
Simulating Cloud Deployment Options for Software Migration Support

©2014 Textbook 157 Pages

Summary

Cloud computing is emerging as a promising new paradigm that aims at delivering computing resources and services on demand. To cope with the over- and under-provisioning of resources frequently found in conventional data centers, cloud computing technologies make it possible to scale up and down rapidly according to varying workload patterns. However, most software systems are not built to exploit this so-called elasticity and therefore must be adapted during the migration to the cloud.
Here, the selection of a specific cloud provider is the most obvious and basic cloud deployment option. Furthermore, the mapping between services and virtual machine instances must be considered when migrating to the cloud, and specific adaptation strategies, such as allocating a new virtual machine instance when the CPU utilization exceeds a given threshold, have to be chosen and configured. The set of combinations of these choices forms a huge design space that is infeasible to explore manually.
The simulation of a cloud deployment option can assist in solving this problem. A simulation is often faster than executing real-world experiments. Furthermore, adapting the simulation to the software system that is to be migrated requires less effort at the modeling level. The simulation can be utilized by an automatic optimization algorithm to find the best trade-off between high performance and low costs.
Our main objective in this study is the implementation of software that enables the simulation of cloud deployment options on a language-independent basis.

Excerpt

Table of Contents

5.4 Starting and Stopping Virtual Machine Instances on Demand
5.5 Delayed Cloudlet Creation
5.6 Delayed Start of Virtual Machines
5.7 Timeout for Cloudlets
5.8 Improved Debt Model
5.9 Enhanced Instruction Count Model
5.10 History Exporter
5.11 Dynamic Host Addition at Runtime
5.12 Method Calls and Network Traffic between Virtual Machine Instances
6 MIPIPS and Weights Benchmark
6.1 Features
6.2 Design
6.3 Example Output
7 CDOSim
7.1 Features
7.2 The Simulation Process
7.3 Design
8 Evaluation of CDOSim
8.1 Goals of the Evaluation
8.2 Methodology
8.3 Basic Experiment Setup
8.4 E1: MIPIPS Benchmark Evaluation
8.5 E2: Accuracy Evaluation for Single Core Instances
8.6 E3: Accuracy Evaluation for Multi Core Instances
8.7 E4: Accuracy Evaluation for Adaptation Strategy Configurations
8.8 E5: Inter-Cloud Accuracy Evaluation
8.9 Summary
9 Related Work
9.1 GroudSim
9.2 Palladio
9.3 SLAstic.SIM
9.4 iCanCloud
9.5 Byte Instruction Count for Java
9.6 Measuring Elasticity
9.7 Dhrystone Benchmark
9.8 Cloudstone Toolkit
10 Conclusions and Future Work
10.1 Conclusions
10.2 Future Work
References
A Glossary
B Ecore Model for MIPIPS and Weights Benchmark
C KDM example
D Rating Algorithm
E Attachments


List of Figures

1 Users and providers of cloud computing taken from Armbrust et al. [2]
2 CloudMIG approach taken from Frey et al. [19]
3 CloudSim architecture taken from Calheiros et al. [10]
4 CloudMIG Xpress overview taken from Frey et al. [19]
5 Extracted CloudSim meta-model
6 Layers of KDM taken from Pérez-Castillo et al. [54]
7 Example of determining the median of response times during phases of low CPU utilization in the dynamic approach
8 Enhanced CloudSim meta-model
9 CPU utilization model example
10 New scheduling example
11 Java packages of the MIPIPS and weights benchmark
12 GUI of the MIPIPS and weights benchmark
13 Activities in CDOSim's simulation process
14 Java packages of CDOSim
15 GUI of CDOSim
16 Deployment configuration for Eucalyptus
17 Deployment configuration for Amazon EC2
18 The used day-night-cycle workload intensity
19 Average CPU utilization of allocated nodes in SingleCore.1 experiment
20 Median of response times in SingleCore.1 experiment
21 Average CPU utilization of allocated nodes in SingleCore.2 experiment
22 Median of response times in SingleCore.2 experiment
23 Average CPU utilization of allocated nodes in SingleCore.3 experiment
24 Median response times in SingleCore.3 experiment
25 Average CPU utilization of allocated nodes in SingleCore.4 experiment
26 Median response times in SingleCore.4 experiment
27 Average CPU utilization of allocated nodes in SingleCore.5 experiment
28 Median response times in SingleCore.5 experiment
29 Average CPU utilization of allocated nodes in SingleCore.6 experiment
30 Median response times in SingleCore.6 experiment
31 Average CPU utilization of allocated nodes in SingleCore.7 experiment
32 Median response times in SingleCore.7 experiment
33 Average CPU utilization of allocated nodes in SingleCore.8 experiment
34 Median response times in SingleCore.8 experiment
35 Average CPU utilization of allocated nodes in MultiCore.1 experiment
36 Median response times in MultiCore.1 experiment
37 Average CPU utilization of allocated nodes in MultiCore.2 experiment
38 Median response times in MultiCore.2 experiment
39 Average CPU utilization of allocated nodes in Adaptation.1 experiment
40 Median response times in Adaptation.1 experiment
41 Average CPU utilization of allocated nodes in Adaptation.2 experiment
42 Median response times in Adaptation.2 experiment
43 Average CPU utilization of allocated nodes in PredictionAmazon.1 experiment
44 Median response times in PredictionAmazon.1 experiment
45 Ecore model for MIPIPS and weights benchmark as UML class diagram

List of Tables

1 Overview of the preconditions for each instruction count derivation approach
2 Example weights
3 Contained weight benchmarks
4 Simulation configuration parameters
5 Our Eucalyptus server
6 Our Eucalyptus configuration
7 Used instance types in Amazon EC2 experiments
8 Default simulation configuration
9 Results for comparison MIPIPS.1
10 Results for comparison MIPIPS.2
11 Results for comparison MIPIPS.3
12 Results for comparison MIPIPS.4
13 Results for comparison MIPIPS.5
14 Overview of the relative error values for each scenario


1 Introduction

1.1 Motivation

Cloud computing is emerging as a promising new paradigm that aims at delivering computing resources and services on demand. To cope with the over- and under-provisioning of resources frequently found in conventional data centers, cloud computing technologies make it possible to scale up and down rapidly according to varying workload patterns. However, most software systems are not built to exploit this so-called elasticity and therefore must be adapted during the migration to the cloud [46].

Here, the selection of a specific cloud provider is the most obvious and basic cloud deployment option. Furthermore, the mapping between services and virtual machine instances must be considered when migrating to the cloud, and specific adaptation strategies, such as allocating a new virtual machine instance when the CPU utilization exceeds a given threshold, have to be chosen and configured. The set of combinations of these choices forms a huge design space that is infeasible to explore manually [25].

The simulation of a cloud deployment option can assist in solving this problem. A simulation is often faster than executing real-world experiments. Furthermore, adapting the simulation to the software system that is to be migrated requires less effort at the modeling level. The simulation can be utilized by an automatic optimization algorithm to find the best trade-off between high performance and low costs.
1.2 Approach

We begin by defining the fundamental concept of a cloud deployment option and then describe our simulation approach.

Definition 1: In the context of a deployment of software on a cloud platform, a cloud deployment option is a combination of decisions concerning the selection of a cloud provider, the deployment of components to virtual machine instances, the virtual machine instances' configuration, and specific adaptation strategies.

Definition 1 shows our definition of a cloud deployment option. The deployment of components to virtual machine instances includes the case that new components might be formed from parts of already existing components. By a virtual machine instance's configuration, we refer, for instance, to the instance type of virtual machine instances, such as m1.small in the case of Amazon EC2. Furthermore, an example of an adaptation strategy is "start a new virtual machine instance when the average CPU utilization of allocated nodes stays above 70 % for 60 seconds."
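
Conceptually, such an adaptation strategy is a threshold rule that is evaluated periodically against the monitored utilization. The following minimal Java sketch illustrates this idea; the names (ScaleOutRule, CloudController, startNewVmInstance) are purely illustrative and are not CDOSim's actual API.

interface CloudController {
    void startNewVmInstance();
}

// Illustrative threshold rule: start a new virtual machine instance when the
// average CPU utilization of all allocated nodes stays above 70 % for 60 seconds.
public class ScaleOutRule {
    private static final double THRESHOLD = 0.70;
    private static final int HOLD_SECONDS = 60;
    private int secondsAboveThreshold = 0;

    // Called once per (simulated) second with the current average CPU utilization.
    public void onTick(double avgCpuUtilization, CloudController controller) {
        if (avgCpuUtilization > THRESHOLD) {
            secondsAboveThreshold++;
            if (secondsAboveThreshold >= HOLD_SECONDS) {
                controller.startNewVmInstance();
                secondsAboveThreshold = 0; // start counting again for the next instance
            }
        } else {
            secondsAboveThreshold = 0; // utilization dropped below the threshold
        }
    }
}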
For simulating a cloud deployment option, we basically need a cloud environment simulator. For this purpose, we utilize CloudSim [10]. CloudSim requires various inputs. For modeling a computation such as an application call, named a Cloudlet in CloudSim, CloudSim mainly requires the instruction count of the computation. The instruction count of a Cloudlet is a measure of the work that has to be performed by the CPU. As a central input for modeling the capacity of virtual machine instances, CloudSim needs the mega instructions per second (MIPS) of the virtual machine instance. MIPS is a measure of the computing performance of the virtual machine instance. CloudSim defines neither a method for deriving the instruction count nor one for deriving the MIPS. Furthermore, CloudSim does not specify which instructions are meant.

We assume that CloudSim expects instructions at the language level, e.g., double divide and integer minus, and that all of these instructions contribute equally to the MIPS value. Hence, we consider MIPS too coarse-grained, because different instructions generally have different runtimes. Therefore, we define the measure mega integer plus instructions per second (MIPIPS). The measurement of MIPIPS should be separate from the actual simulation software because, for example, it has to be run on the virtual machine instances to measure their MIPIPS. In accordance with MIPIPS, the instruction count of a Cloudlet has to be given in integer plus instructions. Other instruction types are converted to integer plus instructions by weights that are also measured separately from the actual simulation software.
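
To illustrate the conversion with weights (the numbers are invented for the example): if the weight of a double divide is 4.0, i.e., one double divide takes as long as four integer plus instructions, then a computation with 1,000 integer plus instructions and 200 double divides is modeled with an instruction count of 1,000 + 200 * 4.0 = 1,800 integer plus instructions. A minimal sketch in Java:

import java.util.Map;

// Sketch of converting mixed instruction counts into integer plus instructions
// via weights; the weight values below are illustrative, not measured ones.
public class InstructionCountConverter {

    static double toIntegerPlusInstructions(Map<String, Long> counts,
                                            Map<String, Double> weights) {
        double total = 0.0;
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            // A weight states how many integer plus instructions one
            // instruction of the given type corresponds to (default 1.0).
            total += e.getValue() * weights.getOrDefault(e.getKey(), 1.0);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = Map.of("integerPlus", 1000L, "doubleDivide", 200L);
        Map<String, Double> weights = Map.of("integerPlus", 1.0, "doubleDivide", 4.0);
        System.out.println(toIntegerPlusInstructions(counts, weights)); // 1800.0
    }
}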
To rate the suitability of a specific cloud deployment option, the simulation has to compute information such as the costs of the given cloud deployment option. Furthermore, the outputs of a simulation run have to be comparable to the outputs of other simulation runs. This leads to the need for a rating approach.

A further requirement for the simulation results from the wide range of programming languages supported by different cloud providers. Infrastructure-as-a-Service (IaaS) providers typically support all programming languages because they only provide the infrastructure computing resources. Therefore, we need a language-independent simulation. For this purpose, we utilize the Knowledge Discovery Meta-Model (KDM), which provides information about the existing software system in a language-independent way.

CloudMIG [15] provides a promising approach for assisting in a migration project to a cloud environment. There also exists a prototype implementation, called CloudMIG Xpress [18], that implements this approach. Our software for realizing the simulation, named Cloud Deployment Options Simulator (CDOSim), contributes to CloudMIG Xpress as a plug-in. It utilizes workload profiles that can be modeled by the user or imported from monitoring data recorded by, for instance, Kieker [70].
1.3 Goals

Our main objective is a software system that enables the simulation of cloud deployment options on a language-independent basis. For this purpose, we define the following goals.

1.3.1 G1: Definition of the Simulation Input

Goal G1 covers the definition of the simulation input. MIPIPS and the instruction count were already described as inputs; however, there are more. Furthermore, where appropriate, derivation methods for the input parameters should be developed or defined.

1.3.2 G2: Definition of the Simulation Output

In goal G2 the output of the simulation should be defined. Furthermore, a metric for comparing cloud deployment options with respect to the output should be developed.

1.3.3 G3: Development of a Benchmark for Measuring the Computing Performance of a Node in MIPIPS

In G3 a benchmark for measuring the computing performance of a node in MIPIPS shall be developed that can easily be adapted to new programming languages. It shall include both a GUI and a console interface because virtual machine instances can often only be accessed via a command shell.

1.3.4 G4: Development of CDOSim

The last goal is the development of a software system that realizes the simulation. Furthermore, it shall be integrated into CloudMIG Xpress as a plug-in. We name this software CDOSim. To achieve programming language independence, CDOSim shall operate on KDM instances.
1.4 Document Structure

The remainder of the thesis is structured as follows. Section 2 outlines the foundations and utilized technologies. Afterwards, Section 3 presents the simulation inputs and how they can be derived (G1). Then, Section 4 describes the simulation output (G2) and an approach for rating simulation runs relative to each other. The enhancements we had to make to CloudSim are listed in Section 5. The following Section 6 describes our MIPIPS and weights benchmark (G3). Our tool for simulating cloud deployment options, named CDOSim, is discussed in Section 7 (G4). The following Section 8 evaluates the functionality and accuracy of CDOSim. Then, Section 9 describes related work. The final Section 10 concludes the thesis and outlines future work.

2 Foundations and Technologies

Sections 2.1 and 2.2 provide an overview of the foundations and technologies that will be used in later sections.

2.1 Foundations

The following Sections 2.1.1 to 2.1.5 describe the foundations.
2.1.1 Cloud Computing

Cloud computing is a relatively new computing paradigm, and therefore many definitions of cloud computing exist. Here, we use the National Institute of Standards and Technology (NIST) definition by Mell and Grance [42] because this definition has become a de-facto standard.

The NIST definition of cloud computing defines five essential characteristics that a service must fulfill in order to be a cloud service, for example, on-demand self-service. Furthermore, it describes three different service models: IaaS, Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS). They differ in their level of abstraction with regard to configuration and programming options. Clouds can be deployed according to four different deployment models: public clouds, private clouds, hybrid clouds, and community clouds. In addition, Armbrust et al. [2] define different role models for users and providers of cloud computing services.

Essential Characteristics

The NIST definition of cloud computing defines five essential characteristics that a service must fulfill in order to be a cloud service. These are listed and described below.

1. On-demand self-service
A user can rent computing capabilities like storage and computing time on demand in an automatic way, without human interaction with the service provider.

2. Broad network access
The capabilities can be accessed over the network by standard mechanisms. These standard mechanisms are available on heterogeneous platforms like mobile phones and laptops.

3. Resource pooling
The cloud provider's computing resources are pooled to serve multiple cloud users. The location where the physical or virtual resources are allocated is not exactly known to the cloud users.

4. Rapid elasticity
Virtually unlimited resources can be rapidly and elastically allocated to enable quick scaling up and down. They can be purchased by the cloud users in any quantity at any time.

5. Measured service
By monitoring the usage, the cloud system automatically controls and optimizes the resources used. For the cloud provider and the cloud users, transparency is provided by monitoring, controlling, and reporting the resource usage data.
Service Models

Cloud providers can offer their services at different levels of abstraction with regard to configuration and programming options. The different service models are described in the following three paragraphs.

Infrastructure-as-a-Service (IaaS)
Infrastructure-as-a-Service provides the lowest level of abstraction with a maximum of configuration options compared to the other service models. In IaaS, the cloud user sets up and runs instances of previously created or provided virtual machine images. Therefore, the cloud user can build the full software stack himself. A popular cloud provider that offers IaaS is, for instance, Amazon with its Elastic Compute Cloud (EC2).

Platform-as-a-Service (PaaS)
In the PaaS model, the cloud provider defines and maintains the programming environment for the cloud user. Many PaaS providers only support specific programming languages, with further constraints to meet the environment specifications. Examples of PaaS providers are Google App Engine [21] and Microsoft Azure [45].

Software-as-a-Service (SaaS)
SaaS provides the highest level of abstraction, with no configuration options apart from those of the rented software. The cloud user rents access to software in the cloud. Advantages for the cloud user include avoided installation and maintenance effort, for instance. Examples of SaaS-based products are Google Docs and Microsoft Office Live.
Deployment Models

Clouds can be deployed using four different deployment models: public clouds, private clouds, hybrid clouds, and community clouds. These deployment models are briefly outlined in the next four paragraphs.

Public Clouds
In a public cloud, the cloud infrastructure can be accessed by the general public. For instance, Amazon provides a public cloud named Amazon EC2.

Private Clouds
Public clouds can have disadvantages for some users. First, there might be legal aspects that prohibit the use of public clouds for data protection reasons. Furthermore, cloud providers can go bankrupt. To avoid these disadvantages, private cloud software can be deployed on one's own servers. An example of private cloud software is Eucalyptus.

Hybrid Clouds
In this deployment model, private and public clouds are used together by a cloud user. Companies often use this kind of deployment to combine the advantages of public and private clouds: privacy-critical applications are executed in a private cloud, and the remaining applications are run in a public cloud.

Community Clouds
The last deployment model is the community cloud. This kind of cloud provides access only to a specific community.

Role Models

Armbrust et al. [2] define the role models cloud provider, SaaS provider, cloud user, and SaaS user. The associations between them are shown in Figure 1. A cloud provider offers cloud users resources in terms of utility computing; thus, it provides the resources on an IaaS or PaaS basis. A special kind of cloud user is a SaaS provider. The SaaS provider makes SaaS services available to SaaS users through web applications. The NIST defines similar role models [41].

Figure 1: Users and providers of cloud computing taken from Armbrust et al. [2]
2.1.2 Software Modernization

Jha and Maheshwari [32] propose a classification of current modernization approaches. They identified three main approaches: redevelopment, wrapping, and migration. Redevelopment includes the rewriting-from-scratch approach and the reverse engineering approach. For reverse engineering, the legacy code often has to be understood first before rewriting it [12]. For representing the extracted information about the legacy source code and architecture, a language-independent meta-model like the KDM can be used [31]. The wrapping approach is divided into user interface wrapping, data wrapping, and function wrapping. In each wrapping approach the corresponding artifact is wrapped so that the new system can access it. The migration approach is divided into component migration and system migration. In component migration, each component is migrated separately. In system migration, the whole legacy system is migrated at once.

Different studies have investigated which criteria lead to a software modernization decision [1, 38]. According to Koskinen et al. [38], the three most relevant criteria are system usability, ending of technological support, and changes in business processes.

2.1.3 CloudMIG Approach

CloudMIG is an approach for migrating software systems into the cloud, developed by Frey et al. [15, 16, 17, 18, 19]. It comprises six steps, which are illustrated in Figure 2 and described in the following.

A1 - Extraction
This step extracts architectural and utilization models of the legacy software system. The extraction utilizes KDM and the Structured Metrics Meta-Model (SMM) as language-independent representations of a legacy software system and its quality attributes.

A2 - Selection
In the selection step an appropriate cloud profile candidate is chosen. Criteria for the decision can be a preference for one cloud provider or a feature that has to be supported.

A3 - Generation
The output of the generation step is a generated target architecture and a mapping model. In addition, cloud environment constraint violations are detected in this step. A violation describes the breaking of a limitation of a specific cloud provider, for instance.

A4 - Adaptation
A reengineer might disagree with some aspects of the generated target architecture. Therefore, he can adjust them manually in this step.

A5 - Evaluation
The evaluation step simulates the performance and costs of the generated target architecture and evaluates it based on the results.

A6 - Transformation
This step is the actual transformation towards the generated target architecture. Currently, CloudMIG does not provide further support for performing this step; thus, the source code and other artifacts have to be adapted manually.

Figure 2: CloudMIG approach taken from Frey et al. [19]
<s0, (e0, t0), s1, (e1, t1), s2, (e2, t2), ...>

Listing 1: Evolution of a system
2.1.4 Simulation

A computer simulation is a program that attempts to emulate a particular system. One type of simulation is discrete-event simulation [57]. In discrete-event simulation, the evolution of a system is viewed as a sequence of the form shown in Listing 1. A system starts in the state s0. Then, the event e0 occurs at the timestamp t0, which causes the system to enter the state s1, and so on. The timestamps ti, where i is larger than 0, have to be nonnegative numbers, and the ti's have to be nondecreasing. With such a sequence representing an evolution of a given system, we can conclude properties of the system, e.g., whether it reaches a steady state. Thus, we can draw conclusions about the real system.

Entities

Entities are models of real-world objects. In CloudSim (see Section 2.2.1), for example, a data center is an entity. The previously mentioned state of the simulation model is the state of all entities' attributes. If an attribute changes due to the occurrence of an event, a new state is entered. Furthermore, an entity can provide methods for triggering the change of its attributes or for generating new events.

Events

While the simulation is active, external or internal events are produced and sent to the entities at a specific timestamp. If the timestamp lies in the future with respect to the current simulation time, the event is scheduled until the simulation time reaches that timestamp. The scheduler maintains a queue of pending events. The events in the queue are processed successively. When the queue is empty, the simulation typically terminates if it does not expect further external events. In CloudSim, for instance, triggering the creation of a virtual machine instance is an event.

Time

In a simulation, we typically use a model time for representing the real time. Using a model time has several advantages. It provides more control over time because we do not need to care about the execution time of calculations. Furthermore, with this abstraction from the real world, we can conduct simulations faster than in real time in most cases. A simulation can take 10 minutes of real time to simulate a real-world system evolution of, e.g., 24 hours. The model time advances while processing the events from the event scheduler.
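
To make the interplay of entities, events, and model time concrete, the following minimal Java sketch shows the core loop of a discrete-event simulator. It only illustrates the general principle described above and is not CloudSim's actual engine; all names are invented for the example.

import java.util.Comparator;
import java.util.PriorityQueue;

// Minimal discrete-event simulation core: pending events are kept in a queue
// ordered by timestamp, and the model time jumps to each event's timestamp.
public class MiniSimulation {

    record Event(double timestamp, String description) {}

    private final PriorityQueue<Event> queue =
            new PriorityQueue<>(Comparator.comparingDouble(Event::timestamp));
    private double modelTime = 0.0;

    void schedule(Event e) {
        queue.add(e);
    }

    void run() {
        while (!queue.isEmpty()) {
            Event e = queue.poll();
            modelTime = e.timestamp(); // advance the model time to the event
            System.out.println(modelTime + ": " + e.description());
            // Processing an event would change entity attributes here and
            // possibly schedule follow-up events.
        }
    }

    public static void main(String[] args) {
        MiniSimulation sim = new MiniSimulation();
        sim.schedule(new Event(5.0, "create virtual machine instance"));
        sim.schedule(new Event(1.0, "submit Cloudlet"));
        sim.run(); // processes the events in timestamp order: 1.0, then 5.0
    }
}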
2.1.5 Model Transformation

Czarnecki and Helsen [13] distinguish two kinds of model transformation: model-to-code and model-to-model transformations. For model-to-code transformations, the authors distinguish visitor-based and template-based approaches. Visitor-based approaches provide a visitor mechanism that generates code. Template-based approaches use templates to generate code. The templates typically consist of target text with metacode to access information from the source model. The authors distinguish six kinds of model-to-model transformations: direct-manipulation approaches, relational approaches, graph-transformation-based approaches, structure-driven approaches, hybrid approaches, and other model-to-model approaches. For further details on these approaches refer to Czarnecki and Helsen [13].

Mens and Van Gorp [44] mention important characteristics of a model transformation: the level of automation, the complexity of the transformation, and preservation. The level of automation can be classified as manual, requiring frequent manual intervention, or automated. Considering the complexity of the transformation, the classification can range from small to heavy-duty transformations that require other tools and techniques. A model transformation should define the properties that it preserves. For example, refactoring preserves the behavior but alters the structure.

Query/View/Transformation (QVT) [23, 53] is a standard for model transformations established by the Object Management Group (OMG). It defines three related model transformation languages: Relations, Operational Mappings, and Core. The QVT specification integrates the Object Constraint Language (OCL) 2.0 and is a hybrid of declarative and imperative styles. It requires Meta Object Facility (MOF) 2.0 models to operate on. The Atlas Transformation Language (ATL) is a QVT-like transformation language and is described in Section 2.2.5.
2.2 Involved Technologies

The following sections provide an overview of the technologies that are relevant in the context of our work.
2.2.1 CloudSim

CloudSim is a cloud computing system and application simulator developed by Calheiros et al. [7, 9, 10]. It provides several novel features. The first feature is support for modeling and simulating large cloud computing environments on a single computer. The second feature is a self-contained platform for modeling clouds, service brokers, and different policies, for instance for allocation. As a third main feature, CloudSim provides support for simulating network connections between nodes. Finally, it offers a facility for simulating federated cloud environments.

CloudSim has been successfully used by other researchers, for instance for simulating task scheduling in the cloud or power-aware cloud computing [4, 8, 36, 58, 60].

The architecture of CloudSim is illustrated in Figure 3. The basis is formed by the CloudSim core simulation engine. It is used by the network, cloud resources, cloud services, VM services, and user interface structures. Here, so-called Cloudlets constitute an important concept. CloudSim uses Cloudlets to simulate an application's execution by defining the total instruction count the application would need at runtime. On top, the user can specify the scheduling type (space-shared or time-shared) and other custom configurations.

Figure 3: CloudSim architecture taken from Calheiros et al. [10]
CloudSim Meta-Model

We extracted the meta-model of CloudSim because no Ecore [65] model was available. Figure 5 shows this meta-model as a Unified Modeling Language (UML) class diagram. The classes and associations to the left of the Link class mostly model the physical resources, and the right side contains classes for modeling the virtual machines and Cloudlets. The Link class provides a network link with a specific latency and bandwidth between data centers and data center brokers.

We start by describing the left part. A data center has storage and specific data center characteristics like the timezone or the virtual machine monitor. Furthermore, a data center has a virtual machine allocation policy that determines on which host a new virtual machine should be created. The virtual machine allocation policy and the data center characteristics share a list of available hosts that in the real world would be part of the data center. An important part of a host is a list of processing elements (PEs). A PE has a PE provisioner that provides the MIPS, which is a measure of the computing performance. Furthermore, a host has a bandwidth provisioner, a RAM provisioner, and a virtual machine scheduler.

In the right part of Figure 5, the data center broker plays a major role. It is responsible for the creation of Cloudlets and triggers the creation of virtual machine instances. Therefore, it maintains a list of created Cloudlets and virtual machine instances. A virtual machine instance has different attributes like MIPS and RAM. In addition, it is associated with a Cloudlet scheduler which is responsible for processing Cloudlets. The most important attribute of a Cloudlet is its length. In combination with the MIPS of a virtual machine instance, this attribute determines how long the Cloudlet processes. In addition, a Cloudlet has other attributes and utilization models for RAM, CPU, and bandwidth.

In CloudSim, only data centers and data center brokers are simulation entities, i.e., all events can only be processed by those classes.
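
For illustration, the following fragment sketches how a virtual machine instance and a Cloudlet are typically created when programming against CloudSim. The constructor signatures follow the CloudSim 3.x API as far as we know; the concrete parameter values (MIPS, RAM, lengths, etc.) are arbitrary example numbers.

import org.cloudbus.cloudsim.Cloudlet;
import org.cloudbus.cloudsim.CloudletSchedulerTimeShared;
import org.cloudbus.cloudsim.UtilizationModelFull;
import org.cloudbus.cloudsim.Vm;

public class CloudSimEntitySketch {

    public static void main(String[] args) {
        int brokerId = 0; // normally obtained from a DatacenterBroker

        // Virtual machine instance: 1000 MIPS per PE, 1 PE, 512 MB RAM,
        // 1000 kbit/s bandwidth, 10000 MB image size, "Xen" as VM monitor.
        Vm vm = new Vm(0, brokerId, 1000, 1, 512, 1000, 10000, "Xen",
                new CloudletSchedulerTimeShared());

        // Cloudlet: length of 400000 instructions on 1 PE, with utilization
        // models for CPU, RAM, and bandwidth.
        UtilizationModelFull utilization = new UtilizationModelFull();
        Cloudlet cloudlet = new Cloudlet(0, 400000, 1, 300, 300,
                utilization, utilization, utilization);
        cloudlet.setUserId(brokerId);

        System.out.println("VM MIPS: " + vm.getMips()
                + ", Cloudlet length: " + cloudlet.getCloudletLength());
    }
}

In the approach described in Section 1.2, the MIPS capacity of a virtual machine instance corresponds to the measured MIPIPS, and the Cloudlet length corresponds to the derived instruction count in integer plus instructions.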
2.2.2 CloudMIG Xpress

CloudMIG Xpress [19] is a prototype implementation of the CloudMIG approach, which was described in Section 2.1.3. It is based on the Eclipse Rich Client Platform. Figure 4 illustrates an overview of CloudMIG Xpress. It exhibits a plug-in based architecture and defines different interfaces for plug-ins that realize the steps A1, A3, and A5 of CloudMIG. CloudMIG's internal data format is the Cloud Environment Model (CEM). Currently, CloudMIG Xpress only supports the steps A1, A2, and partly A3. Our work contributes the step A5.

Figure 4: CloudMIG Xpress overview taken from Frey et al. [19]

Figure 5: Extracted CloudSim meta-model

2.2.3 Knowledge Discovery Meta-Model

KDM [22] was created by the OMG and was defined as an ISO standard in 2011 [54]. KDM maps information about software assets, their associations, and operational environments into one common data interchange format. Different analysis tools thereby have a common basis for interchanging information, and the different architectural views, which can be extracted by various analysis tools, can be kept in one meta-model. For this purpose KDM provides various levels of abstraction represented by entities and relations. This section provides an overview of the structure and organization of KDM.

Figure 6: Layers of KDM taken from Pérez-Castillo et al. [54]

Figure 6 shows the four layers of KDM. These four layers are split into several packages. The remainder of this section describes the different layers and packages of KDM.

Infrastructure Layer
This layer describes the core components of KDM with the core and kdm packages. Every model in the other layers inherits directly or indirectly from these components. The source package is also contained in this layer. It models physical resources like source code files and images.

Program Layer
The program layer defines a language-independent representation of the existing source code with the code and action packages. The former defines model elements for representing the logical structure, dependencies, classes, and methods. The latter models behavioral aspects of the software system by describing the control and data flow.

Resource Layer
Higher-level knowledge about the existing software system is represented in this layer. It contains the data, event, UI, and platform packages. The data package handles persistent data aspects of an application. With the event package the different events that might occur can be modeled. The UI package contains elements to model aspects of the user interface. Lastly, the platform package provides means for modeling the artifacts that relate to the runtime platform.

Abstraction Layer
This layer contains the highest level of abstraction of the existing software system. The contained packages are the conceptual, build, and structure packages. They represent knowledge about the conceptual perspective of the software system, the build system, and the higher-level structure, such as a UML component diagram, of the software system.
2.2.4 Structured Metrics Meta-Model

SMM [24] was developed in the context of the architecture-driven modernization (ADM) task force of the OMG. The specification defines an extensible meta-model for representing information regarding measurements of any structured model that is MOF-conformant. Furthermore, SMM includes elements that can be used to express a wide range of software measures. Both static and dynamic aspects of a software system can be modeled with the metrics in SMM. The SMM specification includes a minimal library of software measures for illustrative purposes.
2.2.5 Atlas Transformation Language

The Atlas Transformation Language (ATL) [33, 34] is a model-to-model transformation language. It is developed by the ATLAS INRIA and LINA research group. There also exists a tool called ATL IDE [35, 50], which is a toolkit based on the Eclipse Modeling Framework (EMF). ATL can be used declaratively and imperatively, and is a QVT-like language. The preferred style of transformation is the declarative one; however, for transformations that are hard to express in a declarative way, ATL also provides the imperative style.

1 rule Attribute2Column {
2   from attr : UML!Attribute
3   to
4     col : DB!Column (
5       name <- attr.name
6     )
7 }

Listing 2: ATL declarative example taken from Bézivin et al. [5]
1  «IMPORT classdefinition»
2  «DEFINE javamain FOR Model»
3
4  «FILE "benchmarks/java/" + this.class.name + ".java"»
5  public class «this.class.name» {
6    public static void main(String[] args) {
7      System.out.println("Hello World");
8    }
9  }
10 «ENDFILE»
11 «ENDDEFINE»

Listing 3: Xpand hello world template for Java
An ATL transformation is composed of rules. These rules define on which elements they are executed and what elements are created from the input.

Listing 2 shows a declarative example of ATL. The example rule takes a UML attribute in line 2 and transforms this attribute into a database column in line 4, naming the column with the attribute name in line 5.
2.2.6 Xpand

Xpand [66] is a statically-typed template language which is available as an Eclipse plug-in. It is specialized in code generation based on EMF models. Xpand requires a meta-model, an instance of this meta-model, a template, and a workflow for the generation. The Xpand language is a domain-specific language (DSL) with a small but sufficient vocabulary, including FOR and FOREACH for applying templates. An integrated editor provides syntax coloring, error highlighting, and code completion. Xpand was originally developed as part of the openArchitectureWare [51] project.

Listing 3 shows an example template for the generation of a Hello World program in Java. Line 1 imports the meta-model classdefinition (omitted for brevity), which contains a class Model that has an attribute class of type Class; the Class class has an attribute name. In line 2, a template for the type Model is started. Afterwards, line 4 expresses that the output of the enclosing FILE tag should be written to a file named after the class in the benchmarks/java/ directory. Lines 5 to 9 define a Java class with the name of the modeled class and a main method that prints Hello World on the console. Line 10 closes the FILE tag and line 11 closes the started template for the type Model.

3 Simulation Input

This section describes the input for the simulation of a cloud deployment option with CloudSim, including our enhancements. Section 3.1 provides an overview of the required input parameters. Afterwards, Sections 3.2 to 3.9 describe the required input parameters and the approaches for deriving them.
3.1 Overview

The input parameters MIPIPS and instruction count are related to instructions. The MIPIPS serve as a measure of the performance of a CPU. They are described in Section 3.2. The instruction count of a method serves as an indicator of the work that has to be performed by the CPU when the method is called. Section 3.3 describes three different approaches for deriving the instruction count of a method.

Instructions in general can be instructions in a low-level machine language like assembler, at an intermediate level like Java bytecode, or at the level of a high-level language like Java. For this thesis, we define instructions as a well-defined set of statements that lies between the intermediate level and the high-level language definition. As instructions we define declarations of a variable, assignments with at least one operation on the right-hand side like x = 3 + 2, comparisons, field accesses, and class creations.

Most of the time when we talk about instructions, we mean integer plus instructions. We define an integer plus instruction as an assignment to a variable where, on the right-hand side of the assignment, two integer values are combined with a plus operator, e.g., x = y + 3 where y is an integer variable. For simplicity and brevity, we omit "integer plus" and simply write "instructions" if the meaning is unambiguous.

Section 3.4 describes the weights that are used to convert, for instance, a double minus instruction into integer plus instructions. In Section 3.5 the size of a data type or class in bytes is derived. This is needed for the simulation of network traffic. Section 3.6 describes the input of an SMM workload profile, which is required for creating the Cloudlets. In Section 3.7 the enriched KDM model is described. Then, Section 3.8 presents the adaptation rules which are used for starting and terminating virtual machine instances at runtime. Finally, Section 3.9 describes the simulation configuration parameters.

3.2 MIPIPS

This section describes what mega integer plus instructions per second (MIPIPS) are and why we need them. Furthermore, our benchmark for deriving MIPIPS is explained in Section 3.2.2.

3.2.1 Description

CloudSim requires MIPS as a measure of the computing performance of virtual machine instances. However, we consider MIPS too coarse-grained: most CPUs need different times for different low-level instructions. For example, a division of two doubles typically takes longer than an addition of two integers on current CPUs. Furthermore, CloudSim does not suggest how to measure MIPS.

We introduce MIPIPS as the measure for describing the computing performance and express instructions like double plus as integer plus instructions through a conversion. Notably, we could also have used, e.g., mega double plus instructions per second (MDPIPS) as the measure of computing performance and normalized all other instructions to double plus instructions (see Section 3.4 for details). However, we wanted an underlying instruction type that is faster than most other instructions because the conversion factors then become more readable. For example, if we had used a class creation instruction, almost all other instructions would have weights between 0 and 1, and saying that one integer plus can be performed in 0.0004 class creation instructions is awkward.

We do not use existing benchmarks like Dhrystone (see Section 9.7) or Cloudstone (see Section 9.8) because we need a benchmark that is easily adaptable to new programming languages, and because we later describe an approach for counting instructions based on static analysis, which needs an association between statements and the measure of computing performance.

Our MIPIPS benchmark measures the computing performance of a single core. Hence, a computer with one core will have the same MIPIPS value as a computer with 64 cores if the performance of one core on the first computer equals the performance of one core on the second computer. This is motivated by the fact that a single-threaded program is not faster on a computer with 64 cores. Furthermore, if the program has, e.g., two threads for processing, the performance depends on the synchronization method used in the program. However, the core count is also considered in the simulation: CloudSim defines the value TotalMIPS, which is calculated by multiplying the core count by the MIPS. In accordance with this definition, we define TotalMIPIPS as the product of the core count and the MIPIPS value.
3.2.2 Derivation

The basic idea for deriving MIPIPS is a benchmark that measures the runtime of a defined number of integer plus instructions.

The runtime of a single instruction cannot be measured accurately because measurement techniques like the use of System.currentTimeMillis() in Java have a resolution of one millisecond. Even CPU cycle counters are not sufficiently accurate. Hence, we use a loop that runs our integer plus instructions for at least ten seconds on current CPUs. Measuring the runtime of the whole loop would, however, include additional instructions like jumps and comparisons in the measurement. Therefore, we do a calibration run (see Listing 4) that only executes the loop and then a second run with our integer plus instructions added to the loop's body (see Listing 5). Afterwards, we subtract the runtime of the calibration run from that of the second run. This reveals the execution time of the added integer plus instructions.

Our runtime measurement technique is a program that acts as a master and starts the benchmark run in a slave on the same machine. The runtime measurement is conducted by the slave program in order to exclude initialization time. After the execution, the slave returns the measured runtime of the benchmark run to the master. According to Georges et al. [20], this measurement must be done at least 30 times. Hence, optimally, the master starts the slave 30 times for each benchmark. The number of runs can be configured by parameters or from a GUI (see Section 6 for details). Afterwards, the master calculates the median of the response times.

An important part is disabling compiler and interpreter optimizations when the slave program is called by the master program. Depending on the selected language and optimization settings, the optimization could otherwise reduce our loop to constant runtime.

Listing 4 shows the calibration run in Java. Line 1 declares an integer variable named x that is incremented by 2 in the loop body at line 7. The variable x is incremented by 2 because an increment of 1 can be optimized away in many languages. This integer variable is printed to the console in line 14. The purpose of this variable and of printing it is that the compiler cannot easily omit the loop. Lines 3 and 11 to 13 show the applied runtime measurement of the loop. Lines 6 to 9 contain the actual loop.

1  int x = 0;
2
3  long startTime = System.currentTimeMillis();
4
5  int i = -2147483647;
6  while (i < 2147483647) {
7      x = x + 2;
8      i += 1;
9  }
10
11 long endTime = System.currentTimeMillis();
12 long difftime = endTime - startTime;
13 System.out.println(difftime);
14 System.out.println(x);

Listing 4: Calibration for running the loop without added integer plus instructions in Java
Notably, this loop is a direct translation from a for loop to a while loop. Our first approach contained a for loop. However, at least the Microsoft C# compiler in version 4.0.30319.1 optimizes for loops even though we disabled optimization. With this compiler, a while loop is not optimized when optimization is disabled.

Listing 5 shows the MIPIPS counter. Compared to the calibration, lines 2, 9, and 17 are added. These lines declare a variable y, add 3 to y in the while loop, and finally print the value of y. y is incremented by 3 because otherwise the compiler could reuse the value of x and would not need to calculate y.

For the derivation of the added instruction count of the benchmark, the benchmark reads the instruction count from a comment at the top of the class. Subsequently, the instruction count is divided by the median of the runtime in seconds. This value is the derived MIPIPS for the platform. Notably, the derivation needs to be rerun whenever new software is installed that will act as a permanent service on the machine, because the runtime of the benchmark can become larger due to the changed workload on the CPU.
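
As a numeric illustration: the loop in Listing 5 runs 4,294,967,294 iterations (from -2,147,483,647 up to, but excluding, 2,147,483,647) and adds one integer plus instruction per iteration, so the added instruction count is roughly 4.29 * 10^9. Assuming a hypothetical median runtime difference of 4 seconds between the two runs, the platform would be rated at about 1,074 MIPIPS:

// Illustrative MIPIPS computation; the median runtime is a made-up example
// value, while the instruction count corresponds to the loop in Listing 5
// (one added integer plus instruction per iteration).
public class MipipsCalculation {
    public static void main(String[] args) {
        long addedInstructionCount = 4_294_967_294L; // iterations of the while loop
        double medianRuntimeSeconds = 4.0;           // hypothetical median over 30 runs
        double mipips = addedInstructionCount / medianRuntimeSeconds / 1_000_000.0;
        System.out.println(mipips + " MIPIPS");      // about 1073.7 MIPIPS in this example
    }
}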
Benchmark Generation with Xpand

To support easy adaptation to new programming languages, we utilize Xpand to generate the benchmark for different target languages. Xpand requires a meta-model definition, an instance of the meta-model, and a language-specific generation template.

1  int x = 0;
2  int y = 0;
3
4  long startTime = System.currentTimeMillis();
5
6  int i = -2147483647;
7  while (i < 2147483647) {
8      x = x + 2;
9      y = y + 3;
10     i += 1;
11 }
12
13 long endTime = System.currentTimeMillis();
14 long difftime = endTime - startTime;
15 System.out.println(difftime);
16 System.out.println(x);
17 System.out.println(y);

Listing 5: MIPIPS counter in Java
The meta-model for representing a benchmark class, following the Ecore definition, is shown in Appendix B. It contains the basic elements of an imperative programming language. It enables the modeling of classes, methods, expressions, variable declarations, loops, class creations, and concrete method calls. Furthermore, it contains a class for an empty line, which supports the readability of the generated output. Two special classes are included in the meta-model: SystemOut, which represents the statement for printing strings to the console, and MeasureTime, which represents the statement for getting the current value of a time counter. These two classes are mapped by the generation template to individual statements for each target language and can be quite different, like System.out.println() for Java and puts for Ruby.

Listing 6 shows the MIPIPS counter in the language-independent XML representation, which is an instance of the class definition meta-model. A generated code example for Java was already presented in Listing 5 and described before. Hence, we only describe the special aspects of this example in its XML representation. The class definition contains an instructionCount attribute in line 5. This attribute is included in the concrete language representation as a comment at the beginning of the class and is only required by the master program. From this comment the master program obtains the information about the instruction count of the benchmark which

Details

Pages: 157
Type of Edition: First edition (Erstausgabe)
Year: 2014
ISBN (PDF): 9783956363528
ISBN (Softcover): 9783954893935
File size: 1.9 MB
Language: English
Publication date: 2018 (June)
Keywords: Cloud Deployment Software Migration Support