QoS analysis for component-based embedded software: Model and methodology


The Journal of Systems and Software 79 (2006) 859–870 www.elsevier.com/locate/jss

Hui Ma *, I.-Ling Yen, Jia Zhou, Kendra Cooper

Department of Computer Science, The University of Texas at Dallas, Mail Station 31, P.O. Box 830688, Richardson, TX 75083-0688, USA

Received 6 December 2004; received in revised form 27 September 2005; accepted 3 October 2005
Available online 9 November 2005

Abstract

Component-based development (CBD) techniques have been widely used to enhance productivity and reduce the cost of software systems development. However, applying CBD techniques to embedded software development faces additional challenges. For embedded systems, it is crucial to consider the quality of service (QoS) attributes, such as timeliness, memory limitations, output precision, and battery constraints. Frequently, multiple components implementing the same functionality with different QoS properties (measurements in terms of QoS attributes) can be used to compose a system. Also, software components may have parameters that can be configured to satisfy different QoS requirements. Composition analysis, which is used to determine the most suitable component selections and parameter settings to best satisfy the system QoS requirements, is very important in the embedded software development process. In this paper, we present a model and the methodologies to facilitate composition analysis. We define QoS requirements as constraints and objectives. Composition analysis is performed based on the QoS properties and requirements to find solutions (component selections and parameter settings) that can optimize the QoS objectives while satisfying the QoS constraints. We use a multi-objective concept to model the composition analysis problem and use an evolutionary algorithm to determine the Pareto-optimal solutions efficiently.

© 2005 Elsevier Inc. All rights reserved.

Keywords: Embedded software; Component composition; Pareto-optimal; Quality of service (QoS); Evolutionary algorithm

* Corresponding author. Tel.: +1 9728836701; fax: +1 9728832349. E-mail addresses: [email protected] (H. Ma), ilyen@utdallas.edu (I.-Ling Yen), [email protected] (J. Zhou), kcooper@utdallas.edu (K. Cooper).
0164-1212/$ - see front matter © 2005 Elsevier Inc. All rights reserved. doi:10.1016/j.jss.2005.10.001

1. Introduction

Recent advances in hardware technology have dramatically improved hardware productivity and made it economically feasible to extend the reach of automation to a wide variety of services, such as intelligent vehicles, patient monitoring systems, handheld devices, and sensor networks. However, the lack of commensurate gains in software productivity is a major hurdle in developing more sophisticated embedded applications. This is unfortunate since software is crucial to the successful realization of many application systems. Domain-specific knowledge is usually embodied in the software. Also, software is frequently expected to enhance the robustness of application systems by monitoring the environment and adapting the system to tolerate hardware failures, network congestion, and security attacks.

To enhance the productivity of developing complex applications, software technology is rapidly shifting away from low-level programming issues to automated code synthesis and the integration of systems from components. Component-based development (CBD) techniques can significantly reduce software development time and cost, which can benefit the software development process for embedded systems as well as other application domains (Buck et al., 1991; Borriello et al., 1995; Ommering et al., 2000; Wall et al., 2002). However, CBD approaches for embedded software systems face additional challenges due to the stringent QoS requirements for these systems.


For example, it is crucial to consider real-time, security, reliability, and resource and power constraints in embedded systems. Thus, the integration of embedded systems from components must consider the satisfaction of the functional requirements as well as the QoS requirements. Frequently, multiple components with different QoS tradeoffs can be used to achieve the same functionality. Also, components may be configurable, i.e., some of the program parameters of a component can be configured to achieve different QoS tradeoffs (Cooper et al., 2003). It can be computationally intensive to determine the most suitable set of components to use with the best parameter settings. For example, consider a small system consisting of 10 program units. Assume that there are two possible components that match the functional requirements of each program unit. Also, assume that each component has a single parameter with 10 potential settings to achieve various QoS tradeoffs. In an exhaustive search, there are $(2 \cdot 10)^{10}$ (about $10^{13}$) choices to be considered in order to find those that give the system a satisfactory QoS property. Thus, an efficient and effective decision making mechanism is needed for QoS analysis in a CBD approach.

Previous research work regarding software component selection for satisfying QoS specifications is limited. An NFR-Assistant tool has been presented (Tran and Chung, 1999) that assists with the non-functional (QoS) requirement analysis and exploration of design alternatives through a graphical interface. The approach can be used for automated design decision-making, but it is based on exhaustive search and only considers one non-functional attribute at a time. Conklin and Begeman (1988) and Lee (1991) have done similar work in component selection. Their system records manual design decisions for future reference.
The real-time requirement has been addressed (Steigerwald, 1993) and tools have been provided for the selection of components satisfying real-time constraints. The approach uses an exhaustive search and only considers real-time aspects. The tradeoff problem for a software agent pipeline system has been formulated (Yen and Chen, 1997) and the expression for optimized time–quality attributes has been derived. However, a continuous quality and time function is considered, which is not always applicable for the selection of components and their parameter settings. A component selection approach has been presented to resolve the multi-criteria optimization problem (Wallnau et al., 2001). The approach uses a weighted sum equation to give each component an overall utility value based on multiple criteria. However, component selection based on this approach does not consider the integration effect, which may produce further tradeoffs between the multiple criteria of the composed system.

A number of approaches in the literature consider QoS issues in component-based software development. Most of them focus on the prediction of system QoS properties. Hissam et al. (2001) presented a prototype prediction-enabled component technology (PECT) to integrate a software component technology with one or more

analysis technologies. The prediction of end-to-end latency of an assembly has been provided as an example. The theoretical and empirical validities of the prediction show that the composition analysis is reasonable. The prediction of other properties, such as resource consumption (Muskens and Chaudron, 2004) and reliability (Hamlet et al., 2001), has also been presented. These works form a solid foundation for composition analysis and can be embedded in our model.

In this paper, we present a model for composition analysis based on synchronous data flow (SDF) (Lee and Messerschmitt, 1987a). In our model, the system specification includes the modules that compose the system, the data flow among the modules, and the QoS requirements of the system. A module is a virtual unit defined by functional requirements. Each module can be instantiated by some configurable components that all satisfy the functional requirements of the module. Based on the system specification, the composition analysis determines which components can be used and how to configure the selected components to optimize the QoS objectives. We formulate the composition analysis as a multi-objective optimization problem and use an evolutionary algorithm to solve it. Pareto-optimal solutions are obtained to determine suitable component selections and parameter settings.

The remainder of this paper is organized as follows. In the next section, we introduce the system model that forms the basis of composition analysis. In Section 3, the composition analysis problem, formulated as a multi-objective problem with Pareto-optimal solutions, is presented. Section 4 discusses the evolutionary algorithm used for composition analysis. A simulation of QoS analysis is then presented in Section 5. Finally, Section 6 states the conclusion of the paper and presents future research directions.

2. Model for composition analysis

We are developing tools and techniques to assist embedded software development.
The architecture of a repository-based embedded software development platform is shown in Fig. 1. Each block in the diagram involves a major tool set and technique. An online repository for embedded software (ORES) (Yen et al., 2001, 2002; Gao et al., 2006) forms the foundation of the system. It provides effective component retrieval and sophisticated component descriptions, including functional and QoS properties of the components. Many components in the repository are configurable components. Also, we have developed a Component Parameterizer Tool Set (Cooper et al., 2003) which can be used to parameterize the components in ORES. The QoS properties in terms of various settings of the configurable parameters of a component are measured and stored in the repository for future analysis.

Fig. 1. Repository based embedded software development platform. [Blocks in the diagram: System Designer; Composition Interface (modules, data flow, QoS requirements); Component Identifier; Composition Analyzer; Code Generator; Component Parameterizer; ORES (Components).]

When developing an application, the designers interact with the Composition Interface to prepare the system specification, which includes the modules, the data flow among modules, and the QoS requirements of the system. Based on the system specification, various components satisfying the functional requirements are identified by the Component Identifier. The Composition Analyzer tool set performs QoS analysis based on the QoS requirement specification and the QoS properties of the individual components. The set of components to be used and the settings of their configurable parameters are determined from composition analysis so that the system requirements can be satisfied. Finally, the Code Generator generates the glue code and composes the system from the selected components. In this paper, we focus on the Composition Analyzer.

2.1. System specification model

We consider a synchronous dataflow (SDF) model for specifying and designing an embedded system from existing components (Lee and Messerschmitt, 1987a). Dataflow models handle regular computations that operate on streams, which are popular in signal processing system specifications. Each process in a dataflow model is constructed as a sequence of atomic actors. An actor has an interface, which includes communication ports and parameters that are used to configure the function of the actor. With the synchronous feature, the SDF model is predictable and can be scheduled statically. Thus, it is extremely useful as a formalism for embedded real-time software.
Our composition specification is based on the SDF model. We replace the notion of actors by a virtual functional unit, called a module. A module is specified by its functional requirement and the descriptions of each data input and output. It has to be instantiated by an actual component in the repository. Since there may exist multiple components that can satisfy the functional requirement of a module, the module can be instantiated by any of them based on the desired QoS properties or other considerations. Using a module instead of the actual component in the composition specification yields flexibility and reusability of the system specification and provides the potential for system reconfiguration.

Definition 1 (System Specification Model). The system specification model is based on the SDF model and defined as $G = (\Omega, E, \Gamma)$ where:

• $\Omega = \{\omega_l \mid 1 \le l \le L\}$ is the set of $L$ modules (actors). Each module $\omega_l$ in $\Omega$ is defined by a functional requirement with which the actual components in the repository can be selected to instantiate it (Yen et al., 2001).
• $E$ is the set of directed edges that denote the data flows among the modules and the flow directions. The number of data blocks produced or consumed by each I/O port (input/output port of the data flow) is also defined on $E$ (Lee and Messerschmitt, 1987b).
• $\Gamma$ is the set of overall QoS requirements of the system, which will be elaborated later in Definition 2.

A system designer can prepare the system specification $G$ that defines how to compose the components in the SDF model to achieve a goal function. The components can be selected from the repository and associated with the modules in $\Omega$. Given the specification of a system $G$, the set of components $C_l = \{c_{l,m} \mid \text{for all } m\}$ is identified, where $c_{l,m}$ is the $m$th component that can satisfy the functional requirement of module $\omega_l$ in $\Omega$. We define the set of all components that instantiate the system as $C = \{C_l \mid 1 \le l \le L\}$.

Note that dependencies between some components may affect component selection and configurable parameter settings. For example, an encryption function may have to be used together with its corresponding decryption function. ORES allows the specification of component dependencies (Yen et al., 2001), and these dependency constraints will be observed during component selection.

Example 1. Fig. 2 shows a simple system specification example $G$. In this example, $G$ contains 3 modules $\omega_1$, $\omega_2$ and $\omega_3$. The edges connecting I/O ports and representing the data flow between modules are as shown in the figure. The number of data blocks produced and consumed at each port is specified with the port. The components that can instantiate the modules are identified and associated with the modules through the composition interface. For example, the components that can instantiate module $\omega_1$ are $c_{11}$, $c_{12}$, and $c_{13}$. Each component has a set of configurable parameters, which will be described later (in Section 2.2).
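One way to picture the system specification model is as a small data structure. The following is a minimal sketch under stated assumptions, not the authors' tooling: the class names (`Component`, `Module`, `Edge`, `SystemSpec`), the requirement strings, the candidates for the second and third modules, and the assignment of port rates to edges are all hypothetical. Only the three modules and the candidates c11, c12, c13 for the first module come from Example 1.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """A repository component that can instantiate one module."""
    name: str
    parameters: tuple = ()  # names of its configurable parameters


@dataclass
class Module:
    """A virtual functional unit defined by a functional requirement."""
    name: str
    requirement: str
    candidates: list = field(default_factory=list)  # components able to instantiate it


@dataclass(frozen=True)
class Edge:
    """A directed data flow annotated with SDF port rates."""
    src: str
    dst: str
    produced: int  # data blocks produced at the source port
    consumed: int  # data blocks consumed at the destination port


@dataclass
class SystemSpec:
    """Modules, data-flow edges, and QoS requirements of one system."""
    modules: dict
    edges: list
    requirements: dict = field(default_factory=dict)

    def candidate_counts(self):
        """Number of candidate components per module."""
        return {name: len(m.candidates) for name, m in self.modules.items()}


def build_example():
    """Three-module chain loosely modeled on Example 1 (rates are assumed)."""
    w1 = Module("w1", "hypothetical requirement 1",
                [Component("c11", ("x1",)),
                 Component("c12", ("x1",)),
                 Component("c13", ("x1",))])
    w2 = Module("w2", "hypothetical requirement 2", [Component("c21", ("x1",))])
    w3 = Module("w3", "hypothetical requirement 3", [Component("c31", ("x1",))])
    edges = [Edge("w1", "w2", produced=2, consumed=4),
             Edge("w2", "w3", produced=1, consumed=2)]
    requirements = {"constraints": [], "objectives": []}  # filled in by the designer
    return SystemSpec({m.name: m for m in (w1, w2, w3)}, edges, requirements)
```

Keeping modules separate from components mirrors the point made above: the specification can be reused unchanged while different candidates are tried for each module.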

Fig. 2. A simple example of specification. [Labels recovered from the figure: Parameter Set (three times); c11, c12, c13; System Specification; ω1, 2, 4, ω2, 1, 2, ω3.]

2.2. QoS attributes and properties

We consider each measurable system property as a QoS attribute. For example, in an audio processing system, there may exist attributes such as execution time, memory requirement, and voice quality. Let $A = (a_1, a_2, \ldots, a_N)$ denote the vector of $N$ QoS attributes for a system specified by $G$, where each $a_i$ in $A$ is a measurable QoS attribute that has quantifiable values.

A component can be described by its functional specifications and QoS property. The QoS property of a component is defined by the measurements of its QoS attributes. Let $A_c$ denote the set of QoS attributes of a component $c$. Assume that $c$ is a component selected to instantiate a module in $G$. Generally, $A_c$ is a subset of $A$. Some QoS attributes of the system may have nothing to do with some of the individual components that compose the system. Also, a QoS attribute that is not relevant to $G$ need not be considered. In this paper, we consider $A_c = A$ for simplicity. The QoS property of component $c$ can be represented by a vector $V_c = (v_c^1, v_c^2, \ldots, v_c^N)$, where $v_c^i$ is the QoS measurement of $c$ in terms of attribute $a_i$. If $c$ does not have a certain measurable attribute $a_i$, a NULL value is assigned to $v_c^i$. The QoS property $V_\omega$ of a module $\omega$ is defined to be the QoS property of the component that instantiates it. In a similar way, the QoS properties $V$ of the system $G$ are defined to be the measurements of the QoS attributes in $A$. The QoS properties of a system can be derived from the QoS properties of the components that compose the system.

The QoS measurements of a system $G$ are determined by its input domain, configurable parameters, and execution environment. For example, the input size of an image processing program can greatly impact the execution time and memory requirement. The execution environment, such as processor speed, memory size, and network capacity, can also impact the QoS measurements. Here, we focus on the impact of configurable parameters and assume that the execution environment and the input domain are given.

A configurable parameter is a parameter in a component that, when adjusted, can impact the measurements of one or more of the QoS attributes. Different QoS properties of a component and the composed system can be obtained by tuning the configurable parameters. The configurable parameters of each component form a $K$-dimensional parameter set $X = (x_1, x_2, \ldots, x_K)$ (where $K$ is the total number of configurable parameters). The QoS property $v_c^i$ for attribute $a_i$ of a component $c$ can be measured and plotted against the $K$-dimensional parameter space. Let $f_c^i(X_c)$ denote a property function of component $c$ in terms of QoS attribute $a_i$, where $X_c$ is the parameter set of the component $c$. A property function $f_c^i(X_c)$ is the relation that maps the values of the parameter set $X_c$ to a unique value of $v_c^i$ for attribute $a_i$. For a system $G$, we define the property function set as $F = \{F_c \mid c \in C\}$, which is the set of property functions of all components that comprise $G$ (with $F_c$ denoting the property functions of component $c$).

2.3. QoS requirements, objectives, and constraints

As described in Definition 1, a system specification includes the QoS requirements specification $\Gamma$. For embedded systems, it is crucial to consider the QoS requirements in terms of the QoS attributes. For example, the memory constraint can be defined on the memory attribute of the system, and the requirement to maximize the computation precision is defined on the precision attribute of the system. We divide $\Gamma$ into two types of requirements, constraints and objectives, according to whether a requirement is to be satisfied or optimized, respectively. In our model, each objective or constraint is based on one or more QoS attributes. They are defined in the following.

Definition 2 (QoS requirements). For a system specification $G$, the set of QoS requirements is defined as $\Gamma = (O, R)$, where $O = (o_1, o_2, \ldots, o_J)$ is the set of objectives, and $R = (r_1, r_2, \ldots, r_I)$ is the set of constraints. The objective $o_j$, $1 \le j \le J$, is an optimization function denoted as $D_j\, n_j(V)$ where $D_j$ is an optimization operator, such as maximize or minimize, and $n_j(V)$ is an objective function over the QoS property $V$. Each QoS objective is to optimize the system QoS properties over one or more QoS attributes. A constraint $r_i$ is an inequality defined over $V$ specifying the bound for the system QoS property. It can be expressed as follows:

$$f_i(V)\; D_i\; u_i \quad (1 \le i \le I), \tag{1}$$

where $f_i$ is the constraint function of $G$ in terms of the QoS properties $V$, $D_i$ is the comparison operator, and $u_i$ is the bound for $f_i(V)$. The goal of the system is to satisfy all the QoS constraints while obtaining optimal solutions in terms of the QoS objectives. All objective functions $n_j$ and constraint functions $f_i$ of the system specification $G$ are in terms of the QoS attributes $A$ of $G$ and can be computed from the QoS properties $V$ of $G$.

Example 2. Consider an IP phone system. Assume that the execution time bound and the memory limitation are the constraints that must be satisfied. This system may include components such as a voice codec and an echo canceller. The QoS attributes for a voice codec may be execution time, required memory, and perceptual speech quality measurement (PSQM). Here PSQM is defined in ITU-T Recommendation P.861 to measure the clarity of the voice. The QoS attributes for the echo canceller may include the execution time, required memory, and Terminal Coupling Loss (TCL), an echo canceling quality measure. So the two components have different QoS attribute sets although they are in the same system. We can define the QoS attribute set for the system as $A = (a_1, a_2, a_3, a_4)$ = (execution time, required memory, PSQM, TCL). The QoS attribute sets for the voice codec and the echo canceller are then the same, but the TCL value will be set to NULL for the voice codec and the PSQM value will be set to NULL for the echo canceller. Also, the


voice quality and echo canceling quality can be set as the QoS objectives of the whole system.

2.4. Aggregation

The system QoS properties are the cumulative effect of the QoS properties of the components that instantiate the modules of the system. Here we express the QoS properties of a system $G$ in terms of the QoS properties of the glue code $c_g$ and the selected components $\{c_l \mid 1 \le l \le L\}$, where $c_l$ instantiates the module $\omega_l$ of the system.

Definition 3 (Aggregate Function). The aggregate function $\sigma^i$ computes the composition effect of the properties of multiple modules of the system specification $G$. It is presented as $v^i = \sigma^i(v_g^i, v_1^i, v_2^i, \ldots, v_L^i)$, $1 \le i \le N$, where $L$ is the number of modules, $N$ is the number of attributes, $v^i$ is the QoS property of $G$ in terms of attribute $a_i$, $v_g^i$ is the property of the glue code in terms of attribute $a_i$, and $v_l^i$ is the property of module $\omega_l$ in terms of attribute $a_i$. For a system specification $G$, we define the aggregate function set as $\Sigma = \{\sigma^i \mid 1 \le i \le N\}$.

The aggregate function computes the QoS properties of a system from the QoS properties of the modules. Notice that a module is a virtual unit. Its QoS properties are determined by the component that instantiates it. In most situations, we can consider the glue code as a small component, so the aggregate function can be presented as $v^i = \sigma^i(v_1^i, v_2^i, \ldots, v_L^i)$.

Example 3. Consider a system specification $G$ with three measurable constraints $(r_1, r_2, r_3)$. Assume they are in terms of the QoS attributes $a_1$, the execution time, $a_2$, the persistent memory required, and $a_3$, the scratch (temporary) memory required, respectively. Assume that the system specification only contains sequential statements, with no loop or branch constructs, and that the glue code is considered as small components. The system QoS attribute set is $A = (a_1, a_2, a_3)$. Then, we have

$$f_1 = v^1 = \sigma^1(v_1^1, v_2^1, \ldots, v_L^1) = \sum_{l=1 \ldots L} v_l^1, \quad f_2 = v^2 = \sigma^2(v_1^2, v_2^2, \ldots, v_L^2) = \sum_{l=1 \ldots L} v_l^2 \quad \text{and} \quad f_3 = v^3 = \sigma^3(v_1^3, v_2^3, \ldots, v_L^3) = \max_{l=1 \ldots L} v_l^3. \tag{2}$$

So according to Formula (1) (given in the previous section), the QoS constraints $R$ of $G$ can be presented by the QoS properties of the modules using the aggregate functions as follows:

$$\sum_{l=1 \ldots L} v_l^1 \le u_1, \quad \sum_{l=1 \ldots L} v_l^2 \le u_2 \quad \text{and} \quad \max_{l=1 \ldots L} v_l^3 \le u_3. \tag{3}$$

The aggregate function is highly application dependent. For some common models, standard aggregate functions can be defined. For example, in the SDF model, the aggregate function involves the computation of a schedule. Based on the selected components, a schedule can be determined by the algorithm discussed in (Buck, 1993). With a fixed schedule, the aggregate system QoS properties, such as end-to-end latency and memory requirement, can be computed.

3. Composition analysis problem specification

In general, an embedded system has multiple QoS objectives and multiple constraints. As shown in Example 2, the IP phone system has requirements to optimize voice and echo canceling quality, and constraints on the execution time and the required memory. Thus, during the composition analysis, the system configuration problem is mapped to a constrained multi-objective optimization.

Given the specification of a system $G = (\Omega, E, \Gamma)$, the set of components $C$ that can satisfy the functional requirements of each module in $\Omega$ is identified. We assume that each component $c \in C$ is parameterized and has a configurable parameter set $X_c$. Special mechanisms to make a component parameterizable have been discussed (Cooper et al., 2003). The configurable parameters of the entire system can be defined as $X = \{X_c \mid c \in C\}$. Let $F$ be the set of property functions and $\Sigma$ the set of aggregate functions of $G$. Notice that $C$, $X$, $F$, $\Sigma$ are known composition factors of $G$ for the composition analysis. Let $T$ denote all these composition factors of $G$, where $T = (C, X, F, \Sigma)$. The composition analyzer can analyze the composition based on the QoS constraints $R$ and objectives $O$ of $G$ with given $T$.

The goal of composition analysis is to determine the best selection of components to instantiate the modules and to find the best configurable parameter settings for the components to optimize the QoS objectives of the system $G$. The solution has to be feasible, i.e., the solution has to satisfy the constraints in $R$. Based on the feasible solutions, the goal is to get the optimal benefits in terms of the QoS objectives in $O$. Here we discuss the multi-objective problem of the composition analysis.

3.1. Multi-objective optimization problem for composition analysis

Here we formally define the multi-objective optimization problem for composition analysis for a given system specification $G$.

Definition 4 (Multi-Objective Optimization for Composition Analysis).

optimize $O = (o_1, o_2, \ldots, o_J)$ $(J \ge 2)$
subject to
QoS property set: $V = (v^1, v^2, \ldots, v^N)$
Constraints $R$: $f_i(V)\; D_i\; u_i$ $(1 \le i \le I)$
Objectives $O$: $o_j = D_j\, n_j(V)$ $(1 \le j \le J)$
Each QoS property: $v^i = \sigma^i(v_1^i, v_2^i, \ldots, v_L^i)$
Component selection: $1 \le \mathrm{sel}(\omega_l) \le M_l$, where $\mathrm{sel}(\omega_l)$ is the index of the selected component
Module QoS property: $v_l^i = \sum_{m=1}^{M_l} \left( \neg(\mathrm{sel}(\omega_l) - m) \cdot f_{l,m}^i(X_{l,m}) \right)$, where $\neg(x) = 1$ if $x = 0$, and $\neg(x) = 0$ otherwise

Our goal is to optimize all the objectives $O$ of $G$. Note that $O$ is the objective set of the system with $J$ objectives and $G$ has $I$ constraints. As defined in Section 2.3, both the objectives and the constraints of the system are in terms of the QoS attributes $A$. As described in Section 2.2, the QoS property of the components $v_{l,m}^i$ can be derived from the property function $f_{l,m}^i$ and the configurable parameters $X_{l,m}$. Then the module QoS property $v_l^i$ can be determined by the component selection $\mathrm{sel}(\omega_l)$ and the QoS property of the selected component. As defined in Section 2.4, the QoS property of the system $v^i$ can be computed from the QoS properties of the modules with the aggregate function $\sigma^i$.

3.2. Pareto-optimal solutions

Optimizing a system with multiple objectives may result in conflicts (Steuer, 1986). The tradeoff between the multiple objectives must be considered and a decision can be made based on it. Frequently the decision can be left to the system designer, with the composition analysis helping by providing a solution set. Pareto-optimal solutions (Steuer, 1986; Chankong and Haimes, 1983) are a commonly used model for multi-objective optimization problems. In our model, the goal of composition analysis is to find the feasible Pareto-optimal solutions.

For a problem with $J$ objectives $(o_1, o_2, \ldots, o_J)$, a solution $s = (o_s^1, o_s^2, \ldots, o_s^J)$ dominates another solution $s' = (o_{s'}^1, o_{s'}^2, \ldots, o_{s'}^J)$ if both of the following conditions are satisfied:

• $s$ is no worse than $s'$ in any attribute,
• $s$ is strictly better than $s'$ in at least one attribute.

It can be denoted as $s \succ s'$ or $s' \prec s$. A solution $s$ is defined as covering another solution $s'$ if $s$ is no worse than $s'$ in any attribute. It can be denoted as $s \succeq s'$ or $s' \preceq s$. If a solution $s$ cannot be dominated by another solution $s'$, it can be said that $s$ is non-dominated by $s'$. If a solution $s$ is non-dominated by all other solutions in a solution set $B$, it is called a Pareto-optimal solution in $B$. The set of all the non-dominated solutions of $B$ is called the Pareto-set of $B$. Recall that the first goal of the composition analysis mentioned in Section 3.1 is to satisfy the QoS constraints of the system specification and then to optimize the QoS attributes, so the goal of the composition analysis is to find the Pareto-set within the feasible solutions.
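To make "the Pareto-set within the feasible solutions" concrete, here is a minimal sketch that enumerates a toy system exhaustively. It is an illustration under stated assumptions, not the paper's analyzer. All numbers are invented, and each component is assumed to be measured at one fixed parameter setting, giving the triple (execution time, persistent memory, scratch memory). The aggregates follow the sum, sum, max pattern used for sequential code, every constraint is an upper bound, and both objectives are minimized.

```python
from itertools import product

# Hypothetical measurements per module and candidate component:
# (execution time, persistent memory, scratch memory).
MEASURED = {
    "w1": {"c11": (4, 10, 3), "c12": (6, 6, 5)},
    "w2": {"c21": (3, 12, 4), "c22": (5, 7, 6)},
}
AGGREGATES = (sum, sum, max)  # one aggregate function per attribute
BOUNDS = (10, 25, 8)          # assumed upper bounds, one per attribute
OBJECTIVE_ATTRS = (0, 1)      # minimize execution time and persistent memory


def aggregate(selection):
    """System QoS properties for one component choice per module."""
    per_module = [MEASURED[m][c] for m, c in zip(MEASURED, selection)]
    return tuple(agg(v[i] for v in per_module)
                 for i, agg in enumerate(AGGREGATES))


def feasible(props):
    """True when every aggregated property respects its bound."""
    return all(v <= u for v, u in zip(props, BOUNDS))


def dominates(s, t):
    """s is no worse than t everywhere and strictly better somewhere."""
    return (all(a <= b for a, b in zip(s, t))
            and any(a < b for a, b in zip(s, t)))


def pareto_set(points):
    """Keep the entries that no other entry dominates."""
    return {key: p for key, p in points.items()
            if not any(dominates(q, p) for q in points.values())}


def analyze():
    """Enumerate all selections, drop infeasible ones, return the Pareto set."""
    objectives = {}
    for selection in product(*(MEASURED[m] for m in MEASURED)):
        props = aggregate(selection)
        if feasible(props):
            objectives[selection] = tuple(props[i] for i in OBJECTIVE_ATTRS)
    return pareto_set(objectives)
```

With these made-up numbers, the selection (c12, c22) violates the time bound, (c12, c21) is feasible but dominated by (c11, c22), and the two remaining selections trade time against memory, so neither dominates the other. Exhaustive enumeration is only workable for toy sizes like this one.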

Example 4. Consider a composition specification $G$ defined by two modules $(\omega_1, \omega_2)$ with two measurable objectives $(o_1, o_2)$. Here, $o_1$ is the execution time and $o_2$ the persistent memory required. Assume that $\omega_1$ and $\omega_2$ can be instantiated by two components each, namely, $(c_{11}, c_{12})$ and $(c_{21}, c_{22})$, respectively, and that each component has one solution, $(o_{11}^1, o_{11}^2)$, $(o_{12}^1, o_{12}^2)$ and $(o_{21}^1, o_{21}^2)$, $(o_{22}^1, o_{22}^2)$. Also, assume that the code for $c_{11}$, $c_{12}$, $c_{21}$ and $c_{22}$ only contains sequential statements, with no loop or branch constructs. Then we have four solutions for the specification, which are $(o_{11}^1 + o_{21}^1, o_{11}^2 + o_{21}^2)$, $(o_{11}^1 + o_{22}^1, o_{11}^2 + o_{22}^2)$, $(o_{12}^1 + o_{21}^1, o_{12}^2 + o_{21}^2)$ and $(o_{12}^1 + o_{22}^1, o_{12}^2 + o_{22}^2)$. Pareto-optimal solutions are then searched for among these four solutions. Once the final solution is found, the selected components are also determined. For example, if $(o_{12}^1 + o_{21}^1, o_{12}^2 + o_{21}^2)$ is the final decision, $c_{12}$ and $c_{21}$ are selected to instantiate the system.

4. Evolutionary algorithm for composition analysis

In this section, we discuss how to apply an evolutionary algorithm to the composition analysis problem discussed in Sections 2 and 3. The goal of the composition analysis is to find Pareto-optimal solutions of the system based on multiple objectives $O$ while satisfying the QoS constraints $R$ (as defined in Definition 2). Here we map this problem to the evolutionary algorithm paradigm.

4.1. Evolutionary algorithm overview

First we introduce the basic evolutionary algorithm and its operations. The general steps of an evolutionary algorithm are given as follows:

Step 1: Randomly generate an initial population $P$.
Step 2: Create a new population $P'$.
Step 3: Randomly select two individuals in $P$ as parents and generate two children by recombination. With the recombination probability, put the children in $P'$; otherwise put the parents in $P'$. Repeat this step until the size of $P'$ equals the size of $P$.
Step 4: For each individual in $P'$, apply the mutation operation with the mutation probability.
Step 5: $P = P'$.
Step 6: Repeat Steps 2–5 until the termination condition is satisfied.

The population is a set of individuals, each of which represents a system solution. The individual format depends on the specific situation where the evolutionary algorithm is used. We present our individual representation in Section 4.2. New individuals are generated by mutation and recombination operations that are described in Sections 4.3 and 4.4, respectively. In this paper, we run the evolutionary algorithm for a fixed number of rounds and, hence, no separate termination condition is needed.
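The six steps above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the authors' implementation. Individuals are assumed to be lists of floats, each gene with a fixed range. One-point crossover and uniform re-sampling of a single gene stand in for the recombination and mutation operators, which are specified elsewhere in the paper. The function name `evolve` and all default values are hypothetical.

```python
import random


def evolve(ranges, pop_size=20, rounds=50,
           p_recombine=0.8, p_mutate=0.01, seed=0):
    """Run the basic steps for a fixed number of rounds and return P."""
    rng = random.Random(seed)

    def random_individual():
        # One gene per range, drawn uniformly inside its bounds.
        return [rng.uniform(lo, hi) for lo, hi in ranges]

    # Step 1: random initial population P.
    population = [random_individual() for _ in range(pop_size)]

    for _ in range(rounds):  # Step 6: fixed number of rounds
        new_population = []  # Step 2: new population P'
        # Step 3: fill P' with children (or copies of the parents).
        while len(new_population) < pop_size:
            a, b = rng.sample(population, 2)
            if len(ranges) > 1 and rng.random() < p_recombine:
                cut = rng.randrange(1, len(ranges))  # assumed one-point crossover
                pair = [a[:cut] + b[cut:], b[:cut] + a[cut:]]
            else:
                pair = [a[:], b[:]]
            new_population.extend(pair)
        new_population = new_population[:pop_size]

        # Step 4: mutate each individual with the mutation probability.
        for individual in new_population:
            if rng.random() < p_mutate:
                g = rng.randrange(len(individual))
                lo, hi = ranges[g]
                individual[g] = rng.uniform(lo, hi)  # assumed uniform re-sampling

        population = new_population  # Step 5: P = P'
    return population
```

As quoted, the basic steps contain no fitness-based selection, so this loop only shows the control flow. Guiding the search toward feasible Pareto-optimal solutions needs an evaluation and selection mechanism on top of it.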


4.2. Individual representation

Each individual in the population represents a solution of the embedded system, which includes the component selections and the configurable parameter settings. Fig. 3 illustrates how an individual solution is represented in a hierarchical view. The vector consists of rep($\omega_l$), for all $l$, which represents the characteristics of the set of modules that comprise the system (as shown in Fig. 3a). Each rep($\omega_l$) is further represented by sel($\omega_l$) and rep($c_{l,m}$), $1 \le m \le M_l$, where rep($c_{l,m}$) is the characteristics of the component $c_{l,m}$ and sel($\omega_l$) is the index of the component selected to instantiate the module (as shown in Fig. 3b). The value of sel($\omega_l$) lies in the range $[1, M_l]$, where $M_l$ is the number of candidate components of $\omega_l$. If there is only one component to instantiate $\omega_l$, then sel($\omega_l$) is not needed and rep($\omega_l$) is the same as rep($c_{l,1}$). If the module has more than one candidate component, then rep($c_{l,m}$), for all $m$, are listed one by one after sel($\omega_l$).

A component representation, rep($c_{l,m}$), is further characterized by its configurable parameters $x_i(c_{l,m})$ (as shown in Fig. 3c). Configurable parameters may have different data types, such as float, integer, and discrete values. To facilitate mutation value selection, we require each configurable parameter to have a fixed range. For configurable parameters without lower and upper bounds, a relatively large range is assigned. Because the hardware inherently limits the data values, these extended bounds have no significant impact on the evolutionary process. Note that all the component selection parameters sel($\omega_l$) and configurable parameters $x_i(c_{l,m})$ are basic elements of an individual representation and are called the genes of the individual.

In Example 4, we present a system composed of two modules ($\omega_1$, $\omega_2$), which can be instantiated by ($c_{1,1}$, $c_{1,2}$) and ($c_{2,1}$, $c_{2,2}$), respectively. The individual of this system is illustrated in Fig. 4. Here we assume each component $c_{l,m}$ has only one parameter $x_1(c_{l,m})$. So for module $\omega_1$, the individual representation has the component selection parameter sel($\omega_1$) and two component configurable parameters, $x_1(c_{1,1})$ and $x_1(c_{1,2})$, for its two candidate components $c_{1,1}$ and $c_{1,2}$, respectively. The parameters for module $\omega_2$ are similar to those of $\omega_1$ and are listed after them in the individual representation.

Fig. 5 shows an example representation of a component (a partial representation of an individual). For component $c_{l,m}$, we have two float, one integer, and two discrete-value parameters. The bounds for the float and integer parameters are listed in Fig. 5b. For example, $2.0 \le x_1(c_{l,m}) \le 8.7$. The data type of a discrete-value parameter is not limited to real numbers; it can be any type of data that can be enumerated. The parameter stored in the individual is only the index of the value in the data list. For example, $x_4(c_{l,m}) = 27.45$ and $x_5(c_{l,m}) = 3$, although the values given in the individual are 2 and 1, respectively.

4.3. Mutation

In a mutation process, a new individual is generated from a randomly picked individual by modifying it slightly. The probability of mutation of one individual is controlled

Fig. 5. Component representation example. (a) Notation and data of the representation of component $c_{l,m}$, (b) bounds for float and integer parameters and (c) data lists for discrete-value parameters.

Fig. 3. Individual representation. (a) Whole individual representation of the system with $L$ modules $\omega_l$ ($1 \le l \le L$): rep($\omega_1$), rep($\omega_2$), ..., rep($\omega_L$); (b) the representation of module rep($\omega_l$) ($1 \le l \le L$): [sel($\omega_l$)], rep($c_{l,1}$), [rep($c_{l,2}$)], ..., [rep($c_{l,M_l}$)]; and (c) the representation of component rep($c_{l,m}$) ($1 \le l \le L$, $1 \le m \le M_l$): $x_1(c_{l,m})$, $x_2(c_{l,m})$, ..., $x_K(c_{l,m})$.

Fig. 4. Individual representation for the system in Example 4: sel($\omega_1$), $x_1(c_{1,1})$, $x_1(c_{1,2})$, sel($\omega_2$), $x_1(c_{2,1})$, $x_1(c_{2,2})$.

by a mutation rate, which is normally set to $1/K$, where $K$ is the population size. Here we set it to 0.01 for our experimental study. For the individual to be mutated, the mutation point is chosen randomly and the gene at the mutation point is modified. We use a random value in a predetermined range to replace the existing gene. An individual has sel($\omega_l$) and configurable parameters as its genes. sel($\omega_l$) has clear upper and lower bounds. As described in Section 4.2.1, each configurable parameter has a fixed range. Thus, we define the mutation function $mu(x)$ as follows:

$$x' = mu(x) = U(x_l, x_u), \qquad (4)$$

where $x$ is a gene in the individual ($x$ can be sel($\omega_l$) or a configurable parameter), $x'$ is the new gene value, $x_l$ and $x_u$ are the lower and upper bounds of $x$, respectively, and $U(x_l, x_u)$ is a uniform random value between $x_l$ and $x_u$. Since the genes include both configurable parameters and component selection parameters, both can be operated on by the mutation function.

4.4. Recombination

While mutation generates a new individual from one parent, the recombination process exchanges genes between two or more parents to produce new individuals. We use one-point recombination due to its effectiveness and simplicity. Similar to the mutation point, the crossover point for the recombination of two individuals is generated randomly. The one-point recombination process exchanges the genes of two parents on and after the crossover point to produce two offspring. Fig. 6 shows an example of the one-point recombination operating on two individuals ($s_1$ and $s_2$). We only show the exchange within the rep($\omega_l$) in which the crossover point is located. The crossover point lies within the configurable parameters of the second component of module $\omega_l$. Thus, the genes of the two individuals from $x_2(c_{l,2})$ onward are exchanged to obtain two new offspring. Here the parameters of component $c_{l,2}$ are split into two parts and exchanged. The parameters of the other components listed after $c_{l,2}$ are exchanged entirely. The probability of recombination of two individuals is generally bounded. We set it to 0.8 in our model.
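The gene layout, the mutation function of Eq. (4) and one-point recombination can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the `Gene` helper, the function names and the [0, 1] parameter bounds are assumptions made for the example, and the layout follows the two-module system of Fig. 4.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Fixed range and type of one gene (hypothetical helper, not from the paper)."""
    name: str
    lower: float
    upper: float
    integer: bool = False  # True for sel(w_l) and other integer-valued genes


def layout_example4():
    """Gene layout of Fig. 4: two modules, two candidate components each,
    one configurable parameter per component. The [0, 1] bounds are made up."""
    genes = []
    for l in (1, 2):
        genes.append(Gene(f"sel(w{l})", 1, 2, integer=True))
        for m in (1, 2):
            genes.append(Gene(f"x1(c{l},{m})", 0.0, 1.0))
    return genes


def draw(gene, rng):
    """U(x_l, x_u): a uniform random value within the gene's fixed range."""
    if gene.integer:
        return rng.randint(int(gene.lower), int(gene.upper))
    return rng.uniform(gene.lower, gene.upper)


def mutate(individual, layout, rng):
    """Eq. (4): replace the gene at a randomly chosen mutation point."""
    child = list(individual)
    point = rng.randrange(len(child))
    child[point] = draw(layout[point], rng)
    return child


def one_point_recombination(parent1, parent2, rng):
    """Exchange all genes on and after a randomly chosen crossover point."""
    point = rng.randrange(1, len(parent1))
    return (parent1[:point] + parent2[point:],
            parent2[:point] + parent1[point:])


rng = random.Random(0)
layout = layout_example4()
s1 = [draw(g, rng) for g in layout]
s2 = [draw(g, rng) for g in layout]
mutant = mutate(s1, layout, rng)
child1, child2 = one_point_recombination(s1, s2, rng)
print([g.name for g in layout])
```

In a full evolutionary loop, `mutate` would be applied with the mutation rate (0.01 here) and `one_point_recombination` with the recombination probability (0.8 here); the sketch leaves that scheduling out.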

4.5. Applying evolutionary algorithm for composition analysis problem

Based on the mapping of the composition analysis problem to the general evolutionary algorithm paradigm described in Sections 4.1–4.3, we need to choose a proper algorithm for solving the multi-objective optimization problem. Several recent multi-objective evolutionary algorithms, such as the improved version of the Non-dominated Sorting Genetic Algorithm (NSGA-II) (Deb et al., 2002), the Pareto Envelope-based Selection Algorithm (PESA) (Corne et al., 2000), and the improved version of the Strength Pareto Evolutionary Algorithm (SPEA2) (Zitzler et al., 2001), show significant performance improvements compared with previous approaches (Khare et al., 2003). Since NSGA-II has better performance in time complexity, diversity preservation, and constraint satisfaction (Deb et al., 2002; Khare et al., 2003), we choose it to solve the composition analysis problem.

The system requirements contain constraints (R) as well as objectives (O). Thus, we also need to consider R in the evolutionary process. We use the constraint handling approach discussed in (Deb et al., 2002), which changes the definition of "dominate" to "constrained-dominate". A solution $s$ is said to constrained-dominate a solution $s'$ if any of the following conditions is true: (a) solution $s$ is feasible and solution $s'$ is not, (b) solutions $s$ and $s'$ are both infeasible, but solution $s$ has a smaller overall constraint violation, or (c) solutions $s$ and $s'$ are both feasible and solution $s$ dominates solution $s'$. We replace the operation "dominate" by "constrained-dominate" in the algorithm. The overall constraint violation of a solution in condition (b) is computed by adding, for each violated constraint, the normalized distance between the solution value and the constraint. With this constraint handling approach, the resulting solutions may converge to satisfy the QoS constraints (R).
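The three conditions of constrained-domination can be sketched as below, assuming minimized objectives and upper-bound constraints. The function names are hypothetical, and normalizing each violation by the magnitude of its (nonzero) bound is an assumption, since the text only says the distances are normalized.

```python
def dominates(a, b):
    """Pareto dominance for minimization: a is no worse in every objective
    and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def overall_violation(values, bounds):
    """Sum of normalized excesses over the violated upper-bound constraints.
    Dividing by |bound| is an assumed normalization; bounds must be nonzero."""
    return sum((v - b) / abs(b) for v, b in zip(values, bounds) if v > b)


def constrained_dominates(s, t, bounds):
    """s and t are (objective_values, constraint_values) pairs."""
    vs = overall_violation(s[1], bounds)
    vt = overall_violation(t[1], bounds)
    if vs == 0 and vt > 0:       # (a) s is feasible, t is not
        return True
    if vs > 0 and vt > 0:        # (b) both infeasible: smaller violation wins
        return vs < vt
    if vs == 0 and vt == 0:      # (c) both feasible: ordinary dominance
        return dominates(s[0], t[0])
    return False                 # s infeasible, t feasible


BOUNDS = (14.0, 9.0)                          # illustrative upper bounds
feasible = ((10.0, 5.0), (12.0, 8.0))
slightly_off = ((1.0, 1.0), (15.0, 8.0))      # violates the first bound
far_off = ((1.0, 1.0), (16.0, 10.0))          # violates both bounds
print(constrained_dominates(feasible, slightly_off, BOUNDS))
print(constrained_dominates(slightly_off, far_off, BOUNDS))
```

Note that a feasible solution wins against an infeasible one even when its objective values are worse, which is what drives the population toward the feasible region.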
In some cases, however, the evolutionary algorithm cannot obtain feasible solutions. The algorithm then presents the solutions nearest the constraint bounds and offers the user options such as rerunning the algorithm with more generations, relaxing the QoS constraints, or inserting more candidate components. When the solutions satisfy the QoS constraints, the result of the composition analysis is a small set of feasible

Fig. 6. One-point recombination. (a) Before recombination and (b) after recombination at the crossover point.

Pareto-optimal solutions. Compared with the original exponential exploration space, the result set is much smaller. This set can be further reduced to a fixed number of solutions using the crowding-distance rating. The solutions with the highest or lowest value of each objective are assigned an infinite crowding-distance value. The crowding-distance value of every other, intermediate solution is the sum, over all objectives, of the absolute normalized difference between the objective values of its two adjacent solutions. In each round, the algorithm removes the individual with the smallest crowding-distance. This step is repeated until the preset size of the result set is reached.

5. QoS analysis simulation

In this section, we go through the whole composition analysis process using an example system. Our goal is to illustrate the composition analysis process as well as to evaluate the effectiveness of the genetic algorithm for QoS analysis in a large design space.

Consider a system G composed of 9 modules $\omega_l$, $1 \le l \le 9$. Some of the modules in G can be instantiated by multiple components. The system QoS requirements of G, C, include two QoS objectives O = ($o_1$, $o_2$) and two constraints R = ($r_1$, $r_2$), which are defined on four QoS attributes A = ($a_1$, $a_2$, $a_3$, $a_4$). The components are configurable, i.e., each has a set of configurable parameters. The impact of the configurable parameters on the component QoS properties is represented by selected functions. We use the two-objective test functions introduced in (Zitzler et al., 2000, 2001) as the simulated QoS property functions. Table 1 lists the selected test functions. Each function is defined on a set of $K$ configurable parameters, $x_i$, $1 \le i \le K$. In Table 1, the domain column specifies the domain of the $K$ parameters. Note that we choose the same domain range for all the configurable parameters of each test function.

For each component, we choose 4 test functions to simulate its QoS property functions (one for each QoS attribute). The number of configurable parameters $|X_{l,m}|$ and the property functions $f^j_{l,m}$ ($1 \le j \le 4$) of a component $c_{l,m}$ are shown in Table 2. The ranges of the configurable parameters are the same as the domains of the functions assigned to the components. For discrete-value parameters, some examples of data lists are shown in Table 3. For example, the component $c_{3,1}$ is assigned QV and KUR with domain $[-5, 5]$. Its second discrete-value parameter $x_3(c_{3,1})$ has the data list {4.8, 0.2, 1.1, 1.9, 2.8, 4.0}.

Table 4 shows the aggregate functions of the system G based on the nine modules $\omega_l$, $1 \le l \le 9$. Each property $v^i$ ($1 \le i \le 4$) of G can be derived from all or part of the QoS properties $v^i_l$ of the modules $\omega_l$. We focus on three operators here: addition, multiplication, and max. Addition is a general operator in composition, applicable, for example, to execution time and required memory. Multiplication is frequently applicable to quality attributes represented as rates. For example, the compression rates of two consecutive compression components should be multiplied to obtain the total compression rate of the composed system. The max operator is used where a QoS property of the system is the largest of the corresponding module properties, such as the execution time in a parallel computation environment or the temporarily required memory. By using these aggregate functions, the QoS properties of the system can be derived from the QoS properties of the components.

The objective functions $n_j$ of $o_j$, $1 \le j \le 2$, and the constraint functions $f_i$ of $r_i$, $1 \le i \le 2$, for G are presented in Table 5. We set both QoS objectives as minimization problems, and each QoS constraint is an upper bound on a QoS attribute.
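As a rough illustration of how the three aggregate operators combine module properties into system properties, the sketch below evaluates the aggregate functions of Table 4 and the objective and constraint functions of Table 5 as they can be read from the tables. The function names and the module property values (1.0 everywhere) are made up for the example.

```python
import math


def aggregate(v):
    """v[i][l] is the property of module l (1..9) for QoS attribute a_i (1..4).
    Returns the system properties (v1, v2, v3, v4) as read from Table 4."""
    v1 = (2.0 * v[1][1] + v[1][2] + v[1][3] + 2.0 * v[1][4] + v[1][5]
          + v[1][6] + 2.0 * v[1][7] + v[1][8] + v[1][9])     # weighted addition
    v2 = math.prod(v[2][l] for l in (1, 2, 5))               # multiplication
    v3 = max(v[3][l] for l in (3, 4, 6, 7, 8, 9))            # max
    v4 = (v[4][1] + 0.8 * v[4][2] + 2.0 * v[4][3] + 2.0 * v[4][4]
          + v[4][5] + 0.5 * v[4][6] + v[4][7])               # weighted addition
    return v1, v2, v3, v4


def evaluate(v):
    """Objective values (both minimized) and feasibility as read from Table 5."""
    v1, v2, v3, v4 = aggregate(v)
    objectives = (v1 + v4, v2 + v3)
    feasible = v3 <= 14 and v4 <= 9      # r1 and r2 are upper bounds
    return objectives, feasible


# Made-up data: every module contributes 1.0 to every attribute.
v = {i: {l: 1.0 for l in range(1, 10)} for i in range(1, 5)}
objectives, feasible = evaluate(v)
print(objectives, feasible)
```

In the simulation itself, each $v^i_l$ would come from the property function of the component selected for module $\omega_l$, evaluated at that component's parameter settings.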

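Table 1
Test functions for simulating the property functions

| Name | Domain | Test functions |
| --- | --- | --- |
| KUR | $[-5, 5]^K$ | $f_1 = \sum_{i=1}^{K} \left( \lvert x_i \rvert^{0.8} + 5 \sin^3(x_i) \right) + K$; $f_2 = \sum_{i=1}^{K-1} \left( 1 - \exp\left(-0.2 (x_i^2 + x_{i+1}^2)^{1/2}\right) \right)$ |
| QV | $[-5, 5]^K$ | $f_1 = \left( (1/K) \sum_{i=1}^{K} \left( x_i^2 - 10 \cos(2\pi x_i) + 10 \right) \right)^{0.25}$; $f_2 = \left( (1/K) \sum_{i=1}^{K} \left( (x_i - 1.5)^2 - 10 \cos(2\pi (x_i - 1.5)) + 10 \right) \right)^{0.25}$ |
| ZDT1 | $[0, 1]^K$ | $f_1 = x_1$; $g = 1 + 9 \sum_{i=2}^{K} x_i / (K - 1)$; $f_2 = g \cdot \left( 1 - (f_1/g)^{1/2} \right)$ |
| ZDT2 | $[0, 1]^K$ | $f_1 = x_1$; $g = 1 + 9 \sum_{i=2}^{K} x_i / (K - 1)$; $f_2 = g \cdot \left( 1 - (f_1/g)^2 \right)$ |
| ZDT3 | $[0, 1]^K$ | $f_1 = x_1$; $g = 1 + 9 \sum_{i=2}^{K} x_i / (K - 1)$; $f_2 = g \cdot \left( 1 - (f_1/g)^{1/2} - (f_1/g) \cdot \sin(10\pi f_1) \right) + 1$ |
| ZDT6 | $[0, 1]^K$ | $f_1 = 1 - \exp(-4 x_1) \cdot \sin^6(6\pi x_1)$; $g = 1 + 9 \left( \left( \sum_{i=2}^{K} x_i \right) / (K - 1) \right)^{0.25}$; $f_2 = g \cdot \left( 1 - (f_1/g)^2 \right)$ |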
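To make the role of the test functions concrete, the following sketch implements ZDT1 and ZDT2 from Table 1 and uses them as the four simulated property functions of a component that is assigned ZDT1 and ZDT2 (as $c_{1,2}$ is in Table 2). The function names and the parameter values are chosen only for illustration.

```python
import math


def zdt_g(x):
    """g = 1 + 9 * (x_2 + ... + x_K) / (K - 1); requires K >= 2."""
    return 1.0 + 9.0 * sum(x[1:]) / (len(x) - 1)


def zdt1(x):
    """Returns (f1, f2) of ZDT1 for parameters x in [0, 1]^K."""
    f1 = x[0]
    g = zdt_g(x)
    return f1, g * (1.0 - math.sqrt(f1 / g))


def zdt2(x):
    """Returns (f1, f2) of ZDT2 for parameters x in [0, 1]^K."""
    f1 = x[0]
    g = zdt_g(x)
    return f1, g * (1.0 - (f1 / g) ** 2)


def properties_zdt1_zdt2(x):
    """Four simulated QoS properties of a component assigned ZDT1 and ZDT2:
    (ZDT1.f1, ZDT1.f2, ZDT2.f1, ZDT2.f2)."""
    return zdt1(x) + zdt2(x)


# Illustrative setting of a component with four configurable parameters.
print(properties_zdt1_zdt2([0.25, 0.0, 0.0, 0.0]))
```

Both functions share the same $g$, so increasing any of $x_2, \ldots, x_K$ worsens $f_2$ without affecting $f_1$, while $x_1$ trades $f_1$ against $f_2$.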


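Table 2
Configurable parameters and attribute functions of the components

| $c_{l,m}$ | Number of parameters $\lvert X_{l,m} \rvert$ | $f^1_{l,m}$ | $f^2_{l,m}$ | $f^3_{l,m}$ | $f^4_{l,m}$ |
| --- | --- | --- | --- | --- | --- |
| $c_{1,1}$ | 3 | ZDT3·$f_1$ | ZDT3·$f_2$ | ZDT6·$f_1$ | ZDT6·$f_2$ |
| $c_{1,2}$ | 4 | ZDT1·$f_1$ | ZDT1·$f_2$ | ZDT2·$f_1$ | ZDT2·$f_2$ |
| $c_{1,3}$ | 3 | ZDT1·$f_1$ | ZDT1·$f_2$ | ZDT2·$f_1$ | ZDT2·$f_2$ |
| $c_{2,1}$ | 4 | ZDT3·$f_1$ | ZDT3·$f_2$ | ZDT6·$f_1$ | ZDT6·$f_2$ |
| $c_{2,2}$ | 5 | ZDT3·$f_1$ | ZDT3·$f_2$ | ZDT6·$f_1$ | ZDT6·$f_2$ |
| $c_{3,1}$ | 4 | QV·$f_1$ | QV·$f_2$ | KUR·$f_1$ | KUR·$f_2$ |
| $c_{4,1}$ | 3 | QV·$f_1$ | QV·$f_2$ | KUR·$f_1$ | KUR·$f_2$ |
| $c_{4,2}$ | 2 | QV·$f_1$ | QV·$f_2$ | KUR·$f_1$ | KUR·$f_2$ |
| $c_{5,1}$ | 2 | QV·$f_1$ | QV·$f_2$ | KUR·$f_1$ | KUR·$f_2$ |
| $c_{5,2}$ | 5 | ZDT3·$f_1$ | ZDT3·$f_2$ | ZDT6·$f_1$ | ZDT6·$f_2$ |
| $c_{5,3}$ | 3 | ZDT3·$f_1$ | ZDT3·$f_2$ | ZDT6·$f_1$ | ZDT6·$f_2$ |
| $c_{6,1}$ | 2 | ZDT1·$f_1$ | ZDT1·$f_2$ | ZDT2·$f_1$ | ZDT2·$f_2$ |
| $c_{6,2}$ | 2 | QV·$f_1$ | QV·$f_2$ | KUR·$f_1$ | KUR·$f_2$ |
| $c_{7,1}$ | 4 | QV·$f_1$ | QV·$f_2$ | KUR·$f_1$ | KUR·$f_2$ |
| $c_{7,2}$ | 2 | ZDT3·$f_1$ | ZDT3·$f_2$ | ZDT6·$f_1$ | ZDT6·$f_2$ |
| $c_{8,1}$ | 3 | QV·$f_1$ | QV·$f_2$ | KUR·$f_1$ | KUR·$f_2$ |
| $c_{8,2}$ | 2 | QV·$f_1$ | QV·$f_2$ | KUR·$f_1$ | KUR·$f_2$ |
| $c_{9,1}$ | 2 | QV·$f_1$ | QV·$f_2$ | KUR·$f_1$ | KUR·$f_2$ |
| $c_{9,2}$ | 5 | ZDT3·$f_1$ | ZDT3·$f_2$ | ZDT6·$f_1$ | ZDT6·$f_2$ |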

Table 3
Example of data lists of components

| Notation | Data list |
| --- | --- |
| $x_4(c_{1,2})$ | {0.0, 0.2, 0.4, 0.6, 0.8, 1.0} |
| $x_4(c_{2,1})$ | {0.1, 0.3, 0.7} |
| $x_3(c_{3,1})$ | {4.8, 0.2, 1.1, 1.9, 2.8, 4.0} |
| $x_4(c_{3,1})$ | {5, 2, 5} |
| $x_3(c_{4,1})$ | {0, 1, 2, 3, 4, 5} |

Table 4
Aggregate functions of the system G

| QoS attribute | Aggregate function |
| --- | --- |
| $a_1$ | $v^1 = 2.0 \times v^1_1 + v^1_2 + v^1_3 + 2.0 \times v^1_4 + v^1_5 + v^1_6 + 2.0 \times v^1_7 + v^1_8 + v^1_9$ |
| $a_2$ | $v^2 = v^2_1 \times v^2_2 \times v^2_5$ |
| $a_3$ | $v^3 = \max(v^3_3, v^3_4, v^3_6, v^3_7, v^3_8, v^3_9)$ |
| $a_4$ | $v^4 = v^4_1 + 0.8 \times v^4_2 + 2.0 \times v^4_3 + 2.0 \times v^4_4 + v^4_5 + 0.5 \times v^4_6 + v^4_7$ |

Table 5
Objective/constraint functions of the system

| QoS requirement | Objective/constraint function | Operator |
| --- | --- | --- |
| $o_1$ | $n_1 = v^1 + v^4$ | Min |
| $o_2$ | $n_2 = v^2 + v^3$ | Min |
| $r_1$ | $f_1 = v^3$ | $\le 14$ |
| $r_2$ | $f_2 = v^4$ | $\le 9$ |

After finalizing the system specification G, we use the evolutionary algorithm to analyze the composition of G. Here the population size is set to 100 and the number of generations to 200. To reduce the effect of the initial population, we run the algorithm 30 times and merge the result sets, removing the dominated solutions. The result set can be further reduced by removing the individuals with the smallest crowding-distance. That is, in each round the individual with the smallest distance to its neighbors is found and removed from the set. This is repeated until the result set is reduced to a fixed number of solutions, which is set to 20 here.

The result of the composition analysis using the genetic algorithm is shown in Fig. 7. Fig. 7a and b show the Pareto-optimal solutions in the objective and constraint spaces, respectively. Each small red rectangle in the figures represents a solution. There are 20 solutions presented. Fig. 7a demonstrates the tradeoff between objectives $o_1$ and $o_2$.
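The crowding-distance reduction of the merged result set described above (boundary solutions get an infinite distance, and the most crowded solution is removed in each round) can be sketched as follows. This is a simplified illustration in the spirit of NSGA-II, not the code used in the experiments; the function names and the sample front are made up.

```python
import math


def crowding_distances(front):
    """front: list of objective tuples. Returns one crowding distance per solution."""
    n = len(front)
    dist = [0.0] * n
    for k in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][k])
        lo, hi = front[order[0]][k], front[order[-1]][k]
        # Solutions with the lowest or highest value of an objective are kept.
        dist[order[0]] = dist[order[-1]] = math.inf
        if hi == lo:
            continue
        for pos in range(1, n - 1):
            gap = front[order[pos + 1]][k] - front[order[pos - 1]][k]
            dist[order[pos]] += abs(gap) / (hi - lo)   # normalized difference
    return dist


def prune(front, size):
    """Remove the solution with the smallest crowding distance, one per round,
    until only `size` solutions remain."""
    front = list(front)
    while len(front) > size:
        d = crowding_distances(front)
        del front[d.index(min(d))]
    return front


# Made-up two-objective front; (1.0, 8.0) sits closest to its neighbors.
front = [(0.0, 10.0), (1.0, 8.0), (1.2, 7.8), (5.0, 3.0), (10.0, 0.0)]
print(prune(front, 4))
```

Recomputing the distances after every removal, rather than once, keeps the remaining solutions evenly spread when several crowded solutions are neighbors of each other.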

Fig. 7b shows the same 20 solutions, but in the constraint space. As the diagram shows, when the decision space is limited to a small scope, the user can form a clear impression of the whole decision space, and it is much easier to make a final decision than when exploring the whole space. For example, the result shows that increasing $o_1$ from 11 to 15 does not yield a big improvement in $o_2$.

An evolutionary algorithm is a partially randomized exploratory procedure. To see how well the algorithm converges, we compare it with a fully randomized algorithm. We generate 30 × 200 × 100 solutions randomly (the total number of solutions explored by the evolutionary algorithm) and select the best solutions. Better solutions are defined as dominating solutions, i.e., solutions that are better in terms of the two optimization objectives or closer to the bounds given in the two constraints. The best solutions from the fully randomized algorithm are presented in Fig. 8. Note that in Fig. 8 we keep all the Pareto-optimal solutions found by the randomized algorithm. The result shows that, within the same run time, the randomized algorithm cannot produce feasible solutions for the given QoS constraints, although some of its solutions satisfy the constraint $r_1 \le 14$. Also, these solutions are much worse than those generated by the evolutionary algorithm in terms of the objectives.

6. Conclusion and future research

We have developed a QoS-oriented composition analysis model for component-based embedded software development. The QoS analysis can help system designers make better design decisions, in terms of satisfying system QoS constraints and optimizing QoS objectives, through better component selections and parameter settings for configurable components. An evolutionary algorithm is used to search the solution space and obtain better solutions efficiently. We use a case study to illustrate the process of QoS-oriented composition analysis and demonstrate the effectiveness of using an evolutionary algorithm to search for Pareto-optimal design solutions in a large design space.

Several future research directions following this work have been identified. Currently, our QoS analysis focuses on the SDF computation model. Events are frequently a needed concept in embedded applications. For example, even in the SDF model, an event-based concept can be used to model the exceptions encountered in data processing that need to be propagated to and handled by multiple units in the system. We plan to generalize the SDF model to the BDF model and introduce events into the SDF/BDF model for


Fig. 7. Non-dominated feasible solutions. (a) Tradeoff of the solutions in terms of two objectives and (b) solutions in the constraint space.

Fig. 8. Randomized algorithm result. (a) Solutions in the objective space and (b) solutions in the constraint space.

QoS-oriented composition analysis. We plan to develop the corresponding techniques and tools for the extended model. Also, though the SDF model can be very effective for the specification of digital signal processing applications and other pipe-and-filter based systems, it may not be sufficient for specifying some other embedded applications, such as process control systems and telecom systems. Furthermore, most component compositions in embedded software are still conventional procedure invocations. We will extend our composition analysis techniques to consider various composition models and a diverse range of application domains.

Along a different direction, we are also looking into other aspects of the overall embedded software development platform, including advanced techniques for component retrieval and repository management (Yen et al., 2001), component parameterization techniques for transforming existing components into reconfigurable ones to satisfy various QoS tradeoffs (Cooper et al., 2003), and automated QoS property measurement techniques.

Acknowledgements

This research was supported in part by the National Science Foundation under Grant No. EIA-0103709. We would also like to thank the reviewers for their thorough comments, which have greatly enhanced the quality of the paper.

References

Borriello, G., Chou, P., Ortega, R., 1995. Embedded system co-design towards portability and rapid integration. In: Sami, M., Micheli, G.D. (Eds.), Hardware/Software Co-Design: Proc. of the 1995 NATO Advanced Study Institute. Kluwer Academic Publishers, pp. 243–264.
Buck, J.T., 1993. Scheduling Dynamic Dataflow Graphs with Bounded Memory Using the Token Flow Model. Ph.D. thesis, University of California, Berkeley.
Buck, J.T., Ha, S., Lee, E.A., Messerschmitt, D.G., 1991. Ptolemy: A Mixed-Paradigm Simulation/Prototyping Platform in C++. In: Proc. C++ At Work Conference.
Chankong, V., Haimes, Y.Y., 1983. Multiobjective Decision Making Theory and Methodology. North-Holland, New York.
Conklin, J., Begeman, M.L., 1988. gIBIS: a hypertext tool for explanatory policy discussions. ACM Trans. Office Inf. Syst. 6 (4), 303–331.
Cooper, K., Zhou, J., Ma, H., Yen, I.-L., Bastani, F.B., 2003. Code parameterization for satisfaction of QoS requirements in embedded software. In: Proc. of the Int. Conf. on Eng. of Reconfigurable Systems and Algorithms, pp. 58–64.
Corne, D.W., Knowles, J.D., Oates, M.J., 2000. The Pareto envelope-based selection algorithm for multiobjective optimization. In: Schoenauer, M. et al. (Eds.), Parallel Problem Solving from Nature - PPSN VI. Springer, Berlin, pp. 839–848.
Deb, K., Pratap, A., Agarwal, S., Meyarivan, T., 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6 (2), 182–197.


Gao, T., Ma, H., Yen, I.-L., Khan, L., Bastani, F., 2006. A repository for component-based embedded software development. International Journal of Software Engineering and Knowledge Engineering, in press.
Hamlet, D., Mason, D., Woit, D., 2001. Theory of software reliability based on components. In: Proc. of the 23rd Int. Conf. on Software Eng., pp. 361–370.
Hissam, S.A., Moreno, G.A., Stafford, J., Wallnau, K.C., 2001. Packaging Predictable Assembly with Prediction-Enabled Component Technology. CMU/SEI-2001-TR-024.
Khare, V., Yao, X., Deb, K., 2003. Performance scaling of multi-objective evolutionary algorithms. In: Proc. of the Second Int. Conf. on Evolutionary Multi-Criterion Optimization, pp. 376–390.
Lee, E.A., Messerschmitt, D.G., 1987a. Synchronous data flow. Proc. IEEE 75 (9), 1235–1245.
Lee, E.A., Messerschmitt, D.G., 1987b. Static scheduling of synchronous data flow programs for digital signal processing. IEEE Trans. Comput. C-36, 24–35.
Lee, J., 1991. Extending the Potts and Bruns model for recording design rationale. In: Proc. 13th Int. Conf. on Software Eng., pp. 114–125.
Muskens, J., Chaudron, M., 2004. Prediction of run-time resource consumption in multi-task component-based software systems. In: Proc. of the 7th Int. Symp. on Component-Based Software Eng.
Ommering, R.V., Linden, F.V., Kramer, J., Magee, J., 2000. The Koala component model for consumer electronics software. IEEE Comput. 33 (3), 78–85.
Steuer, E., 1986. Multiple Criteria Optimization: Theory, Computation, and Application. Wiley.
Steigerwald, R.A., 1993. Reusable component retrieval for real-time applications. In: Proc. IEEE Workshop on Real-Time Applications, pp. 118–120.
Tran, Q., Chung, L., 1999. NFR-Assistant: tool support for achieving quality. In: IEEE Symp. Application-Specific Systems and Software Engineering and Technology, pp. 284–289.
Wall, A., Larsson, M., Norstrom, C., 2002. Towards an impact analysis for component based real-time product line architectures. In: Proc. 28th Euromicro Conf., pp. 81–89.
Wallnau, K.C., Hissam, S.A., Seacord, R.C., 2001. Building Systems from Commercial Components. Addison Wesley.
Yen, I.-L., Chen, I.-R., 1997. Reliability assessment of multiple-agent cooperating systems. IEEE Trans. Reliab.
Yen, I.-L., Khan, L., Prabhakaran, B., Bastani, F.B., Linn, J., 2001. An on-line repository for embedded software. In: 13th IEEE Int. Conf. on Tools with Artificial Intelligence, pp. 314–320.
Yen, I.-L., Goluguri, J., Bastani, F., Khan, L., Linn, J., 2002. A component-based approach for embedded software development. In: IEEE Int. Symp. on Object-oriented Real-Time Distributed Computing (ISORC), pp. 402–410.
Zitzler, E., Deb, K., Thiele, L., 2000. Comparison of multiobjective evolutionary algorithms: empirical results. Evol. Comput. 8 (2), 173–195.
Zitzler, E., Laumanns, M., Thiele, L., 2001. SPEA2: improving the strength Pareto evolutionary algorithm for multiobjective optimization. In: Proc. of the EUROGEN2001 Conference, pp. 95–100.