Automated software design using ant colony optimization with semantic network support


Accepted Manuscript

Automated Software Design using Ant Colony Optimization with Semantic Network Support

Vali Tawosi, Saeed Jalili, S.M. Hossein Hasheminejad

PII: S0164-1212(15)00144-2
DOI: 10.1016/j.jss.2015.06.067
Reference: JSS 9537

To appear in: The Journal of Systems & Software

Received date: 12 August 2014
Revised date: 25 June 2015
Accepted date: 28 June 2015

Please cite this article as: Vali Tawosi , Saeed Jalili , S.M. Hossein Hasheminejad , Automated Software Design using Ant Colony Optimization with Semantic Network Support, The Journal of Systems & Software (2015), doi: 10.1016/j.jss.2015.06.067

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Highlights

- Ant colony optimization is applied to automate the detail design of software.
- Problem domain knowledge is involved in the optimization phase to get better results.
- A new metric is introduced that is useful for automated object-oriented software design.
- Optimization with semantic network support led to a more autonomous design method.
- The proposed method is evaluated using 5 case studies and the results are reported.

Automated Software Design using Ant Colony Optimization with Semantic Network Support

Vali Tawosi (1), [email protected]
Saeed Jalili (2), [email protected]
S.M. Hossein Hasheminejad (3), smh.hashemi@modares.ac.ir

(1,2,3) Faculty of Electrical & Computer Engineering, Tarbiat Modares University, Tehran, Iran

Abstract

Software design is an important task that needs to be performed well. This paper proposes a method for automated software design based on the Search-Based Software Engineering (SBSE) approach, which solves software engineering problems using search algorithms. Ant Colony Optimization is used as the meta-heuristic search algorithm in both single-objective and multi-objective modes. The input data are the artifacts of the analysis phase, and the output is an early life cycle class diagram. To supply the background knowledge of a human designer, a semantic network is built from the textual documents of the analysis phase plus other resources such as WordNet. This semantic network is used to name the output classes and to determine structural relations between classes. The proposed method is evaluated on several case studies and the results are reported. The evaluation results show that using background knowledge alongside the optimization algorithm helps to achieve better results.

Key words: Search-Based Software Engineering, Automated Software Design, Object-Oriented Analysis and Design, Ant Colony Optimization.

1. Introduction

Software design is a crucial step in the software development process. Achieving a good design is difficult, especially when the system is large and complex. The ability to support the functional and nonfunctional requirements of the software system is a major characteristic of a well-designed software system [1]. The design of a software system can be broken into two levels: architecture design and detail design [2]. Architecture design is mainly responsible for supporting the nonfunctional requirements, while the detail design of components is more involved with the functional requirements [3]. A good design that suitably meets the software requirements has a significant effect on the success of the software project, the quality of the anticipated software product, and its development and maintenance cost. Designing a software system is therefore a hard skill to teach and master, and it demands experience and technical knowledge.

In the object-oriented paradigm, detail design consists of identifying the software objects and assigning the software responsibilities to them [4]. Objects are abstractions of real-world entities that interact with each other to perform the software functionalities. A suitable level of abstraction is needed to make it possible to reuse an object in different but related contexts. Reusing objects is a key factor in the object-oriented paradigm. Hence, designing good objects that have just enough detail (neither more nor less) and have been assigned the necessary functions is desirable, because such objects can be reused easily [5]. The capability of objects to support changes and modifications is another factor that plays a vital role in object-oriented design [4]. As software systems get larger and more complex, software design becomes more difficult and error prone. Therefore, the automation (or at least semi-automation) of software design has been tackled in many studies [6]. The objective of automatic software design is to increase creativity and novelty in design, enrich the problem-solving activity for the designer, and reduce human effort and errors, and hence reduce the cost of software development. Although software design automation is ultimately meant to replace the human designer, at present it can only act as an assistant that helps designers make decisions [6]. In other words, most approaches used for automated software design still depend on human interaction and decision making. There are mainly two approaches to automating software design. The first is Search-Based Software Engineering (SBSE), which treats software engineering problems as search problems and harnesses a search algorithm to find a solution [7]. The second uses Natural Language Processing (NLP) methods, and sometimes Semantic Networks (SNs), to analyze textual documents and extract a primitive design. A newer approach based on clustering methods has also been introduced recently [8].

Software design can be treated as an optimization problem for three reasons: there is a large search space of candidate solutions for a design problem; the best solution is unknown, but it is possible to define a suitable fitness function to discriminate it; and candidate solutions can be produced automatically with little effort. Studies using SBSE usually define their fitness function based on design quality metrics such as cohesion, coupling and complexity [6], so the resulting design scores very well on these metrics. However, this approach pays little attention to class understandability or reusability, because quantifying such properties is difficult. This shortcoming leads to nameless classes in the design, and class attributes often do not semantically belong to their class. Another shortcoming of this approach is the difficulty of recognizing structural relations such as composition, aggregation and inheritance between classes.

Using NLP methods to analyze textual documents derived from the software analysis phase and to extract candidate classes and their responsibilities has a history as long as that of SBSE [9]. This approach uses heuristics to recognize candidate classes and to assign responsibilities to them. The semantic coherence of a class produced with this approach is often higher than that of search-based results. This approach, however, does not take into account other object-oriented metrics such as cohesion, coupling and complexity. This shortcoming makes the resulting classes inefficient and vulnerable to future changes and modifications.

The method proposed in this paper combines the two aforementioned approaches to take advantage of both and to overcome their shortcomings. The proposed method consists of three main stages. The first stage uses NLP methods to extract responsibilities and build a Problem Domain Semantic Network (PDSN). The next stage employs a meta-heuristic search algorithm (specifically, Ant Colony Optimization) to find the best design with respect to the objective features. Optimization of the software design is considered a multi-objective process, so this research applied the optimization algorithm in both single-objective and multi-objective modes to compare the results. The last stage maps the classes found to PDSN concepts in order to name the classes and recognize structural relations between them. The proposed method uses a new metric to quantify class meaningfulness, which helps the search algorithm produce classes that are closer to real-world ones and thus more understandable to humans, more reusable and more maintainable. Another advantage of the proposed method is that it reduces interaction with human experts to a very low level: no design expert is needed, and the only remaining human interaction is the optional editing of the PDSN by a domain expert, who does not need any design skills. In the near future, this achievement may help ordinary people make their own software without depending on a design expert.

The major challenge of the proposed method is the accuracy and completeness of the PDSN, because it is built on the output of automated NLP tools that extract data and information from free-format text written in natural language. The quality and accuracy of the PDSN depend entirely on the quality of the textual documents fed into the NLP tools and on the accuracy of those tools. Therefore, the overall quality of the output design depends on the quality of the input materials. As the old saying goes, "garbage in, garbage out": one cannot expect the proposed method to produce a quality design without making the effort to prepare quality analysis documents, or to correct and complete the PDSN.

Experimental results revealed that the proposed method handles small and medium-scale design problems easily and is applicable to large-scale problems. The results also showed that the proposed method produces design classes that are close to those of a human designer. This shows that the proposed method can compete with other methods even though they rely on a higher level of human interaction and more manual work. The rest of this paper is organized as follows: section 2 describes some preliminary definitions and concepts. Section 3 briefly reviews related works and section 4 describes the proposed method in detail. In section 5, the experimental setup is explained and in section 6, evaluations and experimental results are presented. Section 7 discusses the results and surveys the advantages and limitations of the proposed method. Finally, section 8 concludes and suggests future works.

2. Background

In this section, after a brief description of the automated design problem, some basic definitions are provided.

2.1 Automated Software Design

Many complex problems in software engineering require knowledge and resources to be solved by a human software engineer. In many cases, such as requirements engineering, software design, testing and software project management, there are conflicting objectives that make these problems even harder to solve [7]. As in other engineering disciplines, software engineers try to find optimal solutions that make a trade-off between conflicting objectives [6]. As a system gets larger and more complex, finding even a near-optimal solution becomes very time-consuming and exhausting for a human engineer. In software design especially, a human designer is faced with a large number of options to choose from. Sometimes the solution space is so large that it seems impossible for a human to find the best solution. The aspiration of "hiring software to make software" has tempted software engineers to create automated methods for software development. Software design, one of the most difficult phases of the software development process, needs human expertise with experience, knowledge and design skills. In the first place, an automated design method has to consider all the options that a human designer may face while designing software, in order to find an optimal (or near-optimal) combination of decisions that makes a trade-off between conflicting objectives. Moreover, an advanced automated design method will bring creativity and novelty to design.

At its apogee, automated software design is meant to relieve humans by designing software autonomously, without any interaction from a human designer. Recent studies have made great advances toward this goal, but they still need some degree of interaction with a human design expert, that is, a software engineer with design skills and experience. To solve the automated software design problem, we first need to settle on a representation of software design. A common representation of object-oriented design is the UML class diagram [4]. Analysis of the problem domain reveals the required responsibilities of the software system-to-be, where a responsibility is an obligation to perform a task or to know information. In the object-oriented paradigm, responsibilities are typically modeled as operations, together with the attributes that they manipulate. A class diagram consists of classes of software objects, which are abstracted entities with responsibilities [10]. Given n responsibilities, any assignment of the responsibilities to classes is a possible class diagram. As the number of classes in the optimal class diagram is unknown, this problem is equivalent to the set partitioning problem: a division of a set of n elements into non-overlapping, non-empty subsets whose union is the whole set corresponds to a class diagram, with the subsets playing the role of classes. The total number of partitions of an n-element set is the Bell number $B_n$ [11]. Bell numbers satisfy the recursion given in relation (1) [12]. Clearly, we are faced with a very large search space that has to be explored by a search algorithm, yet only a small number of the class diagrams in this search space are acceptable.

$$B_{n+1} = \sum_{k=0}^{n} \binom{n}{k} B_k, \qquad B_0 = B_1 = 1 \qquad (1)$$
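To give a feel for how quickly this search space grows, the recursion in relation (1) can be evaluated directly. The following is a minimal sketch, not part of the proposed method; the function name `bell_numbers` is chosen here for illustration, and the base case $B_0 = 1$ (one partition of the empty set) is the standard convention.

```python
from math import comb


def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] using B_{n+1} = sum_k C(n, k) * B_k."""
    bell = [1]  # B_0 = 1: the empty set has exactly one partition
    for n in range(n_max):
        bell.append(sum(comb(n, k) * bell[k] for k in range(n + 1)))
    return bell


if __name__ == "__main__":
    numbers = bell_numbers(20)
    for n in (5, 10, 20):
        # Number of ways to group n responsibilities into classes
        print(n, numbers[n])
```

Already for 20 responsibilities the number of candidate partitions exceeds $5 \times 10^{13}$, which is why exhaustive enumeration is not an option.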

Discriminating the best design in the search space requires a suitable fitness function. A fitness function can be defined based on the characteristics of a good design. Object-oriented design metrics quantify the criteria of a good design (explained in section 2.3). However, the conflicting nature of these criteria forces us to seek values for them that make a trade-off [6]. Since meta-heuristic search algorithms have been applied successfully to optimization problems in other engineering disciplines [13], it is possible to apply them to the search for an optimal class diagram.

2.2 Ant Colony Optimization

Ant System (AS) is a meta-heuristic search algorithm, inspired by the foraging behavior of real ants, that was introduced to solve the traveling salesman problem (TSP) [14]. AS became popular very quickly, and other variants were introduced soon after. Max-Min Ant System (MMAS), one of the most successful variants of AS, was introduced in 1999 [15].

To solve a problem, AS uses a colony of artificial ants. The ants move on a medium (typically a graph) in various directions to build a solution (typically a path). Artificial ants, just like real ants, deposit pheromone on their path to inform other members of the colony. This kind of information flow is called stigmergy [16]. The intensity of the pheromone deposited by an ant is proportional to the quality of the solution that the ant has built. Pheromone trails are used by other ants, and by the same ant at later times, as a guide to better solutions. Ants use heuristic information (provided by an external source) and pheromone trails to find their way on the medium toward the best solution. Often the solution is one or more special paths on the graph with the objective features, and after some iterations the ants tend to converge toward those paths.

In AS, pheromone can be deposited either on the edges or on the vertices of the graph [14]. At each move, an ant combines the pheromone and heuristic information of each feasible neighboring edge (or vertex) to compute the attractiveness of the next place to move to. A probabilistic method is then used to select the next move. When all ants have completed their tour (built a solution), an evaporation process takes place in which the pheromone on all edges (or vertices) is reduced by a fixed proportion. Evaporation helps the ants forget less attractive paths that were explored in the first iterations [14]. After pheromone evaporation, each ant deposits pheromone on the path it has built.
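The move selection, evaporation and deposit steps described above can be sketched as follows. This is an illustrative sketch only: it assumes the common random-proportional rule in which attractiveness is $\tau^{\alpha} \eta^{\beta}$, and the function names, the dictionary-based trail storage and the parameter defaults (`alpha`, `beta`, `rho`) are assumptions made here, not values taken from the paper.

```python
import random


def choose_next(candidates, pheromone, heuristic, alpha=1.0, beta=2.0, rng=random):
    """Pick one candidate with probability proportional to tau^alpha * eta^beta."""
    weights = [(pheromone[c] ** alpha) * (heuristic[c] ** beta) for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


def evaporate(pheromone, rho=0.1):
    """Reduce every trail by the fraction rho (0 < rho < 1)."""
    for key in pheromone:
        pheromone[key] *= 1.0 - rho


def deposit(pheromone, path, quality):
    """Reinforce every component on an ant's path in proportion to solution quality."""
    for key in path:
        pheromone[key] += quality


if __name__ == "__main__":
    trails = {"a": 1.0, "b": 1.0, "c": 1.0}
    eta = {"a": 0.5, "b": 1.0, "c": 2.0}
    move = choose_next(list(trails), trails, eta)
    evaporate(trails)
    deposit(trails, [move], quality=0.3)
    print(move, trails)
```

Because selection is probabilistic, components with weak trails can still be chosen, which keeps some exploration alive while the trails concentrate on good paths.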

MMAS is a greedy variant of AS and is able to find good solutions without using any heuristic information [15]. MMAS differs from basic AS in four respects [14]. First, it uses either the best path of the current iteration or the best-so-far path found in previous iterations to update the pheromone trails. This change can lead to stagnation, a situation in which all ants are biased toward the best-so-far path because of the excessive growth of pheromone on a good but possibly suboptimal path. The second change is made to prevent stagnation: MMAS limits the possible range of pheromone trail values to the interval $[\tau_{min}, \tau_{max}]$. The third change is to initialize the pheromone trails to the upper trail limit. The final change is the re-initialization of the pheromone trails when the system approaches stagnation or when no improvement has been achieved for a certain number of iterations. Using a local search with MMAS is highly recommended [17]. Figure 1 shows the flowchart of the MMAS algorithm, adopted from Dorigo [14] and used in this study. The pheromone trail bounds are computed dynamically based on the fitness of the best-so-far path and are updated every time a new best-so-far path is found.
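The four MMAS changes can be sketched as a single update step. This is a hedged sketch under stated assumptions: fitness is taken to be maximized, the deposit equals the best ant's fitness, and the limits $\tau_{max} = f_{best} / \rho$ and $\tau_{min} = \tau_{max} / (2n)$ are one commonly used choice, not necessarily the formulas used in this study. All function names are hypothetical.

```python
def trail_limits(best_fitness, rho, n_components):
    """Assumed limits: tau_max = f_best / rho, tau_min = tau_max / (2 * n)."""
    tau_max = best_fitness / rho
    tau_min = tau_max / (2.0 * n_components)
    return tau_min, tau_max


def mmas_update(pheromone, best_path, best_fitness, rho, tau_min, tau_max):
    """Evaporate all trails, reinforce only the chosen best path, then clamp."""
    on_best = set(best_path)
    for key in pheromone:
        value = (1.0 - rho) * pheromone[key]
        if key in on_best:
            value += best_fitness
        # Clamping to [tau_min, tau_max] is what limits stagnation in MMAS
        pheromone[key] = min(tau_max, max(tau_min, value))


def reset_trails(pheromone, tau_max):
    """Re-initialize every trail to the upper limit (also used at start-up)."""
    for key in pheromone:
        pheromone[key] = tau_max


if __name__ == "__main__":
    tau_min, tau_max = trail_limits(best_fitness=0.5, rho=0.1, n_components=5)
    trails = {"x": 0.0, "y": 0.0}
    reset_trails(trails, tau_max)
    mmas_update(trails, ["x"], 0.5, 0.1, tau_min, tau_max)
    print(tau_min, tau_max, trails)
```

Whether `best_path` is the iteration-best or the best-so-far path is a policy decision made by the caller, as in the flowchart of Figure 1.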

[Figure 1: flowchart. Start; make N ants and initialize the pheromone matrix; while Iteration < maxIteration: each ant makes a solution using the random-proportional selection rule; evaluate the solutions; select the iteration-best ant; if it is the best so far, store it as the best-so-far solution and update the pheromone limits based on its fitness; select the ant for updating the pheromone trails (either best so far or iteration best); update the pheromone trails; reset the pheromone matrix if there is no improvement in the last RC iterations and (maxIteration - Iteration) > RC. When the loop ends: get the best solution on the pheromone matrix; End.]

Figure 1. Max-Min Ant System (MMAS) [14].

Using ACO to solve multi-objective optimization problems has been studied by many researchers. Different changes have been applied to the basic ACO algorithm to make it able to handle multi-objective problems. As López and Stützle [18] describe it, multi-objective ACO (MOACO) is composed of an underlying ACO algorithm, with its solution construction and pheromone update policies, plus specific algorithm components that make it amenable to tackling multi-objective optimization problems.

One of the successful components that extended ACO to MOACO is the multi-criterion ant, which uses multiple pheromone matrices [19], usually one per objective. In this extension, ants combine the information from the multiple pheromone matrices using a weighted aggregation function. Different variants of multi-criterion ant algorithms use different methods for updating the pheromone matrices. One uses the set of non-dominated solutions found in the iteration to update all of the pheromone matrices with a pheromone value relative to the fitness of the set [19]. Another uses the best ant for each objective to update the pheromone matrix of the corresponding objective [20]. The former approach only works for heterogeneous pheromone matrices, that is, when different solution components are mapped to the matrices [18]. The latter is suitable when all the pheromone matrices are mapped to the whole solution component but each stores pheromone traces for one objective.
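The idea of one pheromone matrix per objective can be illustrated with a small sketch. It assumes a weighted sum as the aggregation function and the per-objective best-ant update policy; both are choices made here for illustration (a weighted product is another common aggregation), and the names `combined_pheromone` and `update_per_objective` are hypothetical.

```python
def combined_pheromone(matrices, weights, component):
    """Weighted sum of the trail values that each objective stores for one component."""
    return sum(w * m[component] for w, m in zip(weights, matrices))


def update_per_objective(matrices, best_paths, qualities, rho=0.1):
    """Update each objective's matrix using the best ant for that objective."""
    for matrix, path, quality in zip(matrices, best_paths, qualities):
        for key in matrix:
            matrix[key] *= 1.0 - rho  # evaporation on every component
        for key in path:
            matrix[key] += quality  # reinforcement on the best ant's path


if __name__ == "__main__":
    cohesion_trails = {"e1": 2.0, "e2": 1.0}
    coupling_trails = {"e1": 4.0, "e2": 1.0}
    matrices = [cohesion_trails, coupling_trails]
    print(combined_pheromone(matrices, [0.25, 0.75], "e1"))
    update_per_objective(matrices, [["e1"], ["e2"]], [0.5, 0.5], rho=0.5)
    print(matrices)
```

An ant would then use the combined value in place of the single trail value when computing the attractiveness of a move.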

2.3 Object-oriented Design Metrics

Evaluating software artifacts is a permanent concern of software engineers and software project managers [21]. They want to evaluate software during its development life cycle, before it is actually produced. Evaluating early life cycle software artifacts can give a hint of the success or failure of the project before it happens. This need led software engineering to the introduction of software metrics [22]. Software design metrics are metrics for evaluating design artifacts that quantify the design criteria. Since the main artifact of the design phase in the object-oriented paradigm is the class diagram, design metrics reduce to class diagram evaluation metrics. These metrics can be categorized into four major groups: cohesion, coupling, complexity and object-oriented metrics [23]. Chidamber and Kemerer [24] define cohesion and coupling as follows: "Coupling refers to the degree of interdependence between parts of a design, while cohesion refers to the internal consistency within parts of a design". Therefore, minimization of coupling and maximization of cohesion in the design should be pursued. Lieberherr et al. [25] introduced realizations of cohesion and coupling metrics in object-oriented design.

Method calls between classes (within a reasonable limit) are necessary for a class diagram. The objective of building a system is to assemble a set of independent components that perform the system functionalities in collaboration with each other [3]. A collection of completely independent components that never collaborate is a discrete set of components, not a system. So coupling at a minimum level that still satisfies the behavioral requirements of the system is necessary. In other words, we are not chasing a coupling value of zero, but we want coupling to be as low as possible for the system. To compute the overall coupling value for a class diagram, we sum all data element and method usages between every two classes. A linear combination of method-attribute coupling and method-method coupling gives the coupling value.

The cohesion value of a design is the average inner cohesion of all classes in the design. The inner cohesion of a class is computed as a linear combination of the data cohesion and the behavioral cohesion of the class. Every method-method usage and method-attribute usage that starts and ends in the same class counts toward the cohesion value of that class. Cohesion and coupling are complementary metrics (high coupling results in low cohesion, and high cohesion results in low coupling), which suggests looking for a balance point. Unfortunately, this balance point is not the same for all software systems.

Software complexity is defined as "difficulty in understanding, manipulating and maintenance of software" [26]. Complexity is in fact a mental issue that affects a developer's ability to understand the software. Hence, to overcome complexity, we divide a system into smaller parts in such a way that understanding, manipulating and maintaining it become easier [4]. A software component is composed of classes, and a class is composed of responsibilities. The number of elements at each level should be kept in a range that a typical human brain is able to handle at one moment. Many studies in psychology have tried to determine this magical number. Miller's law says that the number of objects an average human can hold in working memory is 7 ± 2 [27]. The complexity measures used for a class diagram are the number of classes in the diagram, the number of responsibilities in a class, and the standard deviation of the number of responsibilities in the classes. The complexity metric tries to keep the number of elements at each level around the magical number (7 ± 2), with a degree of freedom that can be altered by the user. El-Emam et al. [28] studied the optimal class size in object-oriented software design with respect to the fault-proneness of a class. Considering Java as the target programming language, El-Emam et al. arrived at a threshold of 6~19 methods and 1~14 attributes per class, while the median numbers of attributes and methods in a common class were found to be 7 and 9, respectively. This suggests that if the average numbers of attributes and methods of classes are kept around 7 and 9, respectively, the fault density of the classes will be low. This study inspired us to keep the number of responsibilities of a class in a variable range, with a tendency toward 7 attributes and 9 methods per class.

The object-oriented paradigm introduced new concepts into software engineering, among them abstraction, inheritance, encapsulation, polymorphism and message passing. Inheritance, as the main concept of the object-oriented paradigm, has received great attention in the definition of metrics. Lanza et al. [29] introduced some inheritance metrics for object-oriented design that inspired our object-oriented metrics. The average number of derived classes and the average height of hierarchies are the two measures used for computing the object-oriented metrics.

Table 1 presents the aforementioned metrics in detail, together with the formula for computing each one where a formula can be given. All formulas are inspired by the metric definitions in the references provided in the table. Every measure defined for a class is averaged over all classes in the design before it is used in the aggregation function, so as to obtain an overall mean value of the measure for the design. The following notation is used to describe the metrics in Table 1, where S stands for the software system at hand:

- $A = \{a_1, a_2, \ldots, a_k\}$ denotes the set of attributes in S.
- $M = \{m_1, m_2, \ldots, m_l\}$ denotes the set of methods in S.
- $R = \{r_1, r_2, \ldots, r_t\}$ denotes the set of responsibilities in S. In fact $R = A \cup M$ and thus $t = k + l$.
- $C = \{C_1, C_2, \ldots, C_c\}$ denotes the set of classes in S.
- $M(C_n)$ denotes the set of methods belonging to class $C_n$.
- $A(C_n)$ denotes the set of attributes belonging to class $C_n$.
- $R(C_n)$ denotes the set of attributes and methods (responsibilities) belonging to class $C_n$.
- $Arg(m)$ denotes the set of attributes that method $m$ uses as its input arguments.
- $\mathcal{U}_{m-a}(m_i, a_j) = 1$ if $m_i$ uses $a_j$ as its input argument, and $0$ otherwise.
- $\mathcal{U}_{m-m}(m_i, m_j) = 1$ if $m_i$ calls $m_j$, and $0$ otherwise.
- Data usage between classes $C_i$ and $C_j$ is denoted and computed as: $\mathcal{UC}_{m-a}(C_i, C_j) = \sum_{\forall m_p \in M(C_i)} \sum_{\forall a_q \in A(C_j)} \mathcal{U}_{m-a}(m_p, a_q)$
- Method usage (method-method call) between classes $C_i$ and $C_j$ is denoted and computed as: $\mathcal{UC}_{m-m}(C_i, C_j) = \sum_{\forall m_p \in M(C_i)} \sum_{\forall m_q \in M(C_j)} \mathcal{U}_{m-m}(m_p, m_q)$

Table 1. Object-oriented design metrics (used in this paper).

Metric | Measure | Formula
Coupling [24], [25] (CP) | Method-Attribute Coupling of Design | $MACP(D) = \sum_{\forall (C_i, C_j) \in D,\ i \neq j} \mathcal{UC}_{m-a}(C_i, C_j)$
Coupling [24], [25] (CP) | Method-Method Coupling of Design | $MMCP(D) = \sum_{\forall (C_i, C_j) \in D,\ i \neq j} \mathcal{UC}_{m-m}(C_i, C_j)$
Cohesion [24], [25] (CH) | Data Cohesion of Class | $DCH(C_i) = \frac{\sum_{\forall m \in C_i} \lvert Arg(m) \cap A(C_i) \rvert}{\sum_{\forall m \in C_i} \lvert Arg(m) \rvert}$
Cohesion [24], [25] (CH) | Behavioral Cohesion of Class | $BCH(C_i) = \mathcal{UC}_{m-m}(C_i, C_i)$
Complexity [27], [28] (CX) | Design Complexity* | $DCX(D) = \frac{\left(\lvert D \rvert - \frac{t}{x+y}\right)^2}{\left(\frac{t}{x+y}\right)^2}$
Complexity [27], [28] (CX) | Class Size Complexity* | $CSCX(C_i) = \frac{(\lvert A(C_i) \rvert - x)^2}{a \times x} + \frac{(\lvert M(C_i) \rvert - y)^2}{a \times y}$
Complexity [27], [28] (CX) | Class Size Standard Deviation** | $CSSD(D) = \sqrt{\frac{1}{\lvert D \rvert} \sum_{C_i \in D} \left(\lvert R(C_i) \rvert - AverageClassSize(D)\right)^2}$
Object-Orientedness [29] (OO) | Average Derived Classes for Design*** | $ADC(D) = \frac{\lvert D \rvert - \lvert HierarchyRoots(D) \rvert}{\lvert HierarchyRoots(D) \rvert}$
Object-Orientedness [29] (OO) | Average Hierarchy Height for Design*** | $AHH(D) = \frac{\sum_{C \in HierarchyRoots(D)} DistanceToDeepestDescendent(C)}{\lvert HierarchyRoots(D) \rvert}$

* $x$ and $y$ are gauges of the optimal class size, both modifiable by the user depending on the target system type and requirements. Inspired by [27], [28], we have set $x = 7$ and $y = 9$. $a$ is a degree of freedom that lets the optimization algorithm produce classes of slightly different sizes. We have set $a = 2$.

** The average class size in design D is computed as: $AverageClassSize(D) = \frac{1}{\lvert D \rvert} \times \sum_{C_i \in D} \lvert R(C_i) \rvert$

*** The hierarchy roots of design D are the set of classes in D that are not derived from another class; formally: $HierarchyRoots(D) = \{C \mid C \in D \wedge C \text{ is not a derived class}\}$
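Several of the measures in Table 1 can be computed on a toy design as follows. This is a sketch under stated assumptions: the data layout (a dictionary of classes plus sets of usage pairs), the example classes and all function names are hypothetical, and returning 0 for the data cohesion of a class whose methods take no arguments is a convention chosen here.

```python
from math import sqrt


def usage_between(src_members, dst_members, pairs):
    """Count pairs (p, q) with p in src_members and q in dst_members."""
    return sum(1 for p, q in pairs if p in src_members and q in dst_members)


def coupling(design, uses_attr, calls):
    """Return (MACP, MMCP): usages crossing class boundaries, over pairs i != j."""
    macp = mmcp = 0
    for ci, ki in design.items():
        for cj, kj in design.items():
            if ci != cj:
                macp += usage_between(ki["methods"], kj["attrs"], uses_attr)
                mmcp += usage_between(ki["methods"], kj["methods"], calls)
    return macp, mmcp


def data_cohesion(cls, uses_attr):
    """DCH: share of a class's method arguments that are its own attributes."""
    total = sum(1 for m, _ in uses_attr if m in cls["methods"])
    own = usage_between(cls["methods"], cls["attrs"], uses_attr)
    return own / total if total else 0.0


def behavioral_cohesion(cls, calls):
    """BCH: method-method calls that start and end inside the class."""
    return usage_between(cls["methods"], cls["methods"], calls)


def class_size_complexity(cls, x=7, y=9, a=2):
    """CSCX with the gauges x, y and degree of freedom a from Table 1."""
    n_attrs, n_methods = len(cls["attrs"]), len(cls["methods"])
    return (n_attrs - x) ** 2 / (a * x) + (n_methods - y) ** 2 / (a * y)


def class_size_std(design):
    """CSSD: population standard deviation of responsibilities per class."""
    sizes = [len(c["attrs"]) + len(c["methods"]) for c in design.values()]
    mean = sum(sizes) / len(sizes)
    return sqrt(sum((s - mean) ** 2 for s in sizes) / len(sizes))


# Hypothetical two-class design used only for illustration
design = {
    "Order": {"attrs": {"order_id", "total"}, "methods": {"add_item", "compute_total"}},
    "Customer": {"attrs": {"name"}, "methods": {"place_order"}},
}
uses_attr = {("add_item", "order_id"), ("compute_total", "total"),
             ("place_order", "name"), ("place_order", "order_id")}
calls = {("place_order", "add_item"), ("add_item", "compute_total")}

if __name__ == "__main__":
    print(coupling(design, uses_attr, calls))
    print(data_cohesion(design["Customer"], uses_attr))
    print(class_size_std(design))
```

In this toy design, the only cross-class usages come from `place_order`, so moving that method into `Order` would remove all coupling at the cost of a less meaningful `Customer` class, which is the kind of trade-off the search has to resolve.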

2.4 Ordered Weighted Averaging

One of the concerns in multi-objective optimization is to give appropriate weights to the objectives in the aggregation function. The Ordered Weighted Averaging (OWA) operator, introduced by Yager [30], is a widely accepted and widely used aggregation operator. In OWA, instead of weighting the criteria themselves, weights are placed on the components of the rating vector after a preliminary ranking of the individual ratings [31]. Assume $A_1, A_2, \ldots, A_n$ are the $n$ criteria of concern in a multi-objective problem. Let $X$ be a proposed solution, and for each criterion let $A_i(X) = a_i$, with $a_i \in [0, 1]$ for $1 \le i \le n$, indicate the degree to which $X$ satisfies $A_i$. Let $V(X) = (a_1, a_2, \ldots, a_n)$ be the objective value vector for $X$. Given a weight vector $W = (w_1, w_2, \ldots, w_n)$ for OWA, where $w_i \in [0, 1]$ and $\sum_{i=1}^{n} w_i = 1$, the overall decision function is $D(X) = F(b_1, b_2, \ldots, b_n)$, where $b_i$ is the $i$-th largest value in $V(X)$ and $F$ is a mapping $I^n \to I$ with $I = [0, 1]$. The function $F(b_1, b_2, \ldots, b_n) = \sum_{i=1}^{n} w_i b_i$ is called an OWA operator. Note that the weights are associated with a particular ordered position rather than with a particular objective. In other words, $w_i$ is the weight associated with the $i$-th largest element in $V(X)$, whichever objective it belongs to. A large variety of OWA operators have been proposed in the literature [32], [33].
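A small worked sketch may make the role of the ordering step concrete. The function name `owa` and the example values are chosen here for illustration only.

```python
def owa(values, weights):
    """Ordered Weighted Averaging: weights apply to ranks, not to objectives."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)  # b_1 >= b_2 >= ... >= b_n
    return sum(w * b for w, b in zip(weights, ordered))


if __name__ == "__main__":
    v = [0.6, 0.9, 0.3]  # hypothetical objective values of one solution
    print(owa(v, [0.2, 0.3, 0.5]))  # emphasizes the worst-satisfied objective
    print(owa(v, [1.0, 0.0, 0.0]))  # reduces to max(v)
    print(owa(v, [0.0, 0.0, 1.0]))  # reduces to min(v)
```

With $V(X) = (0.6, 0.9, 0.3)$ and $W = (0.2, 0.3, 0.5)$, the ordered vector is $(0.9, 0.6, 0.3)$ and $D(X) = 0.2 \times 0.9 + 0.3 \times 0.6 + 0.5 \times 0.3 = 0.51$. Permuting the objectives leaves the result unchanged, because only the ranking of the values matters.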

2.5 Semantic Network

A semantic network is a formal representation of knowledge in terms of concepts and their relations. Semantic networks are widely used for inference and knowledge retrieval in artificial intelligence applications [34]. Relations between concepts in semantic networks are usually "synonym", "antonym", "kind of" (a.k.a. "is a"), "part of", "generalization/specialization of", and so on. Lexical databases such as WordNet are a kind of semantic network that is used for general purposes [35].
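As a minimal illustration of this data structure, a semantic network can be stored as labeled edges between concepts and queried by following a relation transitively. The class below is a generic sketch, not the PDSN implementation of this paper; the class name, method names and the banking concepts are all hypothetical.

```python
from collections import defaultdict


class SemanticNetwork:
    """Concepts connected by labeled relations such as 'kind of' or 'part of'."""

    def __init__(self):
        self._edges = defaultdict(set)  # (concept, relation) -> set of concepts

    def add(self, source, relation, target):
        self._edges[(source, relation)].add(target)

    def related(self, source, relation):
        return set(self._edges.get((source, relation), ()))

    def ancestors(self, concept, relation="kind of"):
        """Follow `relation` transitively, e.g. to collect all generalizations."""
        seen, stack = set(), [concept]
        while stack:
            for parent in self._edges.get((stack.pop(), relation), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


if __name__ == "__main__":
    net = SemanticNetwork()
    net.add("savings account", "kind of", "account")
    net.add("account", "kind of", "financial product")
    net.add("account", "part of", "bank")
    print(net.ancestors("savings account"))
    print(net.related("account", "part of"))
```

Relations of the "kind of" type are natural candidates for inheritance between classes, and "part of" relations for composition or aggregation.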

The semantic network used in this study is intended to represent a concept map of the design problem domain. Its building blocks (i.e., concepts and relations) are extracted from the free-format text of the software analysis documents, then enriched where necessary with concepts from lexicons (depending on the source language, e.g., WordNet for English), and finally edited and corrected by domain experts.

3. Related Works

Treating software engineering problems as optimization problems is an idea that emerged in 1976 [36]; after a slow start, it has been used frequently since 1992. This field of study was named Search-Based Software Engineering (SBSE) by Harman and Jones [37] in 2001. A review of SBSE applications and usage has been reported in [7]. An updated report is also maintained by Zhang [38] that covers publications from 1976 to 2013. Until 2013, SBSE had been applied in many software engineering fields such as testing and debugging, management, design, distribution, maintenance, requirement specification and software/program verification.

Software design as an optimization problem has been tackled in studies since 1998. A survey of search-based software design research was made by Räihä [6]. This survey concluded that the number of studies in the area of search-based software design, especially object-oriented design and refactoring, has increased dramatically in recent years. In these studies, the most used search methods are evolutionary algorithms such as the genetic algorithm. In the following, we review some of the recent and prominent studies in automated software design in two categories. The first category uses search methods to find the optimal class diagram, while the second uses NLP methods (and sometimes domain concept models) over requirement documents (RD) to extract a primitive class diagram.

A great number of studies in search-based software design focus on improving an existing class diagram. In the object-oriented paradigm this task is called refactoring [39]. Refactoring at the class diagram level concerns reordering the assignment of responsibilities to classes (a.k.a. class responsibility assignment, or CRA) after modifications and changes to the software. The aim of refactoring is to improve the efficiency of classes and their interactions without altering the software functionality. Bowman et al. [10], [40], Glavaš and Fertalj [41] and Saini and Sharma [42] used meta-heuristic algorithms to solve the CRA problem. Bowman et al. [10], [40] used a multi-objective genetic algorithm with the strength Pareto approach. To evaluate their method, Bowman et al. took a sample system with an optimal class diagram, made some changes to the diagram, and then observed how well the method repaired the changes made to the original class diagram. The objective of Glavaš and Fertalj [41] was to compare different meta-heuristic algorithms to determine which one is more suitable for solving the CRA problem. They found that simulated annealing (SA), hill climbing (HC), particle swarm optimization (PSO) and the genetic algorithm (GA), in that order, produce the best results. Masoud and Jalili [8] proposed a clustering-based algorithm to solve the CRA problem.

In the other approach, some studies tried to design software from scratch, where nothing is available except the software requirements and analysis documents. In this case, since there is no base class diagram, producing a class diagram is much more difficult. Räihä et al. [43], [44] used analysis artifacts for automated software design. They stated that, in automated software design, the most challenging issue is the way functional requirements are represented. They therefore used CRC (Class-Responsibility-Collaboration) cards to build a dependency graph as input to their method, which uses a genetic algorithm as its search method. Simons and Parmee [45]–[47] went further and started from use case specifications. After preprocessing the use case specifications, data members (nouns) are taken as attributes and actions (verbs) as methods. Relations between data members and methods are detected by heuristic rules and are named "use" relations between members of the two sets. Simons and Parmee [45]–[47] relied on interaction with human design experts to guide the search process.


In the second category, using NLP methods and semantic networks to extract an object-oriented class diagram from requirement documents has resulted in the development of several tools. These tools analyze input text using NLP methods and attempt to produce, automatically (in some cases not fully automatically), a class diagram that can be suggested to the human designer. SMART [48] is a semi-automatic tool that uses a heuristic classifier engine programmed in Lisp to extract a class diagram from free-format text. SMART uses patterns to search for object-process relations and produces sentences in object-process language (OPL). NL-OOPS [49], [50] also produces an object model from input requirement documents. NL-OOPS uses LOLITA as its NLP engine alongside a semantic network to recognize objects, but it cannot discriminate classes from attributes. CM-Builder [51] is a CASE tool able to analyze free-format text and perform domain-independent object-oriented analysis. CM-Builder produces a primitive class diagram with candidate classes and candidate relations. Zhou and Zhou [52] proposed a method that uses NLP tools to analyze requirement documents and extract a software class diagram. Their method takes advantage of a domain ontology to improve the efficiency of concept modeling. Kothari [53] used NLP tools to determine parts of speech in the requirement sentences and then applied rules to recognize class candidates.

All of the aforementioned studies in the first category used some level of human design expert interaction. Although the ultimate goal of automated software design is to eliminate the human design expert from the process, this is still a tough goal to reach. Studies in the first category tried to find an optimal class diagram that satisfies software design criteria. In most cases, powerful optimization methods lead to a class diagram with satisfactory metric values, but the classes in the diagram are not recognizable as any real-world concept. This phenomenon (over-optimization) makes the class diagram difficult for humans to understand, and hence the user will not be able to reuse, extend or maintain it. Masoud and Jalili [8], using the clustering-based solution to the CRA problem, showed that less powerful methods may result in designs closer to those of a human design expert. Glavaš and Fertalj [41] concluded that the available software design metrics are not enough to reach a good design and suggested that a semi-automated design method is preferable. We introduce a new metric in our experiment that helps to compensate for the shortcomings of these studies.


Studies in the second category were not concerned with software design metrics. They do not arrive at a final class diagram that satisfies design metrics; instead they produce a suggested class diagram that must be revised by a human design expert. Nevertheless, the idea of using NLP methods to analyze requirement documents and domain ontologies to build concept models is useful for automated software design.


Table 2 gives a brief description of other automated design methods and of the method proposed in this paper from different aspects: approach, form of input and output, level of human interaction, metric suite and extra resources used by the method.


Table 2. A brief description of other automated design methods and the proposed method in this paper, from different aspects.

Human interaction in input phase

SBSE (Refactoring, CRA)

Class Diagram

improved Class Diagram

none

SBSE (CRA)

Responsibility Dependency Graph (RDG)

Class Diagram (an abstracted domain model with nameless classes)

building RDG

SBSE (CRA)

Class Diagram, Sequence Diagram

a set of optimized Class Diagrams

none

Use-Case Document (analyzed manually)

Class Diagram (with nameless classes)

[41] Saini, Sharma

[42] Simons, Parmee [45]–[47]

SBSE (Design)

Masoud, Jalili [8]

Clustering (Design, CRA)

Kothari [53]

primitive Class Diagram

NLP (Design)

Proposed Method

SBSE + NLP + Semantic net (Design)

Textual Use-Case Document


Zhou, Zhou [52]

NLP + Ontology (Design)

primitive Class Diagram

Class Diagram

Human interaction in optimization phase

Human interaction in output phase


Glavaš, Fertalj

Output

none

operation and data element extraction, recognition of dependencies


[10], [40]

Input


Bowman et al.

Approach (aim)


Research

use of design expert to guide the search algorithm use of design expert to guide the search algorithm

Using extra resources

Coupling Cohesion

none

selecting the desired Class Diagram

Coupling Cohesion Modularization quality Coupling Cohesion Dependency

none

Coupling Cohesion Coupling Cohesion Complexity

none

none Preparing the Domain Ontology limited participation of domain expert (optional)

Metric suite

none NA

none

NA

Coupling Cohesion Complexity Object-orientedness Meaningfulness

Domain Ontology Semantic Network (representing domain knowledge)


4. Proposed Method

The proposed method has three stages. The first stage preprocesses the input material and prepares data for the next stages; the second stage searches among design candidates to find a near-optimum solution; and the final stage post-processes the output design to name the classes and recognize the relationships between them. A schematic view of the proposed method is depicted in Figure 2.

Figure 2. The proposed method for the automated software design. Stage 1 (Preprocessing): Requirement and Use Case Documents, NLP Tools, Data, Behavior and Relations, PDSN, Edit & Complete (Optional) by Domain Expert. Stage 2 (Optimization): Optimal Class Diagram production. Stage 3 (Postprocessing): Mapping Algorithm, Concept Mapping, Final Class Diagram.

4.1 Preprocessing: Responsibilities and Relationships Extraction

Software delivers its functionality through services, and each service performs its action by manipulating data elements and/or calling other services. Data elements and services together are called the responsibilities of the software. There is therefore a “usage” relationship between responsibilities. These usage relationships are used to compute metrics over a design in order to evaluate its quality. There are manual methods that help engineers recognize the responsibilities of a software system [54]. These methodologies follow approaches such as grammatical analysis, use-case driven analysis, common class patterns and class-responsibility-collaboration (CRC) cards [55]. Masoud and Jalili [8], [47] used a combination of the grammatical analysis and use-case driven approaches to recognize software responsibilities and relations from requirements documents.


They considered nouns as data elements and verbs as services, recognized responsibilities and their relations manually, and provided them as input to the optimization process. We use the same method of recognition, but automated with NLP methods. In an automated approach, it is clear that not all the words in a requirement document are responsibilities of the anticipated software. How, then, can we filter the responsibilities from the other words? Our idea is that only the words and phrases that participate in a usage relationship are responsibilities.


We consider three kinds of relations: behavioral, semantic and structural. The so-called “usage relations” are behavioral, because they realize the functionalities that the software is meant to accomplish. A semantic relation holds between two words that are synonyms (e.g. “display” and “show”). Semantic relations must be considered because synonymous words may be used to refer to the same concept in the documents, and we have to deal with concepts rather than with the different words for one concept. We use the most frequently occurring word in the document as the representative of the concept it carries, and the other synonyms in the document are semantically related to it. To recognize semantic relations between words we use WordNet 2 [56], a lexicon containing more than 117000 synsets (groups of words referring to the same concept). Structural relations comprise most of the relations between responsibilities. They depict hyponymy and part-meronymy (being part of something), which impose an organization on the concepts. A set of heuristics is used to extract the structural relations from text. A brief description of these heuristics with examples is presented in Table 3.

Table 3. Heuristics for structural relations extraction from free text.

No.  Relation Type                          Rule                                                        Example
1    Generalization/Specialization (is a)   NP* + is a + NP                                             Member is a Customer that …
2    Composition or Whole/Part (part of)    NP + [part of | composed of] + NP                           Financial department is a part of company ….
3    Inclusion (member of)                  NP + [member of | includes] + NP                            Each Customer is a member of a User Group
4    Aggregation (has)                      NP + has + NP                                               Every Customer has a unique IdNo.
5    Association                            [Adj** + NP] | [the + Adj + of the + NP] | [NP + 's + Adj]  The Color of the Car determines …
6    Synonym                                (in the same synset on WordNet)                             Show ≈ Display

*NP: Noun Phrase, **Adj: Adjective
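As an illustration only, rules 1 to 4 of Table 3 can be approximated with regular expressions. The sketch below is a deliberate simplification and not the NLP pipeline used in this paper: a noun phrase is crudely assumed to be a run of capitalized words, a few capitalized determiners are skipped, and the names `WORD`, `NP`, `RULES` and `extract_relations` are hypothetical.

```python
import re

# Crude stand-in for a noun phrase (assumption): one or more capitalized
# words, excluding a few capitalized determiners at the start of a sentence.
WORD = r"\b(?!(?:Each|Every|The|A|An)\b)[A-Z][A-Za-z]*"
NP = rf"({WORD}(?: {WORD})*)"

# Hypothetical regex versions of rules 1-4 of Table 3.
RULES = [
    ("is a", re.compile(rf"{NP} is an? {NP}")),
    ("part of", re.compile(
        rf"{NP} is (?:a )?(?:part of|composed of) (?:an? |the )?{NP}")),
    ("member of", re.compile(rf"{NP} is a member of (?:an? |the )?{NP}")),
    ("has", re.compile(rf"{NP} has (?:an? )?(?:[a-z]+ )?{NP}")),
]


def extract_relations(sentence):
    """Return (left, relation, right) triples found in one sentence."""
    triples = []
    for label, pattern in RULES:
        for match in pattern.finditer(sentence):
            triples.append((match.group(1), label, match.group(2)))
    return triples


for text in ("Member is a Customer that orders cars.",
             "Each Customer is a member of a User Group",
             "Every Customer has a unique IdNo."):
    print(extract_relations(text))
```

A real implementation would rely on a parser to find noun phrases; the regular expressions only show how each rule turns a sentence into a labeled relation between two concepts.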

Eliciting behavioral relations from requirement documents requires a syntactic analysis of the text. Although there is no firm standard for writing use-case descriptions, there are some suggestions [57], [58]. As Cockburn [59] suggests, the textual grammar of sentences in a use-case description should have the form “Subject, Verb, Direct object and Prepositional phrase”. In this form, each sentence says “who did what to whom?”, so these three parts of the sentence (subject, object and verb) are important and may refer to the data elements and services that we are looking for. Using NLP tools, the subject, object, verb and the behavioral relations between them are recognized in each sentence.

After extracting responsibilities and relationships, we collect them into a graph structure that we call the “Problem Domain Semantic Network” (PDSN). Each responsibility is a node of the PDSN, and relationships are edges connecting two related responsibilities. Semantic and structural relations hold between two nodes, but a behavioral relation involves two or more nodes. To place behavioral relations properly on the PDSN, we use more than one edge. This representation puts the subject node in the center: the first edge connects the subject to the verb with a “does” caption, the second connects the subject to the object with an “on” caption, and further edges connect the subject to the other nodes with a “using” caption. Figure 3 shows a partial view of a PDSN built for a case study.

Figure 3. A partial view of the PDSN (for iCoot case study). Nodes shown: Credit, VIN, Person, Color, Customer, Member, Car Model, Car, Rent, Select, Description, Make, Price, Engine Size. Edge captions shown: is a, has, member of, does, on, using.
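The edge scheme described above (one “does” edge, one “on” edge and any number of “using” edges per behavioral relation) can be sketched as a small labeled graph. This is a minimal illustration under our own assumptions, not the JGraphT-based implementation used in this paper; the class name `PDSN`, its methods, and the example behavioral relation are hypothetical.

```python
from collections import defaultdict


class PDSN:
    """Minimal labeled directed graph standing in for the PDSN."""

    def __init__(self):
        self.edges = defaultdict(set)  # node -> {(caption, target node)}

    def add_relation(self, source, caption, target):
        self.edges[source].add((caption, target))
        self.edges[target]  # touch the key so the target exists as a node

    def add_behavior(self, subject, verb, obj=None, others=()):
        """Spread one behavioral relation over several edges from the subject."""
        self.add_relation(subject, "does", verb)
        if obj is not None:
            self.add_relation(subject, "on", obj)
        for other in others:
            self.add_relation(subject, "using", other)

    def targets(self, node, caption):
        return sorted(t for c, t in self.edges[node] if c == caption)


pdsn = PDSN()
pdsn.add_relation("Member", "is a", "Customer")  # structural relation
pdsn.add_relation("Car", "has", "Color")         # structural relation
# Illustrative behavioral relation (not taken from the case study text):
pdsn.add_behavior("Customer", "Rent", obj="Car", others=["Credit"])
print(pdsn.targets("Customer", "on"))
```

Keeping the subject as the hub of every behavioral relation means that the verb, the object and the auxiliary nodes of a sentence can all be reached from the subject in a single step.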

A challenge at this stage is that if the requirement documents are not written properly, the resulting PDSN will not represent the problem domain accurately and hence will be less helpful. Automatic processing requires use case descriptions to be written clearly and unambiguously. Where unwanted ambiguity remains in the use case descriptions, we suggest that a domain expert revise the PDSN for possible errors.

4.2 Optimization: looking for the best design

MAX-MIN Ant System (MMAS) is used as the search engine for detail design optimization. MMAS was selected for several reasons. First, it is a graph-based search algorithm and therefore a good choice for the software design problem, which can easily be represented as a graph. Second, it converges fast. Third, MMAS is well suited to the Quadratic Assignment Problem (QAP) [17], which is similar to the design problem. Fourth, MMAS does not require heuristic information (unlike most other variants of ACO) and converges using only pheromone trail information [14]. Finally, implementation frameworks are available for ACO algorithms, which makes development easy. MMAS is used in both single-objective and multi-objective modes. From this point forward we refer to single-objective MMAS as “sMMAS” and to multi-objective MMAS as “mMMAS”. The mMMAS uses the multi-criterion approach, in which each objective has its own pheromone matrix and ants update each pheromone matrix with the value of the corresponding objective component of the solution.

One important issue in using an ACO algorithm is the number of ants. Usually the number of ants is set equal to the number of problem elements, e.g. the number of cities in TSP [15]. In the software design problem, however, the number of problem elements is relatively high, and following this routine is not reasonable. Besides, since ants in the MMAS implemented in this paper use a local search algorithm, there is no need to employ a large number of ants [17]. After experimental tests, the number of ants was therefore set to a reasonable number proportional to the problem size. Lin-Kernighan is used as the local search by all ants in all iterations. The MMAS uses pheromone re-initialization in the case of stagnation [14].


As described in section 2.1, a software detail design consists of classes and the relations between them. Each class is a bundle of data elements and services, and a design is any possible assignment of data elements and services to classes. Any acceptable representation of the design problem must be able to produce all possible solutions in the search space. Simons and Smith [60] introduced two representations of the design problem for ACO. In the first representation, a candidate solution is a sequence of 𝑑 integers (𝑑 is the number of responsibilities), each taking a value from the set {1, 2, … , 𝐶}, where 𝐶 is the maximum number of classes in the design. Each integer in the sequence represents an assignment 𝑎_𝑖 = 𝑗, interpreted as putting element 𝑖 ∈ {1, 2, … , 𝑑} into class 𝑗 ∈ {1, 2, … , 𝐶}. This representation is called Naïve Grouping (NG). The second representation, called Extended Permutation (XP), is inspired by TSP and the Vehicle Routing Problem (VRP). Using XP, candidate solutions are represented as permutations of a set of (𝑑 + 𝐶 − 1) elements, where the last 𝐶 − 1 elements represent “end of class” markers [60]. The NG representation is of order 𝐶^𝑑 and XP of order 𝐶^(𝑑+𝐶−1). Both NG and XP suffer from redundancy in representation, since the design classes can be labeled in 𝐶! different but equivalent ways.
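To make the two encodings concrete, the following sketch decodes an NG sequence and an XP permutation into groups of responsibilities and shows the labeling redundancy mentioned above. It is only an illustration of the representations as described here; the function names `decode_ng`, `decode_xp` and `canonical` are hypothetical and are not taken from [60].

```python
def decode_ng(assignment):
    """NG: assignment[i - 1] = j puts responsibility i into class j."""
    classes = {}
    for responsibility, cls in enumerate(assignment, start=1):
        classes.setdefault(cls, []).append(responsibility)
    return classes


def decode_xp(permutation, d):
    """XP: values 1..d are responsibilities, values above d end a class."""
    classes, current = [], []
    for element in permutation:
        if element > d:          # "end of class" marker
            classes.append(current)
            current = []
        else:
            current.append(element)
    classes.append(current)
    return classes


def canonical(groups):
    """Label-free view of a design: a set of responsibility sets."""
    return frozenset(frozenset(g) for g in groups if g)


ng_a = decode_ng([1, 2, 1, 3])           # classes 1, 2 and 3
ng_b = decode_ng([2, 1, 2, 3])           # same grouping, labels swapped
xp = decode_xp([1, 3, 5, 2, 6, 4], d=4)  # 5 and 6 are the C - 1 markers
print(ng_a, xp)
print(canonical(ng_a.values()) == canonical(ng_b.values()) == canonical(xp))
```

The last line prints `True`: three different encodings describe the same design, which is the redundancy that both representations suffer from.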


We developed a novel representation for the design problem: a Directed Acyclic Graph (DAG) whose nodes are arranged in two dimensions. Columns represent responsibilities and rows represent classes, so selecting a node represents a possible assignment of a responsibility to a class. Each node is connected by a directed edge to every node in the next column. There is also a start node that points to all nodes in the first column. In this way, a path that starts from the start node, passes through every column and ends at a node in the last column is a possible design solution. A sample path and its equivalent design are shown in Figures 4 and 5, respectively. All edges are directed from left to right, but for the sake of clarity the directions are not shown in Figure 4. Ants search this DAG for a path with the desired objective features. In most applications of ant colony algorithms, ants deposit pheromone on edges, but in our representation pheromone is deposited on the nodes. This representation is theoretically similar to NG, but its advantage is that the DAG prevents ants from building cycles in their path, so they do not need to check for a cycle every time they take another step forward.

AC

SN

A1 (R1)

A2 (R2)

A3 (R3)

A4 (R4)

M1 (R5)

M2 (R6)

M3 (R7)

M4 (R8)

M5 (R9)

C1

C2

C3

R: Responsibility A: Attribute M: Method C: Class SN: Start Node

ACCEPTED MANUSCRIPT

Figure 4. A sample view of DAG, representation of software design problem for ACO.

      m1  m2  m3  m4  m5
a1     0   1   1   0   0
a2     0   0   1   1   1
a3     0   1   0   0   0
a4     1   0   0   0   0
m1     0   0   0   0   0
m2     0   0   1   0   0
m3     0   0   0   0   0
m4     0   0   0   0   1
m5     0   0   0   0   0

Figure 5. (Left) Class diagram equivalent to the path distinguished in Figure 4. (Right) The usage relations for a design problem that leads to the path in Figure 4. In this table, each “1” in the (i, j)th entry shows that there is a usage relation between the method in the ith column and the attribute/method in the jth row.
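The DAG representation can be sketched as a matrix of node pheromone values with one column per responsibility and one row per class. The code below is a simplified illustration under our own assumptions, not the implementation used in this paper: an ant picks one node per column with probability proportional to the node pheromone, and the resulting path is read back as a design. The names `build_path` and `path_to_design` are hypothetical.

```python
import random


def build_path(pheromone, rng):
    """pheromone[col][row] is the pheromone on the node that assigns
    responsibility `col` to class `row`; one node is chosen per column."""
    path = []
    for column in pheromone:
        rows = range(len(column))
        path.append(rng.choices(rows, weights=column, k=1)[0])
    return path


def path_to_design(path):
    """Group responsibilities (column indices) by the class row chosen."""
    design = {}
    for responsibility, cls in enumerate(path):
        design.setdefault(cls, []).append(responsibility)
    return design


rng = random.Random(42)
# Three responsibilities, three candidate classes, uniform pheromone.
uniform = [[1.0, 1.0, 1.0] for _ in range(3)]
path = build_path(uniform, rng)
print(path, path_to_design(path))
```

Because every step moves to the next column, a path can never revisit a node, which is why no cycle check is needed during construction.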


We set a constraint in the optimization process: each class is forced to have at least one attribute and one method. This constraint prevents ants from building class diagrams containing classes with no attributes, no methods or only one responsibility. Another benefit of this constraint is that it reduces the maximum possible number of classes in a design to 𝑚𝑎𝑥{𝑘, 𝑙}, where 𝑘 is the number of attributes and 𝑙 is the number of methods in the design.
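The constraint can be checked on a candidate design with a few lines of code. This is a minimal sketch under our own assumptions about the data layout: `design` maps each class to its responsibilities, and `kinds` marks every responsibility as an attribute ("A") or a method ("M"). The function name `satisfies_constraint` is hypothetical.

```python
def satisfies_constraint(design, kinds):
    """True if every class holds at least one attribute and one method."""
    for members in design.values():
        member_kinds = {kinds[m] for m in members}
        if not {"A", "M"} <= member_kinds:
            return False
    return True


kinds = {"a1": "A", "a2": "A", "m1": "M", "m2": "M"}
print(satisfies_constraint({"C1": ["a1", "m1"], "C2": ["a2", "m2"]}, kinds))
print(satisfies_constraint({"C1": ["a1", "a2"], "C2": ["m1", "m2"]}, kinds))
```

The first design is accepted, while the second is rejected because one class has no method and the other has no attribute.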


Once ants have built their solutions, the fitness of each solution is computed with the objective function. The objective function is an ordered weighted average (OWA) of the software design metrics (described in Table 1) plus a new metric introduced in this paper, the Meaningfulness metric (MM). Meaningfulness measures how understandable a class diagram and its elements are to humans. This metric can be computed with the help of the PDSN. It is a linear combination of two measures: (1) design elements meaningfulness and (2) traceability. We use a mapping algorithm to find the nearest counterpart of a class in the PDSN. In the mapping algorithm, we compute the similarity of the class (from the automated design) to the concept (from the PDSN). This similarity is computed from the number of class members that are related to the counterpart concept in the PDSN by a close relationship, no more than three steps away. The average similarity value of the classes in a design is used as the Design Elements Meaningfulness (DEM) measure. The traceability measure is the number of domain concepts that appear in the automated design; it is computable after mapping. The meaningfulness measures cannot be expressed formally, so we created procedures to compute them. The pseudo-codes are presented in Figures 6 and 7. A more detailed description of the Mapping and Similarity procedures is provided in section 4.3.


For the mMMAS we use the multi-criterion ant approach, which maintains multiple pheromone matrices, one per objective. Since we have eleven measures to optimize, and maintaining eleven pheromone matrices during optimization causes memory shortage and performance issues, a combination of multi-objective optimization and weighted aggregation is used. The measures from Table 1 are aggregated in groups to produce metric values, and the metric values are then optimized by mMMAS. Although simple averaging is possible and practical, the aggregation of measures here is a weighted aggregation in which the weights are configured by the user depending on the problem size and type.


Among the software design metrics used in the proposed method, coupling and complexity should be minimized, while cohesion, object-orientedness and meaningfulness should be maximized. Since MMAS needs a minimization cost function, we use (1 − [Maximization Criteria]) values for the three maximization metrics. To prevent any objective value from being suppressed by the others, the normalized value of each metric is used in the aggregation function.


Relation (2) shows the overall fitness function for the optimization stage. The weights 𝑤𝑖, 𝑖 = 1, …, 5, are determined by the OWA operator and satisfy the constraints described for OWA weights in section 2.4.

$$f(D) = w_1 \times CP + w_2 \times CX + w_3 \times (1 - CH) + w_4 \times (1 - OO) + w_5 \times (1 - MM) \tag{2}$$

A simple OWA operator is used to weight the objective function. The OWA weight vector is generated using relation (3). This weight generator produces weights in descending order with a semi-exponential distribution. For 𝑛 = 5, the generated weight vector is 𝑊 = {0.3, 0.25, 0.2, 0.15, 0.1}.

$$w_i = \frac{1}{n} + g(i) \tag{3}$$

where $n$ is the number of objectives in the fitness function,

$$g(i) = \begin{cases} \dfrac{1}{n 2^{i}} & i < \dfrac{n+1}{2} \\[2mm] 0 & i = \dfrac{n+1}{2} \\[2mm] \dfrac{-1}{n 2^{\,n+1-i}} & i > \dfrac{n+1}{2} \end{cases} \quad \text{when } n \text{ is an odd number,}$$

and

$$g(i) = \begin{cases} \dfrac{1}{n 2^{i}} & i < \dfrac{n}{2} \\[2mm] \dfrac{-1}{n 2^{\,n-i}} & i > \dfrac{n}{2} \end{cases} \quad \text{when } n \text{ is an even number.}$$
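As a quick check of relations (2) and (3), the sketch below generates the weight vector for an odd number of objectives and evaluates the fitness of a design from its five normalized metric values. It is an illustration rather than the implementation used in this paper: only the odd case of 𝑔(𝑖) is covered, the weights are applied to the metrics in the fixed order written in relation (2), and the function names `owa_weights` and `fitness` are hypothetical.

```python
def owa_weights(n):
    """Weights w_i = 1/n + g(i) for odd n, following relation (3)."""
    if n % 2 == 0:
        raise ValueError("this sketch only covers an odd number of objectives")
    middle = (n + 1) // 2
    weights = []
    for i in range(1, n + 1):
        if i < middle:
            g = 1 / (n * 2 ** i)
        elif i == middle:
            g = 0.0
        else:
            g = -1 / (n * 2 ** (n + 1 - i))
        weights.append(1 / n + g)
    return weights


def fitness(cp, cx, ch, oo, mm, weights):
    """Cost of a design, relation (2); all metric values lie in [0, 1]."""
    costs = [cp, cx, 1 - ch, 1 - oo, 1 - mm]
    return sum(w * c for w, c in zip(weights, costs))


w = owa_weights(5)
print([round(x, 2) for x in w])
print(round(fitness(0.2, 0.4, 0.9, 0.7, 0.8, w), 4))
```

For 𝑛 = 5 the rounded weights reproduce the vector 𝑊 given above, and they sum to one, so the cost of a design always stays between 0 (best) and 1 (worst).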


MMAS uses the fitness function as a cost function to determine the amount of pheromone deposited by the best ant (either the best of the iteration or the best so far). For sMMAS, ants use relation (2) to determine the amount of pheromone deposited on the path they build. In mMMAS, where the pheromone matrices are separate for the different objectives, the five ants that are best in each objective are selected to update the corresponding pheromone matrices. In the solution construction phase, however, each ant combines pheromone information from all pheromone matrices to build a solution. For mMMAS we use relation (2) to rank the solutions in the non-dominated set and then select the single best design.

4.3 Post Processing: Concept Mapping

The output of the optimization stage is a software design solution that resembles a clustering of responsibilities. The clusters are nameless classes with only usage relations (associations) between them. In the last stage we must name the classes and recognize structural relations in order to make the resulting design more understandable to humans. To do so, we use a mapping between PDSN concepts and design classes. The mapping procedure finds the most similar PDSN concept for each class. Pseudo-codes for the mapping procedure and the similarity computation are presented in Figures 6 and 7, respectively. The mapping procedure starts with a class in the design and, for each attribute in the selected class, collects a set of concepts from the PDSN (where possible). The collected concepts are either a super-part or a super-class of the attribute. A super-part is a concept of which the given attribute is a part (e.g. car is a super-part of wheel), and a super-class is a concept of which the attribute is a kind (e.g. vehicle is a super-class of car). Finally, the concept with the maximum number of occurrences among the concepts collected for the class is selected as the candidate concept of the class.


When a candidate concept has been found for a class in the design, the similarity procedure is applied to determine how similar the class is to the concept. This degree of similarity is used in the computation of the Meaningfulness measure. The similarity procedure takes the class and its corresponding concept in the PDSN (which we call the class-concept) and, for each attribute in the class, finds the distance between the attribute-concept and the class-concept. The attribute-concept is the concept node of the PDSN that is most similar to the attribute. Since the concept name and the attribute name are usually the same, the similarity procedure uses name matching to find the attribute-concept. The farther the attribute-concept is from the class-concept, the smaller the value added to the similarity. After the similarity values have been summed, the total is normalized by the size of the class (the number of attributes).


For example, Figure 8 shows a sample run of the mapping procedure on a small part of an automatically designed class diagram. “Class #2” and “Class #4” in Figure 8-A are part of a class diagram produced by the optimization stage, which contains nameless classes. The Map procedure builds the attribute-concept lists for each class, as shown in Figure 8-B, and suggests “CarModel” for “Class #4” and “Car” for “Class #2”, based on the PDSN presented in Figure 3. After the classes are named with the suggested concept names, the relation between the two concept nodes in the PDSN (an aggregation) is established between the classes in the diagram (see Figure 8-C).


After the mapping process, the PDSN concepts corresponding to the design classes are known and the design classes are named after them. Finally, the structural relationships between PDSN concepts are assigned to the corresponding design classes. The same mapping process is used to compute the values of the meaningfulness and object-orientedness metrics.


Map(DesignDiagram D, SemanticNet PDSN)
Begin
  for each C in D do                       //C is a Class in Design diagram D
    for each attribute in C do
      attribute.conceptList = ∅
      concept = find attribute or its synonym in PDSN;
      if concept has a partHolonymConcept then
        add its partHolonymConcept to attribute.conceptList;
      if concept has an inheritedHypernym then
        add its inheritedHypernym to attribute.conceptList;
    end for
    C.CandidateConcept = find maxOccurredConcept in all attribute.conceptLists
    if 2 or more CandidateConcepts are retrieved (with the same value of occurrence) then
      C.CandidateConcept = select one that is higher in hierarchyTree;
  end for
End

Figure 6. Pseudo-code for Mapping procedure.
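The mapping procedure of Figure 6 can be sketched in a few lines. The code below is an illustration under our own assumptions rather than the Java implementation used in this paper: the PDSN is reduced to two hypothetical dictionaries, `part_of` (attribute concept to its holonym) and `is_a` (concept to its hypernym), synonyms are resolved through a hypothetical `synonyms` dictionary, and ties are broken with a hypothetical `depth` dictionary in which a smaller value means higher in the hierarchy.

```python
from collections import Counter


def map_classes(design, part_of, is_a, synonyms=None, depth=None):
    """Suggest one PDSN concept per class (None if nothing is found)."""
    synonyms = synonyms or {}
    depth = depth or {}
    suggestions = {}
    for cls, attributes in design.items():
        counts = Counter()
        for attribute in attributes:
            concept = synonyms.get(attribute, attribute)
            if concept in part_of:       # super-part of the attribute
                counts[part_of[concept]] += 1
            if concept in is_a:          # super-class of the attribute
                counts[is_a[concept]] += 1
        if not counts:
            suggestions[cls] = None
            continue
        best = max(counts.values())
        tied = [c for c, n in counts.items() if n == best]
        # On a tie, prefer the concept that is higher in the hierarchy.
        suggestions[cls] = min(tied, key=lambda c: depth.get(c, 0))
    return suggestions


# Attribute-to-concept relations following the example of Figure 8.
part_of = {"Price": "CarModel", "Engine Size": "CarModel",
           "Description": "CarModel", "Make": "CarModel",
           "Color": "Car", "VIN": "Car"}
design = {"Class #2": ["Color", "VIN"],
          "Class #4": ["Price", "Engine Size", "Description", "Make"]}
print(map_classes(design, part_of, is_a={}))
```

With these inputs the sketch suggests “Car” for “Class #2” and “CarModel” for “Class #4”, matching the example discussed for Figure 8.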


ComputeSimilarity(Class C, Concept concept)
//C is a Class in Design diagram D and Concept is its counterpart concept in PDSN
Begin
  similarity = 0;
  for each attribute in C do
    find concept and attribute-concept in PDSN;
    if these two concepts have a common concept in relation within max 1 step then
      similarity += 1;
    else if in max 3 steps then
      similarity += 0.5;
  end for;
  similarity = similarity / numberOfAttributesInClass(C);
  return similarity;
End

Figure 7. Pseudo-code for computing the similarity of a class to its counterpart concept in PDSN.
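The similarity computation of Figure 7 can be sketched as follows. This is one possible reading rather than the implementation used in this paper: the distance between the attribute-concept and the class-concept is assumed to be the shortest-path length in an undirected view of the PDSN, attributes are matched to concepts by name, and the names `distance`, `compute_similarity` and `graph` are hypothetical.

```python
from collections import deque


def distance(graph, start, goal):
    """Shortest number of steps between two concepts, or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, steps = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return steps + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, steps + 1))
    return None


def compute_similarity(attributes, class_concept, graph):
    """Average closeness of the attributes to the class-concept."""
    if not attributes:
        return 0.0
    similarity = 0.0
    for attribute in attributes:
        steps = distance(graph, attribute, class_concept)
        if steps is not None and steps <= 1:
            similarity += 1
        elif steps is not None and steps <= 3:
            similarity += 0.5
    return similarity / len(attributes)


# Small undirected excerpt in the spirit of Figure 3 (illustrative).
graph = {"Car": ["Color", "VIN", "CarModel"], "Color": ["Car"],
         "VIN": ["Car"], "CarModel": ["Car", "Price"], "Price": ["CarModel"]}
print(compute_similarity(["Color", "VIN"], "Car", graph))
print(compute_similarity(["Color", "Price"], "Car", graph))
```

The first call prints 1.0 because both attributes are one step from “Car”; the second prints 0.75 because “Price” is two steps away and contributes only 0.5.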

Class #4
Attribute      Attribute-Concept List
Price          →{ CarModel }
Engine Size    →{ CarModel }
Description    →{ CarModel }
Make           →{ CarModel }

Class #2
Attribute      Attribute-Concept List
Color          →{ Car }
VIN            →{ Car }

Figure 8. An example for mapping procedure. (A). Automatically generated classes. (B). Concept Mapping process. (C). Final classes after naming and relation recognition.

5. Experimental Setup


The proposed method is evaluated from three aspects. First, we investigate the power of the method in finding a near-optimum class diagram. Reporting the metric values obtained by the optimization process, across the variety of measures, shows that the proposed method is able to find near-optimum values for each measure. Second, comparing the automated design result with a human expert design requires an evaluation method; the computed closeness to the expert design indicates how realistic and usable the automated design output is. The third aspect is to investigate the effect of the main contribution of this paper: using a semantic network of problem domain knowledge in automated software design. After a brief description of the implementation, parameter tuning and case studies, the results of the evaluations are presented.

5.1 Implementation

Most of the applications are implemented in Java. The NLP tool used for text analysis is Semantic Role Labeler (SRL) [61], a shallow semantic part-of-speech tagger developed at the University of Illinois. SRL not only recognizes all parts of speech in a sentence but also shows the semantic connections between words. SRL helps us recognize behavioral relations between responsibilities automatically. The PDSN is implemented using JGraphT [62], a graph library, together with another Java application capable of querying it. The MMAS implementation is based on JACSF [63], a Java framework for ACO algorithms, with the necessary customizations. The JAWS [64] library is used to query the WordNet 2.1 lexicon. The remaining applications that tie the above-mentioned tools together were implemented in Java by the authors.

5.2 MMAS Parameter Tuning


We ran experimental tests to investigate the effect of parameter variation on the MMAS results. The ranges of parameter values tested are shown in Table 4. A good analysis of ACO parameter tuning for the automated design problem is also given by Simons and Smith [60]. Although the ACO variant and the problem representation used in this study differ from those in [60], the effects of parameter variation on the effectiveness of the algorithm are quite similar.

Table 4. The range of parameter values tested for MMAS algorithm parameter tuning.

Parameter                                Range
Maximum Iteration                        100, 150, 300, 500, 1000, 2000
Number of Ants                           4, 8, 12, 20, 30, 40, 50, 100
Lower Initial Pheromone limit (𝜏𝑚𝑖𝑛)      0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1
Upper Initial Pheromone limit (𝜏𝑚𝑎𝑥)      1, 2, 5, 10, 15
Pheromone evaporation rate               0.01, 0.02, 0.05, 0.1
Pheromone reset                          Yes/No


The number of ants employed in MMAS is restricted to a fraction of the number of problem elements (i.e. 20 ants for small-scaled problems, having at most 60 responsibilities, and 50 ants for middle-scaled problems, having 60~150 responsibilities). After experimental tests, the pheromone trail bounds and the pheromone evaporation rate were set to [0.01, 5] and 0.02, respectively. A brief list of settings is reported in Table 5. The sMMAS uses one pheromone matrix and one colony of ants, while mMMAS uses multiple pheromone matrices.
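To show how the evaporation rate and the trail bounds interact, the following sketch applies one pheromone update to node pheromone values stored as `pheromone[column][row]`. It is a simplified illustration and not the customized framework code used in this paper: the deposit of 1/cost by the best ant is the usual MMAS choice and is assumed here, the bounds are kept fixed, and the function name `update_pheromone` is hypothetical.

```python
def update_pheromone(pheromone, best_path, best_cost,
                     rho=0.02, tau_min=0.01, tau_max=5.0):
    """Evaporate all nodes, reinforce the best path, clamp to the bounds."""
    deposit = 1.0 / best_cost  # assumed deposit rule (standard in MMAS)
    for col, column in enumerate(pheromone):
        for row in range(len(column)):
            value = (1 - rho) * column[row]
            if best_path[col] == row:
                value += deposit
            column[row] = min(tau_max, max(tau_min, value))
    return pheromone


# Two responsibilities (columns) and two candidate classes (rows).
pheromone = [[5.0, 0.01], [1.0, 1.0]]
update_pheromone(pheromone, best_path=[0, 1], best_cost=0.5)
print(pheromone)
```

Clamping keeps every node selectable (no value drops below 0.01) and prevents a single path from dominating too early (no value exceeds 5), which is the mechanism MMAS uses against premature stagnation.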

As in standard MMAS, ants use a coefficient to choose probabilistically between exploration (selecting the next node probabilistically) and exploitation (using pheromone traces to select the best next node). In our implementation this coefficient is controlled dynamically by a parameter. As the iterations go on, the exploration rate is decreased linearly in favor of exploitation. The two rates always satisfy exploration rate + exploitation rate = 1, and the transition happens gradually between the starting iteration and the expected ending iteration.
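The linear transition from exploration to exploitation can be sketched as a simple schedule. The code below is an illustration under stated assumptions, not the implementation used in this paper: the start and end rates (1.0 and 0.0) are placeholders, since the actual values are not given here, and the function names `exploration_rate` and `choose_row` are hypothetical.

```python
import random


def exploration_rate(iteration, start_iter, end_iter,
                     start_rate=1.0, end_rate=0.0):
    """Linearly interpolate the exploration rate between two iterations."""
    if iteration <= start_iter:
        return start_rate
    if iteration >= end_iter:
        return end_rate
    fraction = (iteration - start_iter) / (end_iter - start_iter)
    return start_rate + fraction * (end_rate - start_rate)


def choose_row(column_pheromone, rate, rng):
    """Explore with probability `rate`, otherwise exploit the best node."""
    rows = range(len(column_pheromone))
    if rng.random() < rate:
        return rng.choices(rows, weights=column_pheromone, k=1)[0]
    return max(rows, key=lambda r: column_pheromone[r])


rng = random.Random(0)
for iteration in (0, 500, 1000):
    rate = exploration_rate(iteration, start_iter=0, end_iter=1000)
    print(iteration, rate, 1 - rate, choose_row([0.1, 0.7, 0.2], rate, rng))
```

The exploitation rate is simply one minus the exploration rate, so by the expected ending iteration every ant follows the node with the highest pheromone value.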

Table 5. MMAS algorithm configuration

| Parameter | Quantity | Description |
| --- | --- | --- |
| Number of Pheromone Matrices (Single-Objective) | 1 | - |
| Number of Pheromone Matrices (Multi-Objective) | 5 | One per objective. |
| Maximum Iteration | 1000 | - |
| Number of Ants | 20~50 | 20 for small-scaled problems, having at most 60 responsibilities and 50 ants for middle-scaled problems, having 60~150 responsibilities |
| Ant selection for Updating Pheromone | Random selection between: best of iteration and best so far | Selection is dynamic. There is a better chance to iteration best at start and then best so far at the end of the optimization. |
| Initial pheromone limits | [0.01, 5] | The pheromone limits are dynamically updated as algorithm goes on. |
| Pheromone evaporation rate | 0.02 | - |
| Local search | Lin–Kernighan (partial 2-opt) | Rate = 0.2 |

5.3 Case Studies

To evaluate the proposed method, we applied it to four case studies. Table 6 gives a brief description of the problems used as case studies. Two of the four problems are small-scaled and the other two are middle-scaled. The first three problems are used in the literature [60] as case studies for automated design. The last one (iCoot) was selected because its level of documentation is acceptable [4]. The iCoot documentation is written in fluent English, which makes it suitable for evaluating the proposed method. It is worth mentioning that a few ambiguous sentences were disambiguated by the authors before being used in the evaluation process.

Table 6. Case studies

| Title | Number of classes | Number of data elements | Number of operations | Number of responsibilities | Number of usages |
| --- | --- | --- | --- | --- | --- |
| Cinema Booking System (CBS) | 5 | 16 | 15 | 31 | 39 |
| Graduate Development Program (GDP) | 5 | 43 | 12 | 55 | 107 |
| Select Cruises (SC) | 16 | 62 | 30 | 108 | 141 |
| iCoot | 21 | 43 | 40 | 83 | 172 |

6. Results

Analysis of the results for the sMMAS and the mMMAS confirms that software design is a multi-objective task and that the sMMAS struggles to find the best design in a reasonable number of iterations. Although the time and space costs of the mMMAS are higher than those of the sMMAS, the better results of the mMMAS justify its use. Tables 7-10 show the results of the sMMAS compared with the mMMAS. Tests were performed in 30 independent runs, and the best value, the average value and the standard deviation of the five metrics are reported. The small arrow accompanying each measure name in the tables indicates whether the criterion is to be maximized (up arrow) or minimized (down arrow).

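Table 7. Metric values for CBS case study for single and multi-objective MMAS in 30 independent runs. Bold values are the best of two methods.

| Criteria | Metric | sMMAS Best | sMMAS Avg. | sMMAS St.D. | mMMAS Best | mMMAS Avg. | mMMAS St.D. | Expert |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Coupling | MACP ↓ | 0.08 | 0.10 | 0.01 | 0.08 | 0.10 | 0.01 | 0.10 |
| Coupling | MMCP ↓ | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Cohesion | DCH ↑ | 0.93 | 0.89 | 0.05 | 0.93 | 0.91 | 0.03 | 0.89 |
| Cohesion | BCH ↑ | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Complexity | DCX ↓ | 1.13 | 1.18 | 0.05 | 1.13 | 1.15 | 0.03 | 0.82 |
| Complexity | CSCX ↓ | 2.52 | 3.45 | 0.09 | 2.52 | 3.01 | 0.05 | 3.13 |
| Complexity | CSSD ↓ | 2.75 | 2.84 | 0.08 | 2.75 | 2.78 | 0.05 | 1.47 |
| Object-Oriented | ADC ↑ | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Object-Oriented | AHH ↑ | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Meaningfulness | Meaningfulness ↑ | 0.82 | 0.82 | 0 | 0.82 | 0.82 | 0 | 0.94 |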

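Table 8. Metric values for GDP case study for single and multi-objective MMAS in 30 independent runs. Bold values are the best of two methods.

| Criteria | Metric | sMMAS Best | sMMAS Avg. | sMMAS St.D. | mMMAS Best | mMMAS Avg. | mMMAS St.D. | Expert |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Coupling | MACP ↓ | 0.18 | 0.23 | 0.08 | 0.18 | 0.20 | 0.03 | 0.32 |
| Coupling | MMCP ↓ | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Cohesion | DCH ↑ | 0.82 | 0.79 | 0.12 | 0.82 | 0.81 | 0.05 | 0.81 |
| Cohesion | BCH ↑ | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Complexity | DCX ↓ | 0.16 | 0.19 | 0.03 | 0.12 | 0.16 | 0.02 | 0.21 |
| Complexity | CSCX ↓ | 2.56 | 2.88 | 0.26 | 2.25 | 2.75 | 0.19 | 3.62 |
| Complexity | CSSD ↓ | 1.45 | 1.89 | 0.34 | 0.98 | 1.54 | 0.26 | 4.47 |
| Object-Oriented | ADC ↑ | 0.20 | 0.05 | 0.20 | 0.20 | 0.08 | 0.22 | 0 |
| Object-Oriented | AHH ↑ | 0.20 | 0.05 | 0.20 | 0.20 | 0.08 | 0.22 | 0 |
| Meaningfulness | Meaningfulness ↑ | 0.64 | 0.56 | 0.75 | 0.72 | 0.68 | 0.35 | 0.88 |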

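Table 9. Metric values for SC case study for single and multi-objective MMAS in 30 independent runs. Bold values are the best of two methods.

| Criteria | Metric | sMMAS Best | sMMAS Avg. | sMMAS St.D. | mMMAS Best | mMMAS Avg. | mMMAS St.D. | Expert |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Coupling | MACP ↓ | 0.23 | 0.20 | 0.04 | 0.15 | 0.18 | 0.02 | 0.18 |
| Coupling | MMCP ↓ | 0.04 | 0.04 | 0.01 | 0.02 | 0.04 | 0.01 | 0.02 |
| Cohesion | DCH ↑ | 0.69 | 0.65 | 0.06 | 0.82 | 0.77 | 0.06 | 0.79 |
| Cohesion | BCH ↑ | 0.12 | 0.09 | 0.00 | 0.12 | 0.10 | 0.00 | 0.12 |
| Complexity | DCX ↓ | 1.22 | 1.72 | 0.41 | 1.12 | 1.65 | 0.40 | 1.83 |
| Complexity | CSCX ↓ | 2.78 | 2.98 | 0.21 | 2.44 | 2.67 | 0.18 | 2.64 |
| Complexity | CSSD ↓ | 1.33 | 1.75 | 0.16 | 1.28 | 1.32 | 0.09 | 4.27 |
| Object-Oriented | ADC ↑ | 0.33 | 0.28 | 0.19 | 0.33 | 0.30 | 0.12 | 0.23 |
| Object-Oriented | AHH ↑ | 0.33 | 0.28 | 0.19 | 0.33 | 0.30 | 0.12 | 0.23 |
| Meaningfulness | Meaningfulness ↑ | 0.69 | 0.61 | 0.23 | 0.73 | 0.64 | 0.18 | 0.77 |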

Table 10. Metric values for iCoot case study for single and multi-objective MMAS in 30 independent runs. Bold values are the best of two methods.

| Criteria | Metric | sMMAS Best | sMMAS Avg. | sMMAS St.D. | mMMAS Best | mMMAS Avg. | mMMAS St.D. | Expert |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Coupling | MACP ↓ | 0.20 | 0.25 | 0.03 | 0.15 | 0.19 | 0.03 | 0.17 |
| Coupling | MMCP ↓ | 0.05 | 0.08 | 0.02 | 0.00 | 0.06 | 0.02 | 0.05 |
| Cohesion | DCH ↑ | 0.75 | 0.70 | 0.05 | 0.85 | 0.81 | 0.02 | 0.82 |
| Cohesion | BCH ↑ | 0.46 | 0.45 | 0.12 | 0.56 | 0.51 | 0.09 | 0.56 |
| Complexity | DCX ↓ | 0.42 | 0.47 | 0.01 | 0.39 | 0.45 | 0.02 | 0.57 |
| Complexity | CSCX ↓ | 1.28 | 1.44 | 0.21 | 1.11 | 1.19 | 0.09 | 2.76 |
| Complexity | CSSD ↓ | 3.22 | 3.54 | 0.18 | 3.02 | 3.12 | 0.05 | 3.68 |
| Object-Oriented | ADC ↑ | 0.16 | 0.12 | 0.22 | 0.16 | 0.14 | 0.18 | 0.10 |
| Object-Oriented | AHH ↑ | 0.21 | 0.18 | 0.22 | 0.21 | 0.20 | 0.18 | 0.10 |
| Meaningfulness | Meaningfulness ↑ | 0.88 | 0.85 | 0.12 | 0.92 | 0.90 | 0.10 | 0.92 |

As there are no method-method calls in either CBS or GDP, the MMCP and BCH measure values are 0 for these case studies. As shown in Tables 7-10, the mMMAS achieves a better result than the sMMAS in most cases, although in a few cases the two are equal. The mMMAS usually has a better average value than the sMMAS, which indicates that the mMMAS produces better results more frequently and is thus more reliable than the sMMAS. Figures 9 and 10 show the progress of two typical runs of the sMMAS and the mMMAS on the CBS case study. The mMMAS converges faster than the sMMAS because its ants take advantage of multiple pheromone matrices, each specialized in its own objective, whereas in the sMMAS conflicting objectives counterbalance each other, which leads to slower convergence. The jumps in the diagrams are the result of pheromone reinitialization.

Figure 9. A typical run for sMMAS (coupling value) on CBS case study. [Plot: coupling value versus iteration number, iterations 1 to 501.]

Figure 10. A typical run for mMMAS (coupling value) on CBS case study. [Plot: coupling value versus iteration number, iterations 1 to 501.]

6.1 Comparison With Human Expert Design

To compare the proposed automated design method with human expert design, we use a form of F-Score, a measure of a test’s accuracy drawn from information retrieval and statistical analysis. We adopted the F-Score computation from [8], which was introduced specifically for computing the degree of similarity between an automated design and a human expert design. An F-Score of 1 indicates that the two designs are identical (thus the F-Score of the expert design is equal to 1), and 0 indicates the absence of any similarity between the two designs. Relation (4) shows the F-Score computation formula, where AD stands for automated design and HD for human design, Ca is any class from the automated design and Ch is any class from the human design.

$$FScore(AD) = \sum_{\forall Ca_i \in AD} \frac{|R(Ca_i)|}{r} \times \max_{\forall Ch_j \in HD} \left\{ FMeasure(Ca_i, Ch_j) \right\} \qquad (4)$$

Where:

$$FMeasure(Ca_i, Ch_j) = 2 \times \frac{Recall(Ca_i, Ch_j) \times Precision(Ca_i, Ch_j)}{Recall(Ca_i, Ch_j) + Precision(Ca_i, Ch_j)},$$

$$Precision(Ca_i, Ch_j) = \frac{|R(Ca_i) \cap R(Ch_j)|}{|R(Ca_i)|}, \qquad Recall(Ca_i, Ch_j) = \frac{|R(Ca_i) \cap R(Ch_j)|}{|R(Ch_j)|}$$
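To make Relation (4) concrete, the following sketch computes the F-Score for two small designs. It assumes that R(C) is the set of responsibilities assigned to class C and that r is the total number of responsibilities (so that an identical design scores 1); both readings are inferred from the formula rather than stated in the text. The toy designs and the function names are hypothetical.

```python
def f_measure(auto_class, human_class):
    """Harmonic mean of precision and recall between two sets of responsibilities."""
    shared = len(auto_class & human_class)
    if shared == 0:
        return 0.0
    precision = shared / len(auto_class)
    recall = shared / len(human_class)
    return 2 * precision * recall / (precision + recall)


def f_score(automated_design, human_design):
    """Relation (4): size-weighted best match of each automated class in the human design."""
    # Assumed: r is the total number of responsibilities, each assigned to exactly one class.
    r = sum(len(ca) for ca in automated_design)
    return sum(
        len(ca) / r * max(f_measure(ca, ch) for ch in human_design)
        for ca in automated_design
    )


# Hypothetical toy designs: each class is a set of responsibility names.
human = [{"a", "b", "c"}, {"d", "e"}, {"f"}]
merged = [{"a", "b", "c"}, {"d", "e", "f"}]   # the last two human classes merged into one

print(f_score(human, human))    # identical designs
print(f_score(merged, human))   # merging two classes lowers the similarity
```

In the second call, the merged class matches the human class {d, e} with precision 2/3 and recall 1, so its F-Measure is 0.8 and the overall F-Score is 0.5 × 1 + 0.5 × 0.8 = 0.9. Merging classes therefore lowers the F-Score even when no responsibility is misplaced relative to its neighbors.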

In general, we do not know how many runs are needed to obtain a reliable result. Rice [65] recommends at least 30 runs. We therefore ran the sMMAS and the mMMAS 30 times each and report the average, the standard deviation and the best F-Score obtained by each method in Table 11. Figure 11 shows the best F-Score for the case studies using the sMMAS and the mMMAS. The mMMAS results are closer to the human expert design. This similarity is higher for the small-scaled problems than for the middle-scaled problems. To emphasize the power of the mMMAS optimization as the problem size grows, Figure 12 shows the F-Score multiplied by the problem size. As Figure 12 shows, the mMMAS produces better results than the sMMAS, especially when the problem is larger. The largest F-Score belongs to the iCoot case study, because the quality of its documentation is better than that of the others and therefore the quality of its equivalent PDSN is better. A high-quality PDSN has a positive effect on reaching a higher similarity with the expert design.

Table 11. F-Score values of resulting automated designs for different case studies. F-Score for human design is 1.

| Case Study | sMMAS Best | sMMAS Avg. | sMMAS St.D. | mMMAS Best | mMMAS Avg. | mMMAS St.D. |
| --- | --- | --- | --- | --- | --- | --- |
| Cinema Booking System | 0.8800 | 0.8512 | 0.0125 | 0.8800 | 0.8625 | 0.0112 |
| Graduate Development Program | 0.8400 | 0.7985 | 0.1002 | 0.8602 | 0.8215 | 0.0752 |
| Select Cruises | 0.6230 | 0.5877 | 0.0150 | 0.7600 | 0.7420 | 0.0124 |
| iCoot | 0.8102 | 0.7888 | 0.0245 | 0.9218 | 0.8915 | 0.0245 |

Figure 11. Comparison of F-Score for sMMAS and mMMAS for different case studies. F-Score for human design is 1. [Bar chart: F-Score per case study (CBS, GDP, SC, iCoot) for sMMAS and mMMAS.]

Figure 12. Comparison of F-Score, considering the problem size, for sMMAS and mMMAS. [Bar chart: F-Score × problem size per case study (CBS, GDP, SC, iCoot) for sMMAS and mMMAS.]
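The size-weighted comparison of Figure 12 can be approximated from the reported numbers. The text does not define "problem size" precisely; the sketch below assumes it is the number of responsibilities listed in Table 6, which is a guess made for this example only. The F-Scores are the best values from Table 11.

```python
# Number of responsibilities per case study (Table 6). Treating this as the
# "problem size" is an assumption made for this example.
RESPONSIBILITIES = {"CBS": 31, "GDP": 55, "SC": 108, "iCoot": 83}

# Best F-Scores over 30 runs (Table 11).
BEST_F_SCORE = {
    "sMMAS": {"CBS": 0.8800, "GDP": 0.8400, "SC": 0.6230, "iCoot": 0.8102},
    "mMMAS": {"CBS": 0.8800, "GDP": 0.8602, "SC": 0.7600, "iCoot": 0.9218},
}


def size_weighted(scores, sizes):
    """Multiply the F-Score of each case study by its problem size."""
    return {case: scores[case] * sizes[case] for case in scores}


weighted = {
    method: size_weighted(scores, RESPONSIBILITIES)
    for method, scores in BEST_F_SCORE.items()
}

for case in RESPONSIBILITIES:
    gap = weighted["mMMAS"][case] - weighted["sMMAS"][case]
    print(f"{case}: sMMAS={weighted['sMMAS'][case]:.2f} "
          f"mMMAS={weighted['mMMAS'][case]:.2f} gap={gap:.2f}")
```

Under this assumption, the gap between the two variants is widest for the two middle-scaled problems (SC and iCoot), which is consistent with the observation that the advantage of the mMMAS grows with problem size.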

As a sample of the results of the proposed method, the best automated design for the CBS case study is shown in Figure 13 and the human design for this case study is shown in Figure 14. The coupling measure is 0.10 for the human design and 0.08 for the automated design. The automated method thus reduced the coupling value by 20% by combining the Showing and Screen classes. This improvement in coupling, however, reduced the similarity between the two designs, giving an F-Score of 0.88 for the automated design. Besides merging these two classes, the automated design moved a method (lookUpPayment) from the Payment class to the Booking class. The automated naming process selected “Ticket” as the name of the Booking class and “Card” as the name of the Payment class. This is another difference from the human design, but on closer inspection the new names are reasonably suitable.

Figure 13. The best automated design for CBS case study, by the proposed model.

Figure 14. Human design for CBS case study [47].

To study the effect of using problem domain knowledge in the optimization stage, we tested the mMMAS with and without the PDSN. In the former version (which we call “mMMAS+PDSN”), the PDSN is used for computing the object-orientedness and meaningfulness metrics as well as for structural relation recognition and the class naming process. In the latter version (“mMMAS-PDSN”), we used only NLP tools to recognize responsibilities and behavioral relations; association is the only relationship recognized between classes, and no mapping stage is performed. Figure 15 shows the F-Score of the best automated design achieved by the proposed method in 30 runs. As the diagram shows, using the PDSN has a clearly distinguishable effect on the automated design result, which approaches that of a human expert designer.

Figure 15. Comparison of F-Score for mMMAS with and without use of PDSN, for different case studies. F-Score for human design is 1. [Bar chart: F-Score per case study (CBS, GDP, SC, iCoot) for mMMAS-PDSN and mMMAS+PDSN.]

6.2 Statistical Analysis

To claim a difference between the two algorithms, we use a popular statistical test, the Wilcoxon-Mann-Whitney test (also known as the Mann-Whitney U test) [65], to compare the average results of the two algorithms. The Wilcoxon-Mann-Whitney test is non-parametric and makes no assumption about the distribution of the data. A null hypothesis H0 is typically defined to state that there is no difference between the two algorithms, and it is contrasted with an alternative hypothesis H1. Statistical testing [65] has two possible outcomes: (1) we reject H0, i.e., there is a difference between the two algorithms, or (2) we do not reject H0, i.e., there is not enough evidence of a difference. The p-value is the probability of observing a result at least as extreme as the given sample result under the assumption that H0 is true. The significance level α of a test is the highest p-value at which we reject H0. In the experimental sciences, α is typically chosen as 0.05. To perform the Wilcoxon-Mann-Whitney test, we use the function “wilcox.test(X,Y, paired=T)” in the R programming language, where X and Y are the result sets of the two compared randomized algorithms.
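For readers who want to see the mechanics of the test, the sketch below implements the unpaired rank-sum (Mann-Whitney U) form in plain Python, using midranks for ties and a tie-corrected normal approximation without continuity correction. It is an illustration, not the R routine used in the experiments: in R, wilcox.test with paired = FALSE computes the rank-sum test, whereas with paired = TRUE it computes the Wilcoxon signed-rank test on paired samples. The function name and the sample data are hypothetical.

```python
import math
from itertools import groupby


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via a tie-corrected normal approximation.

    Returns (u_x, u_y, p_value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    rank_sum_x = 0.0
    tie_term = 0.0
    position = 0
    for _, group in groupby(pooled, key=lambda item: item[0]):
        members = list(group)
        t = len(members)
        midrank = position + (t + 1) / 2.0   # average of ranks position+1 .. position+t
        rank_sum_x += midrank * sum(1 for _, label in members if label == 0)
        tie_term += t ** 3 - t
        position += t

    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u_y = n1 * n2 - u_x
    n = n1 + n2
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u_x, u_y, 1.0
    z = (u_x - n1 * n2 / 2.0) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return u_x, u_y, p_value


# Hypothetical F-Score samples from two algorithms.
a = [0.61, 0.63, 0.60, 0.62, 0.64, 0.59]
b = [0.74, 0.76, 0.73, 0.75, 0.77, 0.72]
u_a, u_b, p = mann_whitney_u(a, b)
alpha = 0.05
print(u_a, u_b, p, "reject H0" if p < alpha else "do not reject H0")
```

For small samples an exact p-value is preferable to the normal approximation; the approximation is used here only to keep the sketch short.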

For example, the F-Score result sets obtained by the sMMAS and the mMMAS in 30 runs on the Cinema Booking System case study are, respectively, X = {0.88, 0.87, 0.87, 0.87, 0.862, 0.86, 0.86, 0.86, 0.86, 0.86, 0.858, 0.856, 0.85, 0.85, 0.85, 0.85, 0.85, 0.85, 0.85, 0.85, 0.84, 0.84, 0.84, 0.84, 0.84, 0.84, 0.84, 0.83, 0.83, 0.83} and Y = {0.88, 0.88, 0.88, 0.88, 0.88, 0.87, 0.87, 0.87, 0.87, 0.87, 0.87, 0.87, 0.87, 0.86, 0.86, 0.86, 0.86, 0.86, 0.86, 0.86, 0.86, 0.86, 0.85, 0.85, 0.85, 0.85, 0.845, 0.845, 0.845, 0.84}.
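As a quick consistency check, the best and average values of these two result sets can be recomputed directly. The snippet below is only a sketch; the variable names are chosen for this example.

```python
from statistics import mean

# F-Scores of the 30 sMMAS runs on the Cinema Booking System case study (set X).
X = [0.88, 0.87, 0.87, 0.87, 0.862, 0.86, 0.86, 0.86, 0.86, 0.86,
     0.858, 0.856, 0.85, 0.85, 0.85, 0.85, 0.85, 0.85, 0.85, 0.85,
     0.84, 0.84, 0.84, 0.84, 0.84, 0.84, 0.84, 0.83, 0.83, 0.83]

# F-Scores of the 30 mMMAS runs on the same case study (set Y).
Y = [0.88, 0.88, 0.88, 0.88, 0.88, 0.87, 0.87, 0.87, 0.87, 0.87,
     0.87, 0.87, 0.87, 0.86, 0.86, 0.86, 0.86, 0.86, 0.86, 0.86,
     0.86, 0.86, 0.85, 0.85, 0.85, 0.85, 0.845, 0.845, 0.845, 0.84]

for name, runs in (("sMMAS", X), ("mMMAS", Y)):
    print(f"{name}: runs={len(runs)} best={max(runs):.4f} avg={mean(runs):.4f}")
```

Both sets contain 30 values, and their best values (0.88) and averages (0.8512 and 0.8625) agree with the Cinema Booking System row of Table 11.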

Applying the Wilcoxon test to the X and Y sets yields a p-value of 0.64. Because the p-value is greater than 0.05, there is not enough evidence to reject H0, and we consider the mean F-Scores of the sMMAS and the mMMAS (groups X and Y, respectively) for the Cinema Booking System case study to be statistically equal.

Applying the Wilcoxon test to the W and R sets, i.e., the F-Scores of the sMMAS and the mMMAS in 30 runs on the Graduate Development Program case study, yields a p-value of 3.325e-17. Because the p-value is lower than 0.05, we have evidence to reject H0 and conclude that the mean F-Score achieved by the mMMAS is higher than that of the sMMAS for the Graduate Development Program case study.

Applying the Wilcoxon test to the remaining result sets of the sMMAS and the mMMAS shows that, for both the Select Cruises and iCoot case studies, the obtained p-values are lower than 0.05. We therefore have evidence to reject H0: for both of these case studies, the mean F-Score obtained by the mMMAS is higher than that of the sMMAS.

In summary, the Wilcoxon non-parametric inferential test shows that the mMMAS performs better than the sMMAS on three case studies, and that the two perform equally on one case study.

7. Analysis and Discussion

Expecting a program to design software as well as an experienced software engineer requires a close look at how the task is done by humans. At the beginning, a human designer uses his/her prior knowledge and gets to know the problem domain. After developing a conceptual map of the problem domain entities, the human designer considers the requirements and constraints and starts to design a class diagram (in the object-oriented paradigm). The proposed method simulates the designer’s conceptual map by applying the PDSN.

The PDSN is extracted from the requirements and is enriched with additional real-world concepts that are not mentioned in the documents but exist in the problem domain. The correctness and completeness of the PDSN depend entirely on the input material that is fed into the NLP tools, i.e., the requirements documents. In other words, poor documentation of the software requirements will result in a weak automated design by the proposed method. The results showed that using a high-quality PDSN enables the proposed method to compete with other automated methods that rely on interaction with a human design expert. In the proposed method, the only human interaction is the cooperation of a domain expert, who contributes by editing the PDSN when needed.

Object-oriented software design metrics are a handy tool for distinguishing good designs from weak ones, but they do not seem to be quite enough for automated design approaches. The lack of metrics for evaluating design meaningfulness in automated design approaches has resulted in violations of a fundamental object-oriented rule: “every object should be an abstract of real world entity” [5]. As long as humans make the design, this rule holds because of the way humans think, but when we employ a program to design the software, we need a way to enforce this rule. The results showed that the introduced meaningfulness metric had a reasonable effect on guiding the optimization process toward classes that are closer to real-world entities. This advantage gives us a design that is more reusable, readable, extendable and therefore maintainable by humans.

Since the proposed method depends on the quality of the textual requirements documents of the software, its application to real-world problems is restricted unless users put enough effort into producing the requirements documents. On the other hand, a domain ontology for the problem domain is a helpful complement to the documentation, although we did not use this option in the experiments. Although preparing a domain ontology demands considerable cost and effort, it is worth preparing, because such an asset opens new doors to many applications that require an ontology (such as AI tools).

The new representation of the design problem (the DAG) for MMAS has some advantages. This novel representation allows ants to produce any possible design in the search space and to explore it extensively. Moreover, the representation prevents ants from producing cycles in their paths, and therefore they do not need to check repeatedly whether they are making a cycle.

Comparing the results of the proposed method with those of other methods requires a more similar context of use and operation. In the absence of such a context, the only possible comparison is with the F-Score obtained by Masoud and Jalili [8] on the SC case study, which is 0.8998, compared with our value of 0.7600. The difference can be the result of the difference in the input material of the two methods. The input material of Masoud and Jalili [8] is the set of responsibilities and use relations extracted from the use case documents by a human, whereas the input material of the method proposed in this paper is the free-format text of the use case documents, which is analyzed by automated methods to extract responsibilities and relations. The difference in the F-Score on the SC case study is considerable because this case study is a middle-scaled problem whose documentation is not well suited to automatic analysis. In comparison with SC, iCoot is a middle-scaled problem with better documentation, which helped the automated method attain a high F-Score of 0.9218. With respect to this fact, we conclude that if the documentation of SC were more expressive and complete, the proposed automated method would be able to come up with a better result for this case study.

8. Conclusion

This paper is an attempt at automatic software design using meta-heuristic methods, seeking to reduce the role of human designers in the detail design process. We used the Max-Min Ant System as the optimization algorithm in both single-objective and multi-objective modes. As a novel approach, the proposed method used NLP tools to process the textual software requirements documents and build a semantic network of the problem domain (PDSN), which is used in the optimization phase and then for naming the classes and detecting relationships between them. The results revealed that using a problem domain semantic network to make up for the human designer’s knowledge of the domain is quite helpful. Introducing a new software design metric, called Meaningfulness, to guide the search toward more favorable areas of the search space resulted in an output design that is more similar to the human design. Hence, the output automated design has a better chance of being maintainable by humans over its life cycle. Although the Meaningfulness metric is immature, developing such a set of metrics for automated design in future studies could have a beneficial effect on this approach. In comparison with [60], the special representation of the problem introduced here helps MMAS to be quicker and more efficient. The results from [60] on the SC case study showed that as software problems get larger, ACO struggles to achieve superior fitness, because large problems with more responsibilities have a large search space with more unacceptable solutions. The small difference between the results for the small-scaled and middle-scaled problems tested in this study indicates that the mMMAS successfully restrains the growth of the search space through efficient representation. The new representation is more economical in both space and time complexity, which makes the proposed method applicable to larger problems without a considerable loss of efficiency.

The main challenge of the proposed method is the quality of the PDSN. Since we used NLP tools to extract the building blocks of the PDSN from free-form textual documents written in natural language, the quality of the PDSN depends on the quality of the documents. The results showed that for case studies whose requirements documents are expressive and complete, the resulting automated design is more similar to the human expert design.

REFERENCES

[1] J. Arlow and I. Neustadt, UML 2 and the unified process: practical object-oriented analysis and design. Pearson Education, 2005.
[2] I. Jacobson, G. Booch, and J. Rumbaugh, The unified software development process, vol. 1. Addison-Wesley Reading, 1999.
[3] I. Gorton, Essential software architecture, vol. 14. Springer, 2006.
[4] B. Bruegge and A. H. Dutoit, Object-Oriented Software Engineering Using UML, Patterns and Java. Prentice Hall, 2004.
[5] A. J. Riel, Object-oriented design heuristics. Addison-Wesley Publishing Company, 1996.
[6] O. Räihä, “A survey on search-based software design,” Comput. Sci. Rev., vol. 4, no. 4, pp. 203–249, 2010.

[7] M. Harman, S. A. Mansouri, and Y. Zhang, “Search based software engineering: A comprehensive analysis and review of trends techniques and applications,” Dep. Comput. Sci. King’s Coll. Lond. Tech Rep TR-09-03, 2009.
[8] H. Masoud and S. Jalili, “A clustering-based model for class responsibility assignment problem in object-oriented analysis,” J. Syst. Softw., vol. 93, pp. 110–131, 2014.
[9] R. J. Abbott, “Program design by informal English descriptions,” Commun. ACM, vol. 26, no. 11, pp. 882–894, 1983.
[10] M. Bowman, L. C. Briand, and Y. Labiche, “Multi-objective genetic algorithm to support class responsibility assignment,” in Software Maintenance, 2007. ICSM 2007. IEEE International Conference on, 2007, pp. 124–133.
[11] G.-C. Rota, “The number of partitions of a set,” Am. Math. Mon., pp. 498–504, 1964.
[12] H. S. Wilf, Generatingfunctionology. Boston: Academic Press, 1994.
[13] W. Stadler, Multicriteria Optimization in Engineering and in the Sciences, vol. 37. Springer, 1988.
[14] M. Dorigo, Ant colony optimization. Cambridge, Mass: MIT Press, 2004.
[15] T. Stützle and H. H. Hoos, “MAX–MIN ant system,” Future Gener. Comput. Syst., vol. 16, no. 8, pp. 889–914, 2000.
[16] M. Dorigo, E. Bonabeau, and G. Theraulaz, “Ant algorithms and stigmergy,” Future Gener. Comput. Syst., vol. 16, no. 8, pp. 851–871, 2000.
[17] T. Stützle and H. Hoos, “The max-min ant system and local search for combinatorial optimization problems,” in Meta-heuristics, Springer, 1999, pp. 313–329.
[18] M. López-Ibáñez and T. Stützle, “An experimental analysis of design choices of multi-objective ant colony optimization algorithms,” Swarm Intell., vol. 6, no. 3, pp. 207–232, Sep. 2012.
[19] S. Iredi, D. Merkle, and M. Middendorf, “Bi-Criterion Optimization with Multi Colony Ant Algorithms,” in Proceedings of the First International Conference on Evolutionary Multi-Criterion Optimization, London, UK, 2001, pp. 359–372.
[20] C. García-Martínez, O. Cordón, and F. Herrera, “A taxonomy and an empirical analysis of multiple objective ant colony optimization algorithms for the bi-criteria TSP,” Eur. J. Oper. Res., vol. 180, no. 1, pp. 116–148, Jul. 2007.
[21] R. S. Pressman and W. S. Jawadekar, “Software engineering,” N. Y. 1992, 1987.
[22] N. E. Fenton and S. L. Pfleeger, Software metrics: a rigorous and practical approach. PWS Publishing Co., 1998.
[23] M. Lorenz and J. Kidd, Object-oriented software metrics: a practical guide. Prentice-Hall, Inc., 1994.
[24] S. R. Chidamber and C. F. Kemerer, “A metrics suite for object oriented design,” Softw. Eng. IEEE Trans. On, vol. 20, no. 6, pp. 476–493, 1994.
[25] K. Lieberherr, I. Holland, and A. Riel, “Object-oriented programming: An objective sense of style,” in ACM SIGPLAN Notices, 1988, vol. 23, pp. 323–334.
[26] H. Zuse, Software complexity: measures and methods. Walter de Gruyter & Co., 1990.
[27] G. A. Miller, “The magical number seven, plus or minus two: some limits on our capacity for processing information,” Psychol. Rev., vol. 63, no. 2, p. 81, 1956.
[28] K. El Emam, N. Goel, W. Melo, H. Lounis, S. N. Rai, and others, “The optimal class size for object-oriented software,” Softw. Eng. IEEE Trans. On, vol. 28, no. 5, pp. 494–509, 2002.
[29] M. Lanza, R. Marinescu, and S. Ducasse, Object-oriented metrics in practice. Springer, 2006.
[30] R. R. Yager, “On ordered weighted averaging aggregation operators in multicriteria decisionmaking,” Syst. Man Cybern. IEEE Trans. On, vol. 18, no. 1, pp. 183–190, 1988.
[31] Recent developments in the ordered weighted averaging operators: theory and practice, 1st ed. New York: Springer, 2011.
[32] R. Fuller, “On Obtaining OWA Operator Weights: A Short Survey of Recent Developments,” 2007, pp. 241–244.

[33] Z. S. Xu and Q. L. Da, “An overview of operators for aggregating information,” Int. J. Intell. Syst., vol. 18, no. 9, pp. 953–969, Sep. 2003.
[34] L. K. Schubert, “Extending the expressive power of semantic networks,” Artif. Intell., vol. 7, no. 2, pp. 163–198, 1976.
[35] C. Fellbaum, WordNet. Wiley Online Library, 1998.
[36] W. Miller and D. L. Spooner, “Automatic generation of floating-point test data,” IEEE Trans. Softw. Eng., vol. 2, no. 3, pp. 223–226, 1976.
[37] M. Harman and B. F. Jones, “Search-based software engineering,” Inf. Softw. Technol., vol. 43, no. 14, pp. 833–839, 2001.
[38] “SBSE Repository.” [Online]. Available: http://crestweb.cs.ucl.ac.uk/resources/sbse_repository/. [Accessed: 21-Jun-2014].
[39] M. Fowler, Refactoring: improving the design of existing code. Pearson Education India, 1999.
[40] M. Bowman, L. C. Briand, and Y. Labiche, “Solving the class responsibility assignment problem in object-oriented analysis with multi-objective genetic algorithms,” Softw. Eng. IEEE Trans. On, vol. 36, no. 6, pp. 817–837, 2010.
[41] G. Glavaš and K. Fertalj, “Solving the Class Responsibility Assignment Problem Using Metaheuristic Approach,” CIT J. Comput. Inf. Technol., vol. 19, no. 4, pp. 275–283, 2011.
[42] D. K. Saini and Y. Sharma, “Soft Computing Particle Swarm Optimization based Approach for Class Responsibility Assignment Problem,” Int. J. Comput. Appl., vol. 40, no. 12, 2012.
[43] O. Räihä, K. Koskimies, and E. Mäkinen, “Genetic synthesis of software architecture,” in Simulated Evolution and Learning, Springer, 2008, pp. 565–574.
[44] O. Räihä, H. Kundi, K. Koskimies, and E. Mäkinen, “Synthesizing architecture from requirements: A genetic approach,” in Relating Software Requirements and Architectures, Springer, 2011, pp. 307–331.
[45] C. L. Simons and I. C. Parmee, “A cross-disciplinary technology transfer for search-based evolutionary computing: from engineering design to software engineering design,” Eng. Optim., vol. 39, no. 5, pp. 631–648, 2007.
[46] C. L. Simons and I. C. Parmee, “User-centered, evolutionary search in conceptual software design,” in Evolutionary Computation, 2008. CEC 2008. (IEEE World Congress on Computational Intelligence). IEEE Congress on, 2008, pp. 869–876.
[47] C. L. Simons, I. C. Parmee, and R. Gwynllyw, “Interactive, evolutionary search in upstream object-oriented class design,” Softw. Eng. IEEE Trans. On, vol. 36, no. 6, pp. 798–816, 2010.
[48] D. Dori, N. Korda, A. Soffer, and S. Cohen, “SMART: system model acquisition from requirements text,” in Business Process Management, Springer, 2004, pp. 179–194.
[49] L. Mich, “NL-OOPS: from natural language to object oriented requirements using the natural language processing system LOLITA,” Nat. Lang. Eng., vol. 2, no. 02, pp. 161–187, 1996.
[50] L. Mich and R. Garigliano, “NL-OOPS: A requirements analysis tool based on natural language processing,” in Proceedings of Third International Conference on Data Mining Methods and Databases for Engineering, Bologna, Italy, 2002.
[51] H. M. Harmain and R. Gaizauskas, “Cm-builder: A natural language-based case tool for object-oriented analysis,” Autom. Softw. Eng., vol. 10, no. 2, pp. 157–181, 2003.
[52] X. Zhou and N. Zhou, “Auto-generation of Class Diagram from Free-text Functional Specifications and Domain Ontology,” 2008.
[53] P. R. Kothari, “Processing Natural Language Requirement to Extract Basic Elements of a Class,” ISSN2249-0868 Found. Comput. Sci. PCS N. Y. USA, vol. 3, no. 7, 2012.
[54] C. Larman, Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development, 3/e. Pearson Education India, 2012.
[55] L. Maciaszek, Requirements analysis and system design. Pearson Education, 2007.

[56] S. Harabagiu, G. Miller, and D. Moldovan, “Wordnet 2-a morphologically and semantically enhanced resource,” in Proceedings of SIGLEX, 1999, vol. 99, pp. 1–8.
[57] M. Fowler, UML distilled: a brief guide to the standard object modeling language. Addison-Wesley Professional, 2004.
[58] A. Cockburn, “Basic use case template,” Hum. Technol. Tech. Rep., vol. 96, 1998.
[59] A. Cockburn, Writing effective use cases, The crystal collection for software professionals. Addison-Wesley Professional Reading, 2000.
[60] C. L. Simons and J. E. Smith, “A comparison of meta-heuristic search for interactive software design,” Soft Comput., vol. 17, no. 11, pp. 2147–2162, Nov. 2013.
[61] V. Punyakanok, D. Roth, and W. Yih, “The importance of syntactic parsing and inference in semantic role labeling,” Comput. Linguist., vol. 34, no. 2, pp. 257–287, 2008.
[62] B. Naveh and others, “Jgrapht,” Internet: http://jgrapht.sourceforge.net, 2008.
[63] U. Chirico, “A java framework for ant colony systems,” in Ants2004: Fourth International Workshop on Ant Colony Optimization and Swarm Intelligence, Brussels, 2004.
[64] “Java API for WordNet Searching (JAWS).” [Online]. Available: http://lyle.smu.edu/~tspell/jaws/. [Accessed: 21-Jun-2014].
[65] J. Rice, Mathematical statistics and data analysis. Cengage Learning, 2006.

Biography:

Vali Tawosi received his M.Sc. degree in Software Engineering from Tarbiat Modares University (TMU) in 2013 and his B.Sc. degree in Computer Science from PN University in 2009. His main research interests are computational intelligence, object-oriented analysis and design, and search-based software engineering (SBSE).

Saeed Jalili received his Ph.D. degree from Bradford University in 1991 and his M.Sc. degree in computer science from Sharif University of Technology in 1979. Since 1992, he has been an associate professor at Tarbiat Modares University (TMU). His main research interests are software runtime verification, quantitative evaluation of software architecture, search-based software engineering (SBSE), machine learning and software model checking.

Seyed Mohammad Hossein Hasheminejad received his Ph.D. in computer engineering at Tarbiat Modares University (TMU) in 2014. He received his M.Sc. degree in Software Engineering from TMU in 2009, and the B.Sc. degree in Software Engineering from Tarbiat Moalem University in 2007. His main research interests are formal methods for software engineering, object-oriented analysis and design, search-based software engineering (SBSE), and self-adaptive systems.
