Feature analysis using information retrieval, community detection and structural analysis methods in product line adoption



The Journal of Systems and Software 155 (2019) 70–90


In Practice

András Kicsi a, Viktor Csuvik a, László Vidács a,∗, Ferenc Horváth a, Árpád Beszédes a, Tibor Gyimóthy a, Ferenc Kocsis b

a Department of Software Engineering and MTA-SZTE Research Group on Artificial Intelligence, University of Szeged, Hungary
b SZEGED Software Ltd., Szeged, Hungary

Article info

Article history: Received 20 April 2018; Revised 12 March 2019; Accepted 2 May 2019; Available online 4 May 2019

Keywords: Software product line; Feature extraction; Information retrieval; Community detection

Abstract

In industrial practice the clone-and-own strategy is often applied under the pressure of high demand for customized features. The adoption of a software product line (SPL) architecture is a large one-time investment that affects both technical and organizational issues. The analysis of the feature structure is a crucial point in the SPL adoption process, involving domain experts working at a higher level of abstraction and developers working directly on the program code. We propose automatic methods to extract feature-to-program links starting from a very high level set of features provided by domain experts. For this purpose we combine call graph information with textual similarity between code and high level features. In addition, in-depth understanding of the feature structure is supported by finding communities between programs and relating them to features. As features originate from domain experts, community analysis reveals discrepancies between the expert view and the internal code structure. We found that communities correspond well to the high level features, with usually more than half of the feature code located in specialized communities. We report experiments at two levels of features and more than 2000 Magic 4GL programs in an industrial SPL adoption project.

© 2019 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

∗ Corresponding author.
https://doi.org/10.1016/j.jss.2019.05.001

1. Introduction

Maintaining parallel versions of a software system that satisfy various customer needs is challenging. The clone-and-own solution (Fischer et al., 2014) is often chosen because of short-term time and effort constraints. As the number of product variants increases, a more viable solution is needed through systematic code-level reuse. A natural step towards more effective development is the adoption of a product line architecture (Clements and Northrop, 2001). Product line adoption is usually approached from three directions. The proactive approach starts with domain analysis and applies variability management from scratch. The reactive approach incrementally responds to new customer needs as they arise. When there are already a number of systems in production, the extractive approach seems to be the most feasible choice. In the extractive approach the adoption process benefits from systematic reuse of existing design and architectural knowledge (Krueger, 2002). An advantage of the extractive approach in general is that several reverse engineering methods exist to support feature extraction and analysis (Kästner et al., 2014; Assunção and Vergilio, 2014; Eyal-Salman et al., 2012).

We report on an ongoing product line adoption project where the new architecture is based upon an existing variant and the specific functions of the others are being merged into the final architecture (the whole process exceeds the scope of this work). In principle it is closest to the extractive approach. Our subject is a legacy wholesaler logistics system of high market value, which was adapted to various domains in the past using the clone-and-own method. It is developed using a fourth generation language (4GL) technology, Magic (Magic Software Enterprises Ltd.), and in this project the product line architecture is to be built based on an existing set of products developed in the Magic XPA language. Although there is reverse engineering support for usual maintenance activities (Nagy et al., 2010; 2011b), the special structure of Magic programs makes it necessary to experiment with targeted solutions for coping with features. Furthermore, approaches used in mainstream languages like Java or C++ need to be reconsidered in the case of systems developed in 4GLs. For instance, in the traditional sense there is no source code; rather, the developer sets up user interface and data processing units in a development environment, and the flow of the program follows a well-defined structure.

The focus of our work is the feature identification and analysis phase of the project. This is a well-studied topic in the literature for mainstream languages (Assunção and Vergilio, 2014), but it is less explored for 4GLs. The method starts from a very high level set of features provided by domain experts, and uses information extracted from the existing program code. The information retrieval (IR) approach to feature extraction as a single method turned out to be noisy in Magic 4GL system analysis (Kicsi et al., 2017). Hence the extraction is performed by combining and further processing call graph information on the code with textual similarity (IR) between code and high level features. Essentially the method works simultaneously with structural (syntactic) and conceptual (text based) information, similarly to approaches previously proposed for traditional object-oriented systems (Al-msie’deen et al., 2013).

While most related literature deals with object-oriented systems, our goal is to aid the product line adoption of an existing 4GL system. This brings not only a distinction from the language perspective but also a less general and more industrially motivated viewpoint. While methods similar to those used in object-oriented environments can be applied, they often have to be modified. IR-based methods have fewer problems dealing with the different paradigms, but with structural information it can be difficult to achieve the same results as on a more straightforwardly structured system. Besides combining textual extraction with structural extraction, we also present an efficient method for filtering the data from both sources.
This results in a set of information that is more suitable for performing the SPL adoption by various stakeholders of the project, including domain experts and architects. In summary, the contributions of this paper are the following:

• A method for feature extraction by combining syntactic and textual information and filtering the results of the two sources.
• Feature analysis methods that apply community detection algorithms to match features with call graph communities.
• Application of the approach in an industrial setting in a 4GL environment during a product line adoption project.

The main goal of our work is to aid our industrial partner in its current product line adoption task. While our results may have the potential to be used for other 4GL languages or other feature extraction work, we do not attempt to introduce a completely foolproof and fully versatile approach for every situation.

The paper is structured as follows. Section 2 introduces our current task and defines our approach to tackle the problem. Section 3 provides more detailed information on our methods for feature extraction and presents our various result sets. Section 4 describes our experiments on call graph communities. Since our aim is to facilitate this work, an analysis of the project’s progress can be found in Section 5. We evaluate our various feature extraction outputs and the information value of community detection in Section 6 with the help of two research questions. We overview the related work in Section 7 and conclude our paper in Section 8.

2. Feature extraction and abstraction of Magic applications

2.1. Product line adoption in a clone-and-own environment

The decision to migrate to a new product line architecture is hard to make. Usually there is a high number of derived specific products, and the adoption process poses several risks and may take months (Clements et al., 2006; Clements and Krueger, 2002; Catal and Cagatay, 2009). The subject system of our analysis is a leading pharmaceutical wholesaler logistics system started more than 30 years ago. Meanwhile almost twenty derived variants of the system were introduced at various complexity and maturity levels, with independent life cycles and isolated maintenance. Our partner is the developer of market leading solutions in the region, which are implemented in the Magic XPA fourth generation language. This work is part of an industrial project aiming to create a well-designed product line architecture over the isolated variants. The existing set of products provides an appropriate environment for an extractive SPL adoption approach.

Characterizing features is usually a manual or semi-automated task, where domain experts, product owners and developers co-operate. Our aim is to help this process by automatically analyzing the relation of higher level features and mapping program level entities to features. The 4GL environment used to implement the systems requires different approaches and analysis tools than today’s mainstream languages like Java (Nagy et al., 2010; Harrison and Lim, 1998). For example, there is no source code in its traditional sense. The developers work in a fully fledged development environment by customizing several properties of programs. Magic program analysis tool support is not comparable to that of mainstream languages, hence this is a research-intensive project.

The feature extraction process is challenging, since the 19 product variants themselves are written in 4 different language versions: Magic V5, Magic V9, uniPaaS 1.9 and XPA 3.x. In the case of the oldest Magic V5 systems, there is a high demand for migration to a newer version. uniPaaS 1.9 introduced huge changes in the language by using the .NET engine for applications. Most systems are implemented in that version. The newest Magic XPA 3.x line of the language lies close to the uniPaaS 1.9 systems. Each variant is between 2000 and 4000 Magic programs in size, with a large amount of code in common. Magic is a data-intensive language, which is clearly reflected by the program code as well: the variants contain at most 822 models and 1065 data tables.

2.2. Feature extraction approach

During the feature extraction phase of product line adoption, various artifacts are obtained to identify features in an application (Ballarin et al., 2016). This phase is also related to feature location. The analysis phase targets common and variable properties of features and prepares the reengineering phase. This last phase migrates the subject system to the product line architecture.

In this part of the research project, we aim at feature extraction and analysis. Our inputs are the high level features of the system and the program code. We apply a semi-automated process as in the work of Kästner et al. (2014). High level features are collected by domain experts from the developer company. The actual task is to establish a link between features and main elements of the Magic applications. Although there exists a common analysis infrastructure for reverse engineering 4GL languages (Nagy et al., 2011a; 2010; 2011b), the actual program models differ.

Fig. 1 illustrates our current approach to feature extraction. We assign a number of program elements to each high level feature; this information helps the work of developers and domain experts working on the new product line architecture. This is the feature extraction phase, shown at the center of the figure, where the green feature nodes and the yellow program nodes are organized in graphs with their connections shown. During the assignment, we mainly rely on structural information on call dependency, obtained by constructing a call graph of the programs of a variant.
This results in a high number of located elements, which is crucial for the development of the product line architecture, but the large amount of data can be hard to grasp in its entirety. We combine this method with information retrieval, which can also make it easier to cope with a 4GL language by utilizing conceptual connections, and which has been successfully applied in software development tasks such as traceability scenarios for object-oriented languages (Marcus and Maletic, 2003). A comprehensive overview of Natural Language Processing (NLP) techniques, including Latent Semantic Indexing (LSI), the technique we chose, is provided by Falessi et al. (2010). In our previous work (Kicsi et al., 2017) we already presented our LSI-based approach to feature extraction. LSI is known to be capable of producing good quality results when combined with structural information (Al-msie’deen et al., 2013).

To further assist the work of domain experts, we analyze communities based on the call graph of the entire system. This opens the way to compare the domain experts’ view of features with code level communities that reflect their implementation. Community detection algorithms address large networks and have been used for software engineering problems like handling dependencies (Hamilton and Danicic, 2012) and supporting modularization (Mitchell and Mancoridis, 2006), but we are not aware of their application in product line research.

Fig. 1. An illustration of feature extraction and analysis as part of product line adoption.

2.3. The structure of a Magic application

Being a fourth generation language, Magic does not completely follow the structure of a traditional programming language. It has many different elements; in this subsection we introduce only the ones necessary for understanding our approach. Fig. 2 presents these elements.

Software written in Magic is called an application. An application can be built up from one or more projects. In turn, each project can have any number of programs, which contain the actual logic of the software. The tasks branching directly from a project are called programs. These can have their own subtasks and can be called anywhere in the project like methods in a traditional programming environment, but their subtasks can only be called by the (sub)task containing them. Any task or subtask can access data through tables. It is also possible for a program to be called through menus, which are controls designed to provide user intervention and usually start a process by calling programs. Having sufficient information on menus, we used these as a base for the call graph in structural feature extraction, deriving calls from menus.

Fig. 2. Illustration on the elements of a Magic application.

3. Feature extraction experiments

3.1. Overview

In this section, we present the feature extraction methods based on call dependency and textual similarity, as well as their combination and possible filtering options. Fig. 3 illustrates the processes described in this section. Our static analysis is specific to the Magic language.

One of the variants of our subject system was selected by domain experts to be used as a starting point for the product line adoption. It is a specific variant involving 4251 programs, 822 models and 1065 data tables. The new product line was started from this variant; the capabilities of the other variants are being built into the product line during the adoption process.

We have been provided with a feature list structured in a tree format, initially consisting of three levels which have 10, 42 and 118 unique elements, respectively. This list was refined in an iterative manner. From these we chose the upper level to display our results; the features of this level are listed in Fig. 4. The numbers shown here are in accordance with the numbers we present in our later graph examples.

Fig. 3. A more detailed view on the feature extraction process.

Fig. 4. The higher level features of the system.

3.2. Approach

3.2.1. Feature extraction using (task) call dependency

This approach relies on the call dependencies between programs and tasks. To construct a call graph from these dependencies we use the process illustrated in Fig. 5. The figure represents a minimalistic example of a Magic application. Squares denote tasks and programs, while other program elements like projects, logic units, logic lines, etc. are shown as circles. From the source code we construct the abstract semantic graph (ASG), which is provided by our static source code analyzer tool. As the next step, we add the call edges to the graph by examining Magic elements that operate as calls between tasks and programs. Finally, in the last two steps of the process we eliminate some nodes and edges from the graph, keeping only the necessary ones, i.e., call edges, tasks, and programs.

From the call graph (CG) we obtain the features by running a customized breadth-first search algorithm from specific starting points determined by menu entries. The domain experts assembling the feature list know the menu structure well, hence we considered the given feature-menu connections a good starting point for call graph construction. A graph representation of the CG-based results can be seen later on the left side of Fig. 7.

Fig. 5. The process of calculating the call graph.

3.2.2. Textual similarity

Due to their textual nature, information retrieval (IR) techniques are less susceptible to difficulties that some other techniques face, such as dependence on the language used. In our case, one of the biggest problems is that the systems processed use different versions of the Magic language. Thus IR can contribute to a more versatile approach. For this purpose, we used the Latent Semantic Indexing technique (Deerwester et al., 1990) for the measurement of textual similarity. A more complete summary of our feature extraction work with LSI is presented in our previous work (Kicsi et al., 2017).

Similarly to the CG technique, we also used information retrieval for determining connections between features and programs of the system. Though the structural information obtained from the call graph is more thorough, it is more suitable for developers than for domain experts. With purely structural information it is hard to separate programs along the features: with many programs laying the groundwork for any single feature, it is hard to grasp the overall aim. The conceptual analysis separates more agreeably along the semantics of the features, hence it can be more valuable for domain experts. A graph representation of the textual similarity results can be seen later on the right side of Fig. 7.

3.3. Results and further possibilities

As already introduced, the two techniques we used for assigning programs to features work in fundamentally different ways. Consequently, the results themselves also show a significant difference, overlapping only partially. The sets of programs for the techniques presented are shown on the left side of Fig. 6. Each slice of the diagram represents a top level feature and its colors indicate the number of programs detected by each technique. IR represents the result set of the information retrieval technique, CG represents the pairs attained by call graph, while ESS represents the set of programs considered most essential, detected by both techniques. The left side of Fig. 8 shows the number of programs assigned in each set; the abbreviations match the ones explained for the previous figure.

The call graph dependency technique provides a high number of programs for each feature, as displayed on the left side of Fig. 7. These relations are based on real calls in the code itself, which can be considered a reasonable source of information. It is important to note that we only used static call information, thus not every call necessarily occurs at runtime. Developers need to work with programs, hence they are required to have some readily available information on all of them. The call information which this technique uncovers is likely to benefit their basic understanding of the programs, but it also presents the problem of coping with a large amount of data that is not really distinguishable in any manner.

The conceptual method produces fewer programs for each feature, as can be seen on the right side of Fig. 7, where we show a graph representation of the IR-based results. Further examination of random cases revealed that, even so, a significant amount of noise is present.
Textual similarity works with very little information in these cases, hence similar wording or more general words like “list”, which occur in the text of many features, are likely to produce misleading matches. This method can thus connect radically different parts of a system, both in terms of features and in terms of system structure; hence we consider the conceptual method less precise.

Looking at only the intersection of the connections found by these two techniques, we find that this set takes into account both the structural and the conceptual information, producing only connections which are indeed present on both levels. This results in a clearer, more straightforward set of connections, which contains the most essential findings of the two techniques.

Fig. 6. The size of our result sets for each feature with each technique. Results without filtering shown on the left, results with filtering on the right. (For interpretation of the references to color in the text, the reader is referred to the web version of this article.)

Fig. 7. Graph visualization of the set of results obtained by the call graph (Left) and the information retrieval (Right) technique.

Fig. 8. Graph visualization of the set of programs deemed most essential. Results shown on the left, results with filtering on the right.

As we could see before, the structural information produces a rather large number of matches for each feature, and we observed that there is a considerable overlap between features. We decided to clear these matches too with a filtering technique applied to the structural information output, which filters out less specific programs. The filtering technique works with a number n, which denotes the maximal number of features a program can connect to before it is considered less specific and is filtered out from the program sets of the features. This removes the programs with less information value and results in even more straightforward groups of programs for each feature. We have to note that less specific programs are often no less important, but their relations are harder to comprehend, and comprehension is our focus in the current case. On the right side of Figs. 6 and 8 we can see the results of the common structural and conceptual connections of this filtered approach, featuring only programs with at most two connections. It is apparent from the graph that features are much better separated, providing a suitable high level glance at the background of features without a lot of technical details, ideal for top level understanding.

Examining the graphs we can come to many interesting conclusions. For example, while feature number 7 behaves like any other feature from the purely conceptual or the purely structural viewpoint, its common graph provides a clearer picture, apparently connecting through a group of more general features to a large number of other features. In the filtered case, however, it is nicely separated, with a group of unique programs specific to the feature itself.
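The extraction steps of this section can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the tooling used in the project: the call graph is a plain adjacency dictionary, a standard breadth-first search stands in for the customized one, the IR result sets are simply taken as given, and all program and feature names (P1, F1, ...) are hypothetical. The filter is applied to the CG sets before intersecting them with the IR sets, which is one possible reading of the description above.

```python
from collections import deque


def reachable_programs(call_graph, entry_points):
    """Breadth-first search over call edges, starting from the
    (assumed) menu entry programs of one feature."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        program = queue.popleft()
        for callee in call_graph.get(program, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def filter_unspecific(feature_programs, n):
    """Keep only programs that are connected to at most n features."""
    counts = {}
    for programs in feature_programs.values():
        for program in programs:
            counts[program] = counts.get(program, 0) + 1
    return {
        feature: {p for p in programs if counts[p] <= n}
        for feature, programs in feature_programs.items()
    }


def essential(cg_sets, ir_sets):
    """ESS: programs assigned to a feature by both techniques."""
    return {f: cg_sets[f] & ir_sets.get(f, set()) for f in cg_sets}


# Hypothetical toy data: call edges, menu entry points and IR matches.
call_graph = {
    "P1": ["P2", "P9"],
    "P2": ["P3"],
    "P4": ["P5", "P9"],
    "P6": ["P9"],
}
menu_entries = {"F1": ["P1"], "F2": ["P4"], "F3": ["P6"]}
ir_sets = {"F1": {"P2", "P3", "P5"}, "F2": {"P5", "P9"}, "F3": {"P6"}}

cg_sets = {f: reachable_programs(call_graph, e) for f, e in menu_entries.items()}
ess = essential(cg_sets, ir_sets)
ess_filtered = essential(filter_unspecific(cg_sets, n=2), ir_sets)
```

In this toy example the shared program P9 is reachable from all three features, so with n = 2 it is dropped from every CG set and therefore also from the filtered ESS set of F2.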

4. Feature analysis using call graph communities

4.1. Overview

Community detection provides a grouping of programs with denser inner connections. As already established, feature detection also results in a grouping of the programs. This means that we can compare these two sets of results relatively easily. This comparison can be useful in many ways, enabling us to gain new information about the feature model and even to refine the output of the feature extraction process itself. In this section, we present our experiments with various community detection algorithms and investigate their potential in aiding product line adoption when combined with the previously constructed outputs of feature extraction.

4.2. Approach

Communities are groups of nodes in a graph that are more densely connected internally than to the rest of the graph. They are a frequently researched topic in network science and contribute highly to scientific and industrial research. Groups of densely connected nodes exist in almost every network. Since communities are not strictly defined, several different correct groupings of graph nodes can exist. A large number of community detection algorithms are in use, some even specialized to a specific field such as social networks.

In our current research, we consider the programs of the system as the nodes of the graph and perform the community detection on them. The edges of the graph are the calls the programs make, extracted by static analysis. The detected features are not indicated in the graph and play no part in the community detection process. As the call edges are denser inside the communities, we can expect that these programs work together more closely and thus perform similar tasks. This is exactly what characterizes features too, so we would expect significant overlaps between communities and detected features.

A key value in community detection is modularity, a scalar value between -1 and 1 that represents the density of the edges inside communities compared to the edges between communities. Theoretically, optimizing this value results in the best possible grouping of nodes. Computing this completely is rather complex, thus in practice we rely on different heuristic algorithms. During our experiments we considered the following community detection algorithms provided by the R software environment:

Edge Betweenness: The edge betweenness score of an edge is the number of shortest paths that pass through it. The algorithm removes the edges in decreasing order of these scores.
Fast Greedy: Each node starts as an individual community, and these are merged together in a locally optimal manner considering the largest increase in modularity.
Leading Eigenvector: In each step the graph is split into two parts in a way that maximizes modularity. This is determined by the leading eigenvector of the modularity matrix computed on the graph.
Louvain: A multi-level algorithm that consists of repetitions of greedily assigning locally optimized small communities and then considering these assigned groups as individual nodes.
Walktrap: Starts random walks from the nodes and, because these walks tend to stay within communities, determines community merges according to them. It has a parameter t which represents the number of steps in each random walk.

To quantify the results of these community algorithms, we decided to use some metrics that can contribute to the understanding of the output. One obvious metric is the number of communities assigned. Different algorithms produced a varying number of communities on the same data. For easier handling we experimented with demanding a previously defined number of communities. This approach did not do well, since it often produced a large number of tiny communities and some oversized ones, which separated the graph very badly and provided no real information value. We computed the following metrics for further investigation:

NoC: Number of communities.
MFC: The maximal coverage of a single feature.
NoCSC: Number of communities with significant coverage. We consider coverage significant if the community covers at least 10% of the programs of a feature.
AFC: The average percentage of programs of features covered by communities with at least 10% coverage.

4.3. Results

4.3.1. Comparison of community algorithms

Table 1 presents the results of the metric values measured on the top level of features. Each row represents the results of a different community algorithm.

Table 1. Results of our analysis with the five community algorithms for top level features.

Algorithm              NoC    MFC       NoCSC    AFC
Edge Betweenness       123    31.81%    6        45.84%
Fast Greedy            44     31.81%    7        57.09%
Leading Eigenvector    29     45.35%    9        61.75%
Louvain                32     36.36%    10       52.91%
Walktrap (t=40)        57     40.91%    6        56.63%
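The quantities used in this section can be made concrete with a short Python sketch. It is an illustration under assumptions rather than the computation behind Table 1: modularity is computed with the usual formula for an undirected, unweighted graph, coverage is read as the share of a feature's programs that falls into one community, a community counts toward NoCSC if it reaches the threshold for at least one feature, and AFC averages over features the share of programs lying in such significant communities. All names and the toy data are hypothetical.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition, treating the graph as undirected
    and unweighted (an assumption made for this sketch)."""
    m = len(edges)
    internal = {}  # number of edges with both ends in the community
    degree = {}    # sum of node degrees per community
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def coverage_metrics(feature_programs, community_of, threshold=0.10):
    """Return (NoC, MFC, NoCSC, AFC) for one community assignment."""
    max_coverage = 0.0
    significant = set()
    covered_per_feature = []
    for programs in feature_programs.values():
        counts = {}  # programs of this feature per community
        for p in programs:
            c = community_of[p]
            counts[c] = counts.get(c, 0) + 1
        covered = 0
        for c, k in counts.items():
            coverage = k / len(programs)
            max_coverage = max(max_coverage, coverage)
            if coverage >= threshold:
                significant.add(c)
                covered += k
        covered_per_feature.append(covered / len(programs))
    afc = sum(covered_per_feature) / len(covered_per_feature)
    return len(set(community_of.values())), max_coverage, len(significant), afc


# Toy graph: two triangles joined by a single edge, split into two communities.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"), ("c", "d")]
q = modularity(edges, {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2})

# Toy features: A has 20 programs (12 / 6 / 1 / 1 over four communities),
# B has 5 programs (3 / 2 over two communities).
community_of = {}
for i in range(20):
    community_of[f"a{i}"] = "c1" if i < 12 else "c2" if i < 18 else "c3" if i == 18 else "c5"
for i in range(5):
    community_of[f"b{i}"] = "c3" if i < 3 else "c4"
feature_programs = {
    "A": [f"a{i}" for i in range(20)],
    "B": [f"b{i}" for i in range(5)],
}
noc, mfc, nocsc, afc = coverage_metrics(feature_programs, community_of)
```

Here community c5 holds only 5% of feature A and nothing else, so it is the one community that is not counted as significant, while c3 is significant because of its share of feature B.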
For Walktrap we have chosen a walk length of t=40, which according to the results of different experiments provided the most suitable number of communities. The Leading Eigenvector method also works with a parameter; in this case, we have chosen 29 for similar reasons. Fig. 9 shows the number of communities at each value of the parameters of these techniques, listed from 2 to 250. A large number of communities can present too much granularity, especially for comparison with the top level of features, since we have only ten features on this level. With a greater number of features, a large number of communities can become beneficial.

Fig. 9. Number of communities at various parameter settings of the Walktrap and Leading Eigenvector methods.

The results of the table indicate that Edge Betweenness detects a significantly larger number of communities than the other algorithms. These are usually very small, with a few larger communities. The MFC value represents the largest coverage of a single feature, meaning that, for example, in the Edge Betweenness case there was a community which covered 31.81% of a single feature, which was the largest value in this scenario. As we can see, this number can vary greatly across different algorithms. This number, however, only describes the coverage of a single community rather than representing all of them. NoCSC, on the other hand, provides an exact number on this property, representing the number of communities that cover at least 10% of a feature. As we can see, this is not a large value in any case. The lowest values were achieved by the algorithms with the highest NoC values, which is not surprising since these qualities can be somewhat opposed: with more granularity, the smaller groups are bound to cover less of each feature. AFC represents the total coverage achieved for each feature on average, considering communities with at least 10% of a feature covered. As we can see, this number is around 55%, with Edge Betweenness differing significantly, which is most probably also due to the high granularity. This means that on average more than half of the programs of a feature are located in communities containing a significant part of that feature. It suggests that the main functionalities of the features are mostly achieved in communities specialized to that single feature or in some larger communities which cover an important part of several features. Fig. 10 illustrates the rate of coverage for each feature by communities that represent at least 10% of the feature’s activity.

Fig. 10. The proportions of communities determined in case of each feature at top level with Walktrap (t = 40) detection and a 10% feature coverage filtering.

4.3.2. Feature analysis using communities

Domain experts can play a significant part in the adoption of a new product line architecture. Complete knowledge of the features is invaluable to their work. Aiding this, community detection can highlight previously unknown aspects of the systems. As the community detection is based on actual call edges of the call graph, the programs attached to each other are indeed meant to interact during the proper run of the system. With this knowledge, we can assume that the programs working towards the same goal should have more calls to each other, which makes communities an ideal and easy way to form groups of these programs.

Possessing knowledge of groups of programs inside the system can also aid the testing process greatly. Testing the product line usually tests the functionalities the system provides; in other words, it tests the features themselves. While the features are identified with proper feature extraction, testing is still done on the code itself. As a consequence, testing a single feature can involve programs that are not foreseen by domain experts. Community detection can provide a way for domain experts to form realistic expectations of the involved parts of the system, thus representing an additional valid and not overly complex perspective on the system.

As mentioned before, we have experimented with five different community algorithms. Of these we have chosen Walktrap because it produced a manageable number of communities which still seemed relatively unanimous. We remark that all community detection algorithms produced similar results, and we did not see any significant differences that could not be explained mainly by the number of communities identified. One important advantage of Walktrap is that it can produce a variable number of communities. Changing its parameter we can get more, which can improve the situation on lower feature levels where more features


Fig. 11. Three communities with a well distinguishable goal. Each node represents a program of the system, while its color marks its feature affiliation. See the node text for cases where more than one feature is assigned. The edges are the calls the programs make, while the circular groups represent their assigned communities. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

are present which are better represented with more smaller communities rather than a few medium sized ones. A graph illustrating top level feature results with Walktrap communities can be found in the appendix in Fig. A1; we will also analyze some of these communities in the current section. Each node represents a program, and the edges are the calls the programs make according to the call graph. The colors of the nodes represent the features that they were assigned to in the call graph based feature extraction. This coloring is not complete, since there are programs with more than one feature and we could only color each node according to one of these. However, the names of all assigned features appear on the nodes, so there is no information loss. Communities are represented by the location of the nodes: each community forms a circle of its programs. We use the same notation in all the following community examples. Looking at the whole graph, communities of various kinds are present. As with every community detection method, we can observe some really large communities that usually involve programs of many different features. On the other hand, we can see many smaller ones with less variance. For example, the three communities presented in Fig. 11 are not particularly large, but each can be considered as belonging mainly to a single feature. It is apparent that the left community is part of Administrator interventions, while the central community belongs to Supplier order management, as does the community on the right, which also takes part in the work of several other features. Orange nodes tend to be more general in the whole graph, supporting many features, with only four communities having programs that deal only with Access management. This can mean that Access management is usually more interwoven with other features. The same can be said about Manufacturing to an even higher extent. 
Administrator interventions, on the other hand, owns a great number of communities exclusively. While small differences inside a community can be present, each community detection algorithm produces a high number of communities that are very similar to these, centered around different features. At the current level, almost every top level feature has at least one community in which it is very dominant; the two exceptions are Manufacturing and Master file maintenance. Interface only seems to appear in communities with many Access management programs, which can suggest its heavy reliance on Access management. In Fig. 12 we can see another example taken from the graph. We can see a large community which consists of programs of many

features and also a medium sized one with two dominant features, making several calls to programs of the large community. The figure only contains the edges that have both endpoints in these communities, but it is already apparent that there are two programs inside the large community which are targeted by a very large number of program calls. If we take a look at the full picture in the appendix, we can see that there are many more call edges to these programs from various communities. Since these programs represent many features, this suggests that they are some of the most essential programs that almost every feature relies on. They can also be the cause for the detection of such a large community, since many programs rely solely on them. It is also interesting that, while targeted by many calls, they do not seem to make any. This implies that they provide some service that is routinely needed rather than playing a vital controlling part. We can also see some gray nodes that feature extraction did not assign any features to. This can potentially mean abandoned or not yet used code, or simply be a result of the imperfection of the feature extraction process. It is not apparent from the graph examples, but these nodes do serve as an endpoint of a call edge, since programs that neither make nor are targeted by calls do not appear. Communities could also extend feature extraction outputs: if we encounter featureless programs inside a community with a highly dominant feature, we can suspect that they serve the same feature. Applying community detection algorithms on the programs of a Magic application, we have found that the communities overlap with our feature extraction output. Apart from a few large, more general communities, we can detect several clusters of programs that are specialized to achieve a single feature. Our investigation showed that the majority of top level features are represented by at least one community of their own. 
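The idea of extending feature extraction with communities, mentioned above, can be illustrated with a minimal sketch. It is not the procedure used in the project: the function name, the input dictionaries and the `min_share` dominance threshold are all hypothetical, and the share is computed only over programs that already have a feature.

```python
from collections import Counter


def suggest_features(communities, program_features, min_share=0.8):
    """Suggest a feature for featureless programs of feature-dominated communities.

    communities: dict mapping community id -> iterable of program names.
    program_features: dict mapping program name -> set of feature names;
        a missing or empty entry means the program has no feature assigned.
    min_share: hypothetical dominance threshold (not taken from the paper).
    """
    suggestions = {}
    for programs in communities.values():
        programs = list(programs)
        labelled = [p for p in programs if program_features.get(p)]
        if not labelled:
            continue
        # Count in how many labelled programs each feature appears.
        counts = Counter(f for p in labelled for f in program_features[p])
        feature, hits = counts.most_common(1)[0]
        if hits / len(labelled) >= min_share:
            for p in programs:
                if not program_features.get(p):
                    suggestions[p] = feature
    return suggestions
```

A suggestion produced this way is only a candidate for review by a developer or domain expert, since a featureless program may just as well be abandoned code.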
While sometimes different specialized communities also arise or some disappear, the same conditions can be observed with all 5 community detection algorithms we have involved in the experiment.

4.3.3. Analysis at deeper level of features

The results presented above only involve the top level features provided to us by domain experts. These features can be further detailed to a second feature level, and the analysis can be executed there as well. This can result in a more comprehensive picture. A further advantage of community detection on a lower level is that the algorithms used do not depend on feature level, and


Fig. 12. A larger, more general community involving many features and a medium one with two main features. Each node represents a program of the system, while its color marks its feature affiliation. See the node text for cases where more than one feature is assigned. The edges are the calls the programs make, while the circular groups represent their assigned communities. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

they tend to identify more communities than the number of top level features at our disposal. Usually, there are many very small communities. These tend to perform subtasks of the same task and also belong to the same feature. Since they are small, it is a fair assumption that they may represent lower level features working together for the same smaller goal, contributing to the top level feature. Comparing lower level features with these smaller communities can confirm or disprove this. Additionally, there are larger communities which are likely to involve more lower level features; these can be seen as working more closely together. Thus community detection can highlight valuable information at lower feature levels about both lower and top level features. With this, we can gain more information about the actual inner workings of the system, which is worthy of examination. Level 2 features can also present more interesting information because the top level of features is often considered too general for actual work, while second level features represent a more detailed picture of the actual functions the features provide. Since there are only 10 features at the top level while there are 49 on the second, community detection at this level can also lead us to valuable observations. Results of second feature level community detection are fully available in the appendix in Fig. B1. Three of its communities are presented here in Fig. 13. The feature names in these figures were omitted because of the length of their qualified names and their large quantity. Instead of their names, we have displayed numbers representing their level 2 features. While many programs still represent only one feature, it is not rare here for a program to belong to tens of features. At the second level, it is observable that specialized communities are still present in large numbers. 
Programs that only serve one feature tend to occur together mostly with only a few programs that also work on some additional features. The

communities presented in the example are actually the same as the ones we have seen in Fig. 11, now with the second level of features. Apparently, no significant changes have occurred. From one aspect this is not surprising, since the call edges themselves did not change, only the feature classification of programs. This means that the communities are bound to remain the same and only consist of programs belonging to different features. The programs belong to the same top level feature as before, now assigned to a more specific subfeature. This means that adopting a more specific level of subfeatures did not change the inner variability of these communities. The same can be observed in general: most of the specialized communities retained this quality. On the other hand, two major differences from the top level can be seen. One of these is that programs belonging to a large number of top level features seem to have gained even more subfeatures in this scenario. This means that these programs take part in even more features. The interesting aspect of this is that there seems to be no middle ground: a program either supports just a select few features or almost every second level feature. According to this, we could easily divide programs into two categories, specialized and general programs. The other difference is that the large communities seem to have gained even more variance; they contain a significantly higher number of different features, which means that their features consist of more subfeatures. Considering their size this is not surprising, yet now it is apparent that while these subfeatures rely more on programs of other subfeatures, the programs of the more specialized communities tend to work mostly in separation. This can be an important distinction and can provide valuable information in many cases. For example, if we are contemplating a change, we can see how much unintended impact that change can have for each subfeature. 
Specialized communities tend to be more isolated,


Fig. 13. Three communities from second level feature extraction.

hence they can be more easily changed or turned off as a feature. This can be crucial both for product line adoption and for maintenance reasons. Adopting a deeper level means that the number of features increases significantly, which could potentially divide specialized communities and produce more general ones. As we have found, this is rarely the case. These communities seem to retain their good quality of being largely dominated by a single feature even on lower levels. As the number of features rose from 10 to 49, we experienced no significant change in the ratio of specialized to more general communities. This can even verify the feature model itself, since the programs belonging together according to the feature list seem to also work more closely together in reality.

5. Insights into the progress of SPL adoption

In this section, we describe how the project progressed through time and examine the effects of our work. The system and the feature list are constructed in an iterative manner, hence we can compare their versions. We have several versions of both at our disposal, with 4 separate versions of the system under construction and 7 stages of the feature list. We refer to the system versions as SV1 to SV4 and to the feature list versions as FV1 to FV7. Let us look at the changes made to the feature list first. The rates of feature additions and removals can be seen in Fig. 14, where the circles correspond to the transitions between versions. Each circle consists of four tracks which represent the changes on their specific feature level. The levels are in increasing order from the center: level 1 represents the broadest categorization of features, while level 4 is the most detailed view. We depicted the menus here as a fourth level since through most of our work they could be handled as such. As the list was being developed, features got added to or removed from each level of the list. 
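Counting additions and removals per level between two feature list versions, as summarized in Fig. 14, can be sketched as a set comparison. This is only an illustration: it assumes that each version is available as a mapping from level number to a set of feature names and that features are matched by name, so a renamed feature would show up as one removal and one addition.

```python
def diff_feature_lists(old, new):
    """Count feature additions and removals per level between two versions.

    old, new: dict mapping level number -> set of feature names
        (hypothetical representation of a feature list version).
    Returns a dict mapping level -> (added_count, removed_count).
    """
    changes = {}
    for level in sorted(set(old) | set(new)):
        before = old.get(level, set())
        after = new.get(level, set())
        # Features only in the new version were added, only in the old removed.
        changes[level] = (len(after - before), len(before - after))
    return changes
```

Applied to each consecutive pair of versions, such a function would yield one pair of counts per track of a transition circle.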
Though there is no guarantee that a new iteration necessarily produced a better feature list, domain expertise is still the most reliable source of information in these cases. The changes presented here are the modifications the domain experts deemed necessary. Let us take a more detailed look at the figure by addressing some of the major changes. The FV2 to FV3 transition introduced a few new features, the most interesting of which are two new level 1 features. In the FV4 to FV5 transition, a great number of features were added, level 2 got 94 new features, level 3 got 223, while 98 new menus were introduced. This represents a major update to the

feature model and the new additions make up a significant part of the new feature list. Some deletions also happened, most notably one first level feature has been removed. In the FV5 to FV6 transition, we can see the removal of 20 level 2 features, 20 level 3 features and 15 menus. The number of level 2 features was decreased much more severely in the FV6 to FV7 transition, where 72 features, more than half of the total (including many of the relatively new ones), have been removed. Apart from complete additions or removals, a significant share of these changes stems from splitting up an original feature or merging two of them. We have also experienced several more special cases where features got “promoted” or “demoted” to another feature level, including even the level of menus. We examined these changes and detected 9 feature promotions in total, while 2 of the features got demoted. None of the features moved two levels or moved twice. Fig. 15 illustrates all of the feature movements that happened during the feature list version transitions currently referenced. Let us now address the versions of the system under construction. We have access to four different, relatively recent versions of the system. The versions of the feature list and the versions of the system had different milestones, hence these two sets of data are not completely synchronized. As the building of the system itself follows the feature list, the feature list should always be considered more up to date than the current state of the system. Fig. 16 illustrates the progress of the project during the time the feature list and system versions examined in this section were created. We can also see a variant depicted as the base variant; this is the chosen variant the product line is being built upon. There is also a feature list version named FV0; this is the feature list we did our experiments on and the one we chiefly write about in the previous and following sections. 
There have been several other versions of both feature lists and systems, these can be seen in the figure as the initial development iterations. This period also introduced considerable changes, even the first level of the feature list was expanded with three new features. From the timeline, it is apparent that the feature list milestones followed each other quite rapidly, while the development generally took up more time. Considering system versions, the properties of each currently referenced milestone can be seen in Fig. 17. The columns represent the number of programs of each feature, thus display information about the size of a feature. We can see at first glance that during this time no extraordinary changes were made, but we can detect some small changes at each version transition of


Fig. 14. The changes made at each transition of feature list versions.

Fig. 15. A summary of feature level transitions in the seven feature list versions, the green blocks with arrows on the left represent promotions while the red blocks with arrows on the right represent demotions. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

Fig. 16. The timeline of created feature list and system versions currently referenced.

the system at every single feature. The largest differences present themselves at the latest version change, particularly in the case of the Customer order management feature and much less prominently at the Administrator interventions feature where it experienced a significant rise in the number of programs. This number tells us a lot about which features have been modified through

time but it is not complete, since the programs themselves can be subject to modifications as well, thus the consequences of the changes do not necessarily show up in the numbers. That is why we also displayed the complexity of these features at each version, represented by the lines of the figure. In our previous work (Kicsi et al., 2018) we described our complexity measurement techniques regarding these features and presented the Halstead Difficulty metric for features in Magic systems, which is an aggregation of the difficulty metrics of all programs of a feature. This metric reflects complexity as a measurement of fault sensitivity. From the figure, we can see that, considering complexity, the most prominent change also happened at the last version transition of the system and, as we have seen with the feature sizes, also at the Customer order reception feature, where there is a major decrease in the complexity of the feature. This means that the then newly added 125 programs are generally less complex and fault sensitive than the feature’s previous programs. Several other minor changes and tendencies are also visible: the size of nearly every feature seems to grow very slowly, while the complexity seems to show a very slight decrease through time in the case of most features. As it was visible from the timeline (Fig. 16), the system versions follow the feature list changes much more slowly, and the development is still underway. It is nonetheless apparent that the feature list and system versions seem to be headed in the same direction. Comparing them, we can see that in the earlier versions (see Fig. 4) there are several features (e.g. Feature Complaint management and Deposits at the top level) that are not even represented in the call graph of the system (see Fig. 7) because they do not make a single call. Comparing the latest examined versions, however, we can detect far fewer of these only nominally existing features. This change is certainly welcome since it indicates that the system’s progression indeed follows the feature list relatively well.

Fig. 17. A comparison of the four versions of the system regarding their number of programs (NP) and their complexity (HD).
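As a reminder of what a feature-level Halstead Difficulty may look like, the sketch below uses the standard Halstead formula D = (n1 / 2) * (N2 / n2), where n1 is the number of distinct operators, n2 the number of distinct operands and N2 the total number of operand occurrences. How operators and operands are counted in Magic programs is defined in the cited previous work and is not reproduced here; summing the program values is only an assumed form of the aggregation, and all function and parameter names are hypothetical.

```python
def halstead_difficulty(distinct_operators, distinct_operands, total_operands):
    """Standard Halstead Difficulty: D = (n1 / 2) * (N2 / n2)."""
    if distinct_operands == 0:
        return 0.0  # no operands: treat difficulty as zero to avoid division by zero
    return (distinct_operators / 2) * (total_operands / distinct_operands)


def feature_difficulty(feature_programs, program_counts):
    """Aggregate program difficulties for one feature.

    feature_programs: iterable of program names assigned to the feature.
    program_counts: dict mapping program name -> (n1, n2, N2) counts.
    The sum is an assumed aggregation; the paper only states that the
    feature metric aggregates the difficulties of the feature's programs.
    """
    return sum(halstead_difficulty(*program_counts[p]) for p in feature_programs)
```

With a sum, adding many simple programs to a feature still raises the total, so a decrease such as the one reported above would correspond to a different aggregation (for example a mean) or to changes in existing programs; the sketch does not settle which applies.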

6. Evaluation

In the current section, we aim to evaluate our methods by comparing our results to ones given by human experts working for our industrial partner. We involved two developers and two domain experts working on the project and asked them to fill out our questionnaires based on their expertise. These experiments were conducted retrospectively on the feature version identified as FV0 and the base variant, both referenced in previous sections. For the evaluation we propose the following research questions:

RQ1: To what extent do our structural and conceptual information based feature extraction methods contribute to the correct mapping of features?

RQ2: What additional, not inherently available feature coupling information do communities reveal?

6.1. Feature extraction outputs

For RQ1 we have devised the following evaluation process. We have chosen 18 programs at random from each of our main feature extraction outputs and also from outside all of the outputs. We shuffled these programs into a single list of 72 programs. This process was performed with three different randomly chosen features, thus three lists were assembled with 216 programs overall. The programs were represented both by their names and their internal identifiers, so the developers could also look them up in the system itself. We asked two developers familiar with the system variants to separately judge to what extent each program is connected to the feature it was listed under. The developers were only provided with the programs listed for each of the three features and were told only the possible choices they could make, with no further explanation of the size or number of different program sets involved. The results gathered with this questionnaire are displayed in Fig. 18. We can see the developer opinions grouped into four separate diagrams: one for each of the three chosen features and one for the overall sum. The first developer’s impressions are displayed on the left side of each diagram, while the opinions of the second developer are presented on the right. IR depicts the output of our information retrieval based approach, CG represents the output based on the call graph, while ESS depicts the programs extracted by both the IR and CG processes, which are considered most essential for comprehension. The programs represented by the Unrelated set are chosen from outside of these other sets. The sets contained 18 programs each and are all disjoint from each other. 
Maintaining these criteria the 18 programs were chosen randomly from each set. We have allowed -1 as an option for the case a developer could not come to a conclusive answer, thus there are several cases where the sum of our three differently colored columns does not come up to 18. We still have a minimum of 15 evaluated programs for each case.
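The sampling scheme described above can be sketched as follows for one feature. The sketch assumes that the IR and CG sets shown to the developers exclude the programs found by both methods (which form ESS), since the four sets are stated to be disjoint; the function name, parameter names and the fixed seed are hypothetical.

```python
import random


def build_questionnaire(ir, cg, all_programs, per_set=18, seed=0):
    """Draw per_set programs from four disjoint pools and shuffle them together.

    ir, cg: programs returned for one feature by the information retrieval
        based and the call graph based extraction.
    all_programs: every program of the system.
    Returns a shuffled list of (program, origin) pairs; the origin is kept
    for later evaluation and would not be shown to the developers.
    """
    ir, cg = set(ir), set(cg)
    pools = {
        "ESS": ir & cg,                            # found by both methods
        "IR": ir - cg,                             # found only by information retrieval
        "CG": cg - ir,                             # found only by the call graph
        "Unrelated": set(all_programs) - ir - cg,  # found by neither method
    }
    rng = random.Random(seed)
    chosen = []
    for origin, pool in pools.items():
        # sorted() makes the draw reproducible; sample() raises ValueError
        # if a pool holds fewer than per_set programs.
        chosen += [(p, origin) for p in rng.sample(sorted(pool), per_set)]
    rng.shuffle(chosen)
    return chosen
```

Repeating the call for three features gives three lists of 72 programs, 216 in total, matching the numbers reported above.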


Fig. 18. Evaluation of our feature extraction outputs according to two developers.

Fig. 19. The sum of evaluation answers given by the developers.

It is visible that in most cases the developers found our combined results the most relevant of the sets provided, the ESS set displaying both the highest green and lowest red columns in the majority of cases. It is also clearly visible that the Unrelated set containing programs not featured in any of our outputs scored the worst in every single case. The only divergence from the success of the ESS results comes up at the Quality Control feature where their results seem to be highly surpassed by the values of CG. There are significant differences between the answers given by developers, this partly stems from their own definitions of a program being connected to a feature and being relevant. For example, it can also be seen that in general Developer 1 seems to consider the IR results significantly more valid and relevant than Developer 2. In Fig. 19 we summarize the answers the developers provided on these program sets. We can see that the developers generally considered the combined results to be the best as this set scored the least in invalid connections and scored highest in both valid feature connection categories. We can see a similar difference between CG and IR as well as IR usually performed a little better than CG, though as already noted this preference was not equal among the two developers. The Unrelated set scored the worst in every case which is not surprising but still demonstrates that our outputs hold valid information and can be used as a reference.

Answer to RQ1: Based on the evaluation results, our outputs hold valid feature extraction information, with overall at least 60% valid matches based on the opinion of developers. Our output combining structural and conceptual information contains the more relevant programs of a feature, with more than 70% valid matches overall and over 45% also being relevant. At least a third of these combined matches were relevant in every single case.

6.2. Communities

Seeking an answer to our second research question, we also decided on manual evaluation. Since our community detection based output aims to contribute mainly to the work of domain experts, we asked two domain experts working on the project to provide answers regarding the mutual dependency of features. We listed the 10 high-level features introduced in Section 3 and asked the experts to fill out a table in accordance with their views on how likely it is that a change occurring in either of two features would lead to a need to also change the other feature. The possible answers were zero for unlikely, one for uncertain and two for certain. Fig. 20 presents the results of the questionnaire; the answers given by the two domain experts are marked red and blue. The


Fig. 20. The answers given by domain experts and a comparison of their average with community based assessment. (For interpretation of the references to color in the text, the reader is referred to the web version of this article.)

color is darker if the domain experts have thought the dependency certain and lighter if they found it uncertain, the white fields represent the choice for unlikely. To make the community data measurable we decided to quantify to what extent two features belong to the same communities. We computed their relative weight inside each community compared to other features and we normalized this with the size of the community. The sum of these results gives our community dependency value which can be divided into multiple categories. This is explained in the equation below where C is the set of communities, F is the feature set, w1 is the first feature’s weight, and w2 is the second feature’s weight. This resulted in a number between 0 and 3 in each case.

$$
\text{Community Dependency} = \sum_{c \in C} \frac{w_1^2 + w_2^2}{\sum_{f \in F} w_f} \cdot \frac{1}{\mathrm{size}(c)}
$$
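A direct transcription of the community dependency value into code may help in reading it: for every community, the squared weights of the two features are added, divided by the total feature weight in that community, and normalized by the community size; the results are summed over all communities. The sketch assumes that the weight of a feature in a community is the number of its programs in that community, which is not spelled out here, so the scale of its output need not match the 0 to 3 range reported above. All names are hypothetical.

```python
def community_dependency(f1, f2, communities, program_features):
    """Sum over communities of (w1^2 + w2^2) / (sum of all weights) / size(c).

    communities: dict mapping community id -> iterable of program names.
    program_features: dict mapping program name -> set of feature names.
    Assumption: a feature's weight in a community is the number of
    programs of that feature located in the community.
    """
    total = 0.0
    for programs in communities.values():
        programs = list(programs)
        weights = {}
        for p in programs:
            for f in program_features.get(p, ()):
                weights[f] = weights.get(f, 0) + 1
        weight_sum = sum(weights.values())
        if weight_sum == 0:
            continue  # empty community or no feature information in it
        w1, w2 = weights.get(f1, 0), weights.get(f2, 0)
        total += (w1 ** 2 + w2 ** 2) / weight_sum / len(programs)
    return total
```

To compare such values with the zero, one and two answers of the domain experts, they would still have to be divided into categories; the category boundaries are not given in this section and are left out of the sketch.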



The Community Connections part of Fig. 20 shows how this number of each feature pair compares to the average of the opinions of the two domain experts. The black fields note the cases the community-based result showed lower connection probability level while matching connectivity levels are represented by green fields. The yellow and orange fields represent the cases where the communities suspect a higher chance of connection, the orange is a more significant difference. While the information possessed by domain experts is invaluable in the adoption process it is visible that there is a difference in their opinions which is significant in some cases like the connections of Stock Control and Invoicing or Stock Control and

Supplier order management. Since the domain experts have no ready knowledge of every single program, these differences are inevitable. While this assessment only involved the highest level of features, which are very general, we can imagine how much harder this task could be on a level with tens or even hundreds of features. Furthermore, it is visible that the green fields are in the majority, which can mean that the community detection output is similar to the average of domain expert views, and thus holds real and potentially useful information about the dependencies between features. We note that the average of the votes of domain experts in the figure is based on the rounded down value of the average; if we rounded up, we would get more black fields but also even more green fields. This shows that if domain experts take a more cautious approach, the data of the communities can match their views even better. Our results also fall closer to the answers of the individual domain experts than those answers do to each other: the absolute difference between the two experts is 24, while the community dependency shows absolute differences of 20 and 22 from them. Although we can see that the communities can contain similar information to what the domain experts use, we also know that they build on structural information, as they work on the call graph of the system. This means that they contain additional information that highlights a more structural level, and while calls inside the system do not necessarily occur in every case, they are still worth taking into account even for a quick inspection.

Answer to RQ2: According to the evaluation, the call graph communities represent real feature coupling information that approximates domain expert knowledge well while relying only on structural information, and thus can contribute to the decisions of a domain expert by providing another informed viewpoint.

6.3. Discussion

In this subsection we overview the possible uses of the methods presented. Fig. 21 highlights how various feature extraction techniques can be used to help in building the new product line architecture.

• Structural extraction - Provides a detailed, widespread analysis. It is good for developers since they are required to have knowledge of all of the programs called by a feature.
• Conceptual extraction - For domain experts, on the other hand, all called programs can be too much. This approach introduces conceptual dependencies but may contain too much noise for smooth work.
• Combination (ESS) - Grasps the essence of features, more fit for domain experts. While constructing the new architecture, the domain experts need to judge properly which parts of the variants should be adopted. In this decision making process, the results of this combined extraction greatly decrease complexity, and they can also facilitate test planning in the future.
• Community detection - Identifies program communities over the call graph that work closely together while having fewer outer relations.
• Community matching - Matches program communities with features in the code. Can be used to find differences between the view of the domain experts and the actual implementation of features.

Fig. 21. The possible ways of usage of the results of various feature extraction techniques in helping product line adoption.

As our evaluation pointed out, the developers found the programs in our result sets much preferable, as a source of feature mapping information, to the other, unrelated programs. Besides these, we would like to mention some other possible ways to use the results. Firstly, connections are not necessarily observable through the calls of the system; programs can, for instance, be connected by accessing the same data objects. This means that not every connection will present itself on the call graph. Such connections, however, can be found via conceptual feature extraction, since programs using the same data are likely to be conceptually connected to the same feature. This is why the programs detected by the conceptual extraction but not discovered via structural information can still be valuable. The developers seem to confirm this as well, since in our evaluation the results found only by information retrieval scored even higher in their regard than the call graph based results. However noisy, conceptual data still represents a ready source of the programs semantically connected to each feature, hence it can also be a useful information source. Additionally, both the structural and the information retrieval based methods can be tailored to our intent: by filtering out the more general programs of the call graph, which makes it possible to form even better separated program sets, and by changing the semantic similarity thresholds in the conceptual method.

As our evaluation pointed out, communities hold valid information about features, and opinions similar to those obtained from domain expertise can be extracted from them while working on a more structural level. Community detection and matching can aid the understanding of features and of the current state of the system, and they can also be useful later, at the maintenance stage. Additional possible uses include the refinement of the feature model and easily comprehensible tracking of the evolution of programs. An appropriate community setting could also be optimized for a system that recommends possible splitting or merging of features.

We made our results available to our industrial partner, who has already commenced constructing the new SPL architecture. The work is proceeding well, product line adoption seems to be going according to plan, and the results of our experiments are utilized in the process.

Our current project required us to remain within the field of 4GL languages, but our future plans also involve implementing our approach for more traditional languages like Java or C++. Since call graphs can be constructed for these languages too, and their code also tends to contain a lot of natural language text, our method can be adapted to work in more traditional conditions.
Magic and 4GL languages in general are more data intensive, and in our specific, advantageous case features were readily connected with menu elements making direct calls inside the software. A similar correspondence between menu elements and classes can, however, also be achieved rather easily in many Java or C++ applications due to the constraints imposed by some frameworks.

7. Related work

The literature on reverse engineering 4GL languages is not extensive. When the 4GL paradigm arose, most papers dealt with the role of these languages in software development, including discussions demonstrating their viability. The paradigm is still successful, although only a few works have been published on the automatic analysis and modeling of 4GL or specifically Magic applications. The maintenance of Magic applications is supported by cost estimation and quality analysis methods (Verner and Tate, 1988; Witting and Finnie, 1994; Nagy et al., 2011a). Architectural analysis, reverse engineering and optimization are visible topics in the Magic community (Harrison and Lim, 1998; Ocean Software Solutions; Nagy et al., 2011b; Nagy et al., 2010), as is, after some years of Magic development, migration to object-oriented languages (M2J Software LLC).

SPL has a widespread literature, and over the last 8–10 years it has gained even more popularity. All three phases of feature analysis (identification, analysis, and transformation) are tackled by researchers. A recommended mapping study of recent work on feature location can be found in Assunção and Vergilio (2014).

In recent years, efficient community detection algorithms have been developed which can cope with very large graphs with millions of nodes and potentially billions of edges (Blondel et al., 2008). Originally, community detection was applied mostly to graphs that represent complex networks (e.g., social, biological, technological) (Fortunato, 2010), and it has also been suggested for software engineering problems (Hamilton and Danicic, 2012). Besides applications in testing and dependency analysis, software modularization (Mitchell and Mancoridis, 2006) is also addressed by community algorithms. Although modularization is related to reuse and a natural extension of the approach is to use community algorithms for feature analysis purposes, we are not aware of previous work on features and community algorithms in a 4GL context. Graph-based methods in general usually yield good results and scale well to large systems. For example, Xue et al. (2010) introduce a model differencing technique to detect evolutionary changes to product features. In this process, they rely only on feature sets, while we pointed out that the relationships with communities can also be valuable information.

Software product line extraction is a time-consuming task. To speed up this activity, many semi-automatic approaches have been proposed (Valente et al., 2012; Assunção et al., 2017; Haslinger et al., 2011). Reverse engineering is a popular approach which has recently received increased attention from the research community. With this technique missing parts can be recovered, feature models can be extracted from a set of features, etc. (Valente et al., 2012; Lima et al., 2017). By applying these approaches, companies can migrate their systems into a software product line. However, changing to a new development process is risky and may incur unnecessary costs. The work of Krüger et al. (2016) supports cost estimations for the extractive approaches and provides a basis for further research.

Feature models are considered first-class artifacts in variability modeling. Haslinger et al. (2011) present an algorithm that reverse engineers a feature model (FM) for a given SPL from feature sets which describe the characteristics each product variant provides. She et al. (2011) analyze configurations of the Linux kernel (a standard subject in variability analysis) to obtain feature models.

LSI has been applied for recovering traceability links between various software artifacts, even in feature extraction experiments (Xue et al., 2012; Al-msie et al., 2012; Eyal-Salman et al., 2013). The work of Marcus and Maletic (2003) is an early paper on applying LSI for this purpose. Eyal-Salman et al. (2012, 2013) use LSI for recovering traceability links between features and source code with about 80% success rate, but their experiments cover only a small set of features of a simple Java program. The main contrast between Eyal-Salman et al. (2013) and our methods is that instead of using family models and test cases to refine LSI results, we used the information of call graphs and communities for this purpose. An IR-based solution for feature extraction is combined with structural information in the work of Al-Msie’Deen et al. (2013). Further research deals with constraints in a semi-automatic way (Bagheri et al., 2012) for both functional and nonfunctional (Siegmund et al., 2012) feature requirements. For an overview of analysis methods in product line research, we refer the interested reader to the survey of Thüm et al. (2014).

Substantial research in similar fields has already pointed out that features are not independent of each other, and that the structure of source code resembles the structure of features (Xue et al., 2010; Klatt and Küster, 2013; Peng et al., 2013). Building on this intuition, metrics can be defined (Peng et al., 2013) based on structural similarity and can be used in a feature location method within an iterative context-aware approach. Besides these metrics, communities also provide relevant information about the structure of program code. In further contrast, our approach takes into account the internal structure of software for determining the relevance of the program elements to the features.

Several studies (Ullah et al., 2010; Valincius et al., 2013; Kelly et al., 2011; Paškevičius et al., 2012) have shown that variability analysis across different software levels can produce promising results. For example, Kelly et al. (2011) presented an approach for commonality and variability analysis across multiple software projects at different levels of granularity. They evaluated their work on 19 C/C++ operating systems, while our approach was tested on systems written in the Magic programming language. Feature levels are also a novelty in variability analysis. Pairwise and t-wise testing are leading methods in product line testing to address variance of features (Perrouin et al., 2010; Al-Hajjaji et al., 2016), where similarity methods are employed as well (Henard et al., 2014; Al-Hajjaji et al., 2014).
A possible future goal could be to make the system dynamically configurable, a problem known as Dynamic SPL (Lee et al., 2002; Baresi and Quinton, 2015; Horcas et al., 2014; Gamez et al., 2015; Bashari et al., 2017; Uchôa et al., 2017; Hinchey et al., 2012; Lee, 2006). Today, many application domains demand runtime reconfiguration, for which the two main limitations are handling structural changes dynamically and checking the consistency of the evolved structural variability model at runtime (Capilla et al., 2014). However, as reported in Bencomo et al. (2010), current research on runtime variability is still heavily based on decisions made at design time. Coping with the uncertainty of dynamic changes also needs to be addressed (Classen et al., 2008).

Static methods are commonly used individually for the detection of features (Klatt and Küster, 2013; Kästner et al., 2014; Paškevičius et al., 2012), but combining them is not common practice in the field. Hybrid approaches (Al-msie et al., 2012) combine two or more types of analysis with the goal of using one type of analysis to compensate for the limitations of another, thus achieving better results. Dynamic analysis (Klatt and Küster, 2013; Olszak and Jørgensen, 2009; Maia et al., 2008) is utilized in some of the related research to analyze execution scenarios, using methods which collect and analyze information about the execution of the artifacts. Search-based strategies (Lohar et al., 2013; Ullah et al., 2010) apply algorithms from the optimization field, such as genetic algorithms, to the localization of features.

As we can see, product line adoption is a very broad area, and the evaluation of the research results is often a challenging task. In traditional programming language environments, several approaches were introduced to overcome these obstacles.
For example, in a similar work (Hill et al., 2007) where structural and lexical information were combined, the researchers used the well-known precision and recall metrics to compare their approach against state-of-the-art techniques. This method is similarly used in several works (Wei Zhao et al., 2004; Al-Msie’Deen et al., 2013; Xue et al., 2012; Marcus et al., 2004). Although the evaluation process is qualitative and seems appropriate, it is not applicable in every case (Al-Msie’Deen et al., 2013; Poshyvanyk et al., 2006; Poshyvanyk and Marcus, 2007), which is even truer for non-traditional programming languages like Magic. Applying existing approaches to 4GL encounters several obstacles, since the entire environment is very different. First of all, there is no source code in the traditional sense, and these techniques usually rely heavily on it. The structure of these systems also differs from that of conventional projects. Finding an open source project is a very hard task. It is clear that the evaluation in these cases should be unique, and new approaches are needed. Several existing approaches can be adapted to the 4GL environment, although none of the papers we encountered deals with 4GL product line adoption directly.

8. Conclusions

An ongoing industrial project was presented, which is undergoing a software product line adoption process. In this work, we concentrated on the feature extraction and analysis aspects of the project, which are fundamental parts of the effort because further architecture redesign and implementation are and will be based on this information. For feature extraction we used two approaches: one based on computing structural information in the form of call graphs, and the other extracted from the textual representation of high level feature models and the code. The two pieces of information had to be combined and processed in such a way that the resulting models are most useful for project participants. Experimental results show that the final models are significantly more comprehensible, and hence directly usable (though in various forms) by domain experts, architects and developers. On the other hand, the high level view of domain experts is not necessarily in line with the actual implementation. We introduced community detection methods on the call structure of the system to match communities with feature code originating from domain experts. We evaluated our results by comparison with human expertise and have shown that our various outputs present knowledge that matches well with the thinking of experts but can still highlight additional insights. The proposed method, including the associated toolset, is currently in use by our industrial partner in this ongoing effort. Although the approach was implemented for Magic, a 4GL technology, we believe that the fundamental method could be suitable for other, more traditional paradigms as well after the necessary adaptations.

Acknowledgments

Ferenc Kocsis was supported in part by the Hungarian National Grant GINOP-2.1.1-15-2015-00370. András Kicsi, Viktor Csuvik, László Vidács, Ferenc Horváth and Tibor Gyimóthy were supported in part by the European Union, co-financed by the European Social Fund (EFOP-3.6.3-VEKOP-16-2017-00002). Árpád Beszédes was supported by the János Bolyai Research Scholarship of the Hungarian Academy of Sciences. The Ministry of Human Capacities, Hungary grant 20391-3/2018/FEKUSTRAT is also acknowledged.


Appendix A. Matching of communities with 1st level features

Communities and matching high level features are shown in Fig. A1.

[Fig. A1. Communities and their matching 1st level features. Feature labels appearing in the figure: Access management, Administrator interventions, Customer order management, Interface, Invoicing, Manufacturing, Master file maintenance, Quality control, Stock control, Supplier order management.]

Interface

Stock control

Invoicing

Access management Access management Access Administrator management Administrator Administrator interventions Access management interventions interventions Customer order Access Administrator management Customer order Customer order management interventions Administrator management management Quality control Invoicing Customer order interventions Interface Stock Invoicing control Manufacturing management Customer order Manufacturing Invoicing Invoicing Master file Invoicing management Master filecontrol Manufacturing Quality maintenance Manufacturing Invoicing Access management maintenance Master file Stock control Stock control Access management Quality Master file control Manufacturing Administrator Quality control maintenance Stock control Access management Administrator Manufacturing maintenance Master fileStock control interventions Stock control Stock control Quality control Administrator interventions Supplier order Manufacturing maintenance Quality control Access management Stock control Customer order Supplier order Stock control Manufacturing interventions Customer order management Quality control Stock control Stock controlmanagement Administrator management Supplier orde Manufacturing Customer order management Stock control Stock controlorder Supplier Invoicing Manufacturing interventions Invoicing Stock control management Invoicing Manufacturing management Supplier order Manufacturing Customer order Manufacturing Invoicing Manufacturing Invoicing management Manufacturing Master file management Stock control Stock control Manufacturing Master Manufacturing file Stock control Manufacturing maintenance Invoicing Manufacturing Access management StockStock control Access management Access management Manufacturing Master file control maintenance Manufacturing Stock control Stock control Manufacturing StockStock control Stock control Quality control control Manufacturing Administrator Administrator Administrator Access management Access management maintenance 
Quality control Stock control Master file interventions interventions Administrator Administrator interventions Quality control Stock control Access management Supplier order maintenance Customer order order interventions order Customer interventions Customer Stock control Supplier order Administrator Access management management Quality control management management Customerinterventions Customer order management order Administrator Supplier order management Stock control InvoicingInvoicingInvoicing Access management Access management management management order Customer order Manufacturing Customermanagement orderCustomerinterventions Supplier order Manufacturing Manufacturing Administrator Administrator Invoicing Invoicing management management management Customer order management Master file Access management Access management Master file Master file interventionsCustomer interventions Manufacturing Manufacturing Invoicing order Invoicing Stock control Invoicing management maintenance Administrator Administrator maintenance maintenance Customer order Master file Customer ordermanagement Customer order Master file Manufacturing Invoicing Quality control interventions interventions Quality control maintenance management management maintenance Quality control Master file management Invoicing Manufacturing Stock control CustomerInvoicing order Invoicing Quality control Stock control Customer order Stock control Quality control Invoicing maintenance Master file order order Stock control management management order Supplier Access management Stock control Stock control SupplierSupplier Manufacturing Manufacturing Quality control Access management maintenance management Invoicing management SupplierStock Administrator Master file Master Invoicing file Administrator Supplier order management order control Administrator Quality control Manufacturing Manufacturing interventions interventions maintenance maintenance management management SupplierStock ordercontrol 
interventions Access Stock control Master file control Master file Customer order Customer ordermanagement Quality Quality control management Customer order Access management Supplier order Administrator maintenance maintenance management management Stock control Stock control management Administrator management interventions QualitySupplier control order Quality Invoicing Invoicing Supplier ordercontrol Invoicing Access management interventions Customer order Stock control Stock control Manufacturing Master file Administrator management management Manufacturing Customer order management Access Supplier order Supplier order Master filemanagement maintenance Master interventions file management Invoicing Administrator management management maintenance Access management maintenance Customer order Invoicing Manufacturing interventions Quality control Administrator Quality control management Manufacturing Master file StockCustomer control fileorder interventions Stock control Invoicing Master Access managementStock control maintenance Access management management Supplier order Customer order SupplierManufacturing order maintenance Administrator Quality management control Administrator Invoicing management management Master file Quality control interventions Stock control interventions Manufacturing Invoicing maintenance Stock controlorder Customer Supplier order Customer order Master file Manufacturing Quality control Access management Supplier order management management management maintenance Master file Stock control Administrator management Invoicing Invoicing Quality control maintenance Supplier order interventions Access management Manufacturing Manufacturing Stock control Quality control Stock control management Customer order Access management Administrator Masterorder file Master file Supplier Stock control management Administrator interventions maintenance Customer maintenance management Supplier orderorder Invoicing interventions Access management Customer 
order Quality control management Quality control management Manufacturing Customer Administrator management Stockorder control Invoicing Stock control Access management Master file management interventions Invoicing Supplier order Supplier order Administrator maintenance Interface Customer order Manufacturing management management interventions Quality control Invoicing management Master file Customer order Access management Stock control Manufacturing Invoicing maintenance AccessManufacturing management management Administrator Supplier Master fileorder Quality control Administrator Invoicing interventions management maintenance Master file Stock control Access management interventions Manufacturing Customer order Customer order Quality control maintenance Supplier order Administrator Access management Customer order Master management management Stockfile control Quality control management interventions Administrator management Access management maintenance Invoicing Invoicing Supplier order Stock control Customer order Access management interventions Invoicing Administrator Manufacturing Quality control management Access management Supplier order Access management management Customer Administrator order Access management Manufacturing interventions Stock Master file control Administrator management Administrator Access management Invoicing interventions management Master file Customer order Access management Supplier order maintenance Access management interventions Administrator interventions Customer order Access management Administrator Manufacturing Customer order Invoicing maintenance management Administrator Qualitymanagement control Administrator Customer order interventions Customer order management Administrator interventions MasterStock file control management Manufacturing Quality control Invoicing interventions interventions management Customer order management Invoicing interventions Customer order maintenance Master fileInvoicing Access management 
Stock control Manufacturing Customer order Supplier order Customer order Invoicing management Customer order managementInvoicingMasterQuality control Manufacturing maintenance Administrator Supplier order file management management Manufacturing Invoicingmanagement Manufacturing management Stock control Master file interventions Quality control management maintenance Invoicing Invoicing Invoicing Master file Manufacturing Master Quality file Invoicing Manufacturing Supplier order maintenance Customer order Stock control control Manufacturing Manufacturing maintenance Master file maintenance Manufacturing management Access management Supplier Quality order control management Stock control Master fileMaster fileMaster file Quality control maintenance Quality control Master file maintenance Administrator Invoicing Stock control management Supplier order maintenance maintenance Stock control Quality control Stock control maintenance Quality control interventions Supplier order Manufacturing management Quality control Quality control Supplier order Stock control Supplier order Quality control Stock control Access management Customer orderMaster management file Customer order Stock control Stock control management Supplier order management Stock control Administrator management Supplier order maintenance management Supplier order Supplier order management Supplier interventions Invoicing management Quality controlorder Invoicing management management management Customer orderManufacturing Stock control Customer order management Master fileSupplier order management Invoicing maintenancemanagement Invoicing Manufacturing Quality control Master file Stock control maintenance Supplier order Quality control management Stock control Supplier order management

Invoicing

Stock control

Stock control

Stock control

Invoicing

Stock control Access management Administrator Stock controlmanagement Access interventions Administrator Stock controlmanagement Access Customer order interventions Administrator management Stock control order Customer interventions Invoicing management Customer order Manufacturing Invoicing management Master file Manufacturing Invoicing maintenance Master file Manufacturing Quality control maintenance Master file Stock control Quality control maintenance Supplier order Stock control Quality control management Supplier order Stock control management Supplier order management

Invoicing

Stock control

Stock control

Invoicing

Stock control

Invoicing

Stock control

Stock control

management Invoicing

Stock control

Invoicing

Stock control

Stock control

Stock control

Stock control

Stock control

Stock control Access management Stock control Administrator interventions Stock control Customer order Access management management Stock control Administrator Invoicing Access management interventions Manufacturing Administrator Customer order Invoicing Master file interventions management Quality control maintenance Customer order Invoicing Access management Stock control Quality control management Manufacturing Administrator Stock control Invoicing Master file interventions Supplier order Manufacturing maintenance Customer order Invoicing management Master file Quality control management Quality control maintenance Stock control Invoicing Stock control Quality control Supplier order Manufacturing Stock control management Master file Supplier order Stock control maintenance management Quality control Stock control Stock control Supplier order Stock control management Stock control

Supplier order management

Customer order Customer order Customer order Customer order Customer order Stock Stock Stock control control control Invoicing Customer order Stock control Invoicing Stock control Invoicing Invoicing Customer order Invoicing Customer order Invoicing Invoicing Invoicing Stock control management management Stock control management management Stock control Stock control management Stock control management Stock control management Stock control Stock control management Stock control Stock control Stock control Stock control Stock control Stock control Invoicing Stock control Invoicing Stock control Invoicing Stock control Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Supplier order Invoicing Invoicing Supplier order Invoicing Supplier order Invoicing Invoicing management Invoicing management Invoicing management Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Invoicing Administrator Invoicing Invoicing Administrator Invoicing Administrator interventions Invoicing Administrator interventions Stock control Administrator interventions Customer order Stock control Administrator interventions Customer order Administrator interventions Customer order management Administrator interventions management Customer order interventions Administrator management Customer order interventions management Master file Access management interventions Invoicing management Master file Invoicing maintenance Administrator Access management Invoicing maintenance Stock control Invoicing interventions Stock control Administrator Invoicing Stock control Customer order Invoicing Administrator Stock control Invoicing interventions Stock control management interventions Customer order Stock control Invoicing Invoicing Stock control Invoicing management Invoicing Invoicing Manufacturing Invoicing Invoicing Invoicing Master file Manufacturing Invoicing Invoicing maintenance Master file Invoicing 
Stock control Stock control Quality controlInvoicing maintenance Stock control Invoicingorder Stock control Supplier Stock control Quality control Supplier order Stock control Supplier order management Stock control Stock control management management Supplier order Invoicing Stock control Invoicing Stock control management Invoicing Stock control Invoicing Administrator Invoicing Administrator Invoicing interventions Administrator Invoicing interventions Administrator Invoicing interventions Invoicing Administrator Invoicing interventions Administrator Supplier order interventions Administrator Invoicing interventions management Supplier order Administrator interventions Supplier order management Administrator interventions Supplier order management interventions Administrator Access management Accessmanagement management Administrator interventions Stock management control Administrator Access interventions Administrator Access management interventions Administrator Access management interventions Administrator Access management interventions Administrator Access management interventions Supplier order Access management interventions Access management Supplier order Access management management Administrator Access management management Interface interventions Access management Interface Access management Customer order Interface Access management management Administrator Interface Access management Invoicing interventions Access management Manufacturing Stock control Customer order Access management Master file Customer order Access management management maintenance Customer order Access management management Quality control Customer order Access management management Stock control Access management management Invoicing Access management Supplier order Invoicing Suppliermanagement order Access management Supplier order Access management management Supplier order Access management management Supplier order Access management management Supplier order Access 
management management Access management management Invoicing Access management Invoicing Access management Administrator Invoicing Administrator Invoicing interventions Administrator Invoicing interventions Supplier order Administrator interventions Supplier order management interventions Supplier order Stock control management Supplier order Stock control management Supplier order Stock control management Access management Supplier order Stock control management Administrator Supplier order Stock control management interventions Administrator Supplier order Stock control management Customer order interventions Supplier order Stock control management management Customer order management Invoicing Manufacturing management Invoicing Manufacturing Invoicing Invoicing Master file Manufacturing Master file Access management maintenance Manufacturing maintenance Administrator Invoicing Access Manufacturing Stockmanagement control Quality control interventions Invoicing Access management Administrator Manufacturing Stock control Customer order Invoicing Administrator interventions Manufacturing Administrator Supplier order Access management management interventions Customer order Invoicing interventions Manufacturing management Access management Administrator Invoicing Customer order management Master file Manufacturing Administrator Access management interventions Manufacturing Access management management Interface Administrator Manufacturing maintenance Administrator interventions Customer order Master file Administrator Interface Invoicing interventions Manufacturing Stock control Access management interventions Customer order management Administrator maintenance Access management interventions Manufacturing Invoicing Manufacturing Administrator Customer order management Invoicing interventions Quality control Administrator Customer Administrator Manufacturing Master order file interventions management Invoicing Manufacturing Customer order Stock control Access 
management interventions management interventions Master file maintenance Customer Invoicing Manufacturing Master file management Supplier order Administrator Customer order Interface Invoicing Invoicing maintenance Quality control management Manufacturing Master file maintenance management interventions management Invoicing Invoicing Quality control Stock control Invoicing Master file maintenance Quality control Customer order Interface Manufacturing Invoicing Stock control Supplier order Manufacturing maintenance Quality control Stock control Access management management Invoicing Invoicing Master file Supplier order management Master file Quality control Stock control Supplier order Access management Administrator Interface Invoicing Master file Manufacturing maintenance management maintenance Supplier order Stock control management Administrator interventions Invoicing Invoicing maintenance Master file Quality control Quality control Supplier order management interventions Customer order Master file Manufacturing Invoicing maintenance Stock control Stock control management Customer management Master file Invoicing maintenance Masterorder file Quality control Supplier order Supplier order management Interface Invoicing Master file maintenance maintenance Stock control management management Interface Invoicing maintenance Invoicing Quality control Supplier order Invoicing Invoicing Manufacturing Stock control management AccessAdministrator management interventions Customer order Master Invoicing Manufacturing Masterfile file Supplier order Administrator Invoicing management maintenance Master file maintenance management interventions Master file Invoicing Master file Invoicing maintenance Quality control Customer order Supplier order Master maintenance maintenance Quality control Stock file control Accessmanagement management Supplier order management Master file maintenance Stock Stockcontrol control Supplier order Administrator Invoicing Supplier order 
management maintenance Supplier order management interventions Supplier order Manufacturing management Access management management Customer order Supplier order management Master file Administrator management Supplier order management maintenance interventions Invoicing Supplier order Stock control management Quality control Administrator Customer order Manufacturing management Stock control Administrator interventions management Stock control Master fileorder Supplier Administrator interventions Stock control Invoicing Administrator Stock control maintenance management Administrator interventions Stock control Administrator Administrator Manufacturing interventions Quality control interventions Invoicing interventions interventions Master file Stockfile control Invoicing Master Administrator Supplier order maintenance Supplier Master file order maintenance Administrator Stock control interventions management Quality control management maintenance Stock control Administrator Stock control interventions Stock control Stock control Administrator Stock control interventions Supplier Master file order Administrator interventions Administrator Master file management maintenance interventions Master file interventions maintenance Administrator Master file Stock control maintenance Stock control Master file interventions maintenance Access management Stockcontrol control maintenance Stock Stock control Administrator Administrator Stock control interventions interventions Stock control Administrator Stockcontrol control Customer order Stock Administrator Stock control interventions management Stock control interventions Invoicing Quality control Stock control Manufacturing Stock control Access management Quality control Master file Stock control Administrator Quality control maintenance Interface Quality control interventions Interface Quality control Quality control Customer order Interface Stock control Quality control Access management Interface management Quality 
control Supplier order Interface Administrator Invoicing Quality control management Interface interventions Supplier order Manufacturing Invoicing Access management Supplier order Quality control Access management Customer order management Master file Quality control Administrator management Administrator management Invoicing maintenance Supplier order interventions Quality control Invoicing interventions Invoicing Quality Master file management Quality controlcontrol Customer order Master file Customer order Stock control Manufacturing maintenance management Master file Quality control maintenance management Master file Supplier order Invoicing Quality control maintenance Invoicing Invoicing maintenance management Manufacturing Invoicing Master file Manufacturing Quality Master file control Master file maintenance Supplier order Master file control Stock maintenance Supplier order maintenance Supplier orderSupplier order management maintenance Supplier order Quality control management Supplier order management Supplier order Quality management management Stockcontrol control management Supplier order Stock control management Stock control Supplier order Stock control Supplier order management Supplier order Stock controlSupplier order management Access management management Supplier order Stock control management management Administrator Stock control management Stock control Stock control interventions Stock Stock control control Stock control Customer order Stock control Stock control Administrator Stock control management Administrator Stock control interventions Master file Stock control Invoicing interventions Master file Stock control maintenance Manufacturing maintenance Stock control Stock control Stock control StockMaster controlfile Stock control Stock control Stock control maintenance Stock control Stock control Stock control Stock control Stock controlQuality control Access management Stock control Stock control Stock control Stock control Stock 
control Administrator Access management Stock control Stock control Supplier order Stock control Stock control interventionsAdministrator Stock control Stock control Stock control management Stock control Stock control Customer order interventions Stock control Stock control Stock control Stock controlmanagement Stock control Stock control order Customer order Stock control Supplier Stock control Invoicing management Stock control Stock control Invoicing management Stock control Access management Manufacturing Stock control Interface Master file Invoicing Stock control Master filefile Administrator Administrator Stock control Master file Master Administrator Invoicing maintenance Supplier order maintenance interventions Supplier order Stock control interventions maintenance maintenance interventions Supplier order Stock control Manufacturing Supplier order Stock control Administrator management Supplier order Stock control Administrator management Customer order Supplier order Quality control Master management filemanagement Administrator interventions Stock control Stock control interventions management management Stock control Stock control maintenance management Stock control interventions Stock control Invoicing Supplier order Stock control Invoicing Supplier order Stock control Invoicing Supplier order Invoicing Stock control Supplier order Supplier order control Supplier order Stock control Supplier order Quality control Supplier order Stock control Invoicing Stock Supplier order Supplier order Stock control Supplier order Supplier order Supplier order Stock control management Stock control Stock control management Stock control management Stock control Manufacturing Stock control management Stock control Stock Stock control control managementStock control management management management management management management management management Master file Supplier order maintenance management Quality control Stock control Supplier order Stock control Stock 
control management

Administrator interventions

Invoicing

Stock control Supplier order management

Supplier order management

Stock control

Access management

Invoicing

Supplier order management

Stock control

Access management

Supplier order management Supplier order management

Stock control

Administrator interventions

Invoicing

Access management Stock control

Administrator interventions

Invoicing

Supplier order management Supplier order management

Access management

Access management Administrator Access management interventions Administrator Customer order interventions management Customer order Invoicing management Manufacturing Invoicing Master file Manufacturing maintenance Master file Quality control maintenance Stock control Quality control Supplier order Supplier orderStock control management managementSupplier order management

Quality control Quality control Quality control

Quality control

Stock control Supplier order management

Administrator Supplier order interventions management

Supplier order management

Administrator interventions

Invoicing

Invoicing

Access management

Stock control

Access management

Invoicing Supplier order management

Stock control

Stock control

Stock control

Stock control

Stock control

Supplier order management

Supplier order management

Supplier order management

Stock control

Supplier order management

Stock control

Stock control Stock control Stock control Stock control Stock control Stock control Stock control Stock control

Stock control Access management Administrator interventions Stock control Customer order management Invoicing Manufacturing Supplier order Master file management maintenance Quality control Quality control Stock control Stock control Supplier order management Supplier order management Stock control

Stock control

Stock control

Stock control

Stock control

Access management Administrator interventions Customer order management Interface Invoicing Manufacturing Master file maintenance Quality control Stock control Supplier order management

[Figure A1: call graph of the system's programs arranged in circular community groups. Node labels and legend entries name the high level features: Access management, Administrator interventions, Customer order management, Interface, Invoicing, Manufacturing, Master file maintenance, Quality control, Stock control, Supplier order management.]

Fig. A1. Matching communities with high level features. Each node represents a program of the system, while its color marks its feature affiliation. See the node text for cases of more than one feature assigned. The edges are the calls the programs make while the circular groups represent their assigned communities. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)


A. Kicsi, V. Csuvik and L. Vidács et al. / The Journal of Systems and Software 155 (2019) 70–90

Appendix B. Matching of communities with 2nd level features

Communities and matching second level features are shown in Fig. B1.

[Figure B1: call graph of the system's programs arranged in circular community groups. Node labels and legend entries are the numeric identifiers of the second level features.]

Fig. B1. Matching communities with second level features. Each node represents a program of the system, while its color marks its feature affiliation. See the node text for cases of more than one feature assigned. The edges are the calls the programs make while the circular groups represent their assigned communities. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
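The matching that the figure visualizes can be approximated in a few lines. The sketch below is only an illustration under stated assumptions: the program names, the feature numbers, and the function `match_communities_to_features` are hypothetical, and the "dominant feature by share of member programs" rule is a simple stand-in rather than the procedure used in the article. It takes a community assignment and a feature assignment per program as given and does not use the call edges.

```python
from collections import Counter, defaultdict


def match_communities_to_features(community_of, features_of):
    """Return {community: (dominant_feature, share)}.

    community_of: program name -> community id (assumed to be precomputed).
    features_of:  program name -> set of feature ids (a program may have several).
    share is the fraction of the community's programs carrying the dominant feature.
    """
    feature_counts = defaultdict(Counter)  # community -> feature -> program count
    sizes = Counter()                      # community -> number of programs

    for program, community in community_of.items():
        sizes[community] += 1
        for feature in features_of.get(program, ()):
            feature_counts[community][feature] += 1

    matches = {}
    for community, size in sizes.items():
        counts = feature_counts[community]
        if counts:
            feature, hits = counts.most_common(1)[0]
            matches[community] = (feature, hits / size)
        else:
            # No program in this community has an assigned feature.
            matches[community] = (None, 0.0)
    return matches


if __name__ == "__main__":
    # Hypothetical toy data: five programs, two communities.
    community_of = {"P1": 0, "P2": 0, "P3": 0, "P4": 1, "P5": 1}
    features_of = {"P1": {25}, "P2": {25, 23}, "P3": {18}, "P4": {47}, "P5": {47}}
    for community, (feature, share) in match_communities_to_features(
        community_of, features_of
    ).items():
        print(f"community {community}: feature {feature}, share {share:.2f}")
```

A community whose share is close to 1 would appear in the figure as a circular group of a single color, while a low share corresponds to a mixed-color group.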


References Al-Hajjaji, M., Krieter, S., Thüm, T., Lochau, M., Saake, G., 2016. IncLing: efficient product-line testing using incremental pairwise sampling. In: Proceedings of the 2016 ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences - GPCE 2016. ACM Press, New York, New York, USA, pp. 144–155. Al-Hajjaji, M., Thüm, T., Meinicke, J., Lochau, M., Saake, G., 2014. Similarity-based prioritization in software product-line testing. In: Proceedings of the 18th International Software Product Line Conference on - SPLC ’14. ACM Press, New York, New York, USA, pp. 197–206. Al-msie, R., Seriai, a.D., Huchard, M., Urtado, C., 2012. An Approach to Recover Feature Models from Object-Oriented Source Code. Technical Report. Al-Msie’Deen, R., Seriai, A., Huchard, M., Urtado, C., Vauttier, S., Salman, H.E., 2013. Feature location in a collection of software product variants using formal concept analysis. In: ICSR: International Conference on Software Reuse. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7925. Springer, Berlin, Heidelberg, pp. 302–307. Al-msie’deen, R., Seriai, A.-D., Huchard, M., Urtado, C., Vauttier, S., 2013. Mining features from the object-oriented source code of software variants by combining lexical and structural similarity. In: 2013 IEEE 14th International Conference on Information Reuse & Integration (IRI). IEEE, pp. 586–593. Al-Msie’Deen, R., Seriai, A.D., Huchard, M., Urtado, C., Vauttier, S., 2013. Mining features from the object-oriented source code of software variants by combining lexical and structural similarity. In: Proceedings of the 2013 IEEE 14th International Conference on Information Reuse and Integration, IEEE IRI 2013. IEEE, pp. 586–593. Assunção, W.K.G., Lopez-Herrejon, R.E., Linsbauer, L., Vergilio, S.R., Egyed, A., 2017. 
Multi-objective reverse engineering of variability-safe feature models based on code dependencies of system variants. Empir. Softw. Eng. 22 (4), 1763–1794. Assunção, W.K.G., Vergilio, S.R., 2014. Feature location for software product line migration. In: Proceedings of the 18th International Software Product Line Conference on Companion Volume for Workshops, Demonstrations and Tools - SPLC ’14. ACM Press, New York, New York, USA, pp. 52–59. Bagheri, E., Ensan, F., Gasevic, D., 2012. Decision support for the software product line domain engineering lifecycle. Autom. Softw. Eng. 19 (3), 335–377. Ballarin, M., Lapeña, R., Cetina, C., 2016. Leveraging feature location to extract the clone-and-own relationships of a family of software products. In: Proceedings of the 15th International Conference on Software Reuse: Bridging with Social-Awareness - Volume 9679. Springer-Verlag New York, Inc., pp. 215–230. Baresi, L., Quinton, C., 2015. Dynamically evolving the structural variability of dynamic software product lines. 10th International Symposium on Software Engineering for Adaptive and Self-Managing Systems. Bashari, M., Bagheri, E., Du, W., 2017. Dynamic software product line engineering: a reference framework. Int. J. Softw. Eng. Knowl. Eng. 27 (02), 191–234. Bencomo, N., Lee, J., Hallsteinsen, S., 2010. How dynamic is your Dynamic Software Product Line? In: DiVA Project (EU FP7 STREP), pp. 61–67. Blondel, V.D., Guillaume, J.-L., Lambiotte, R., Lefebvre, E., 2008. Fast unfolding of communities in large networks. J. Stat. Mech. 2008 (10), P10 0 0. Capilla, R., Bosch, J., Trinidad, P., Ruiz-Cortés, A., Hinchey, M., 2014. An overview of Dynamic Software Product Line architectures and techniques: observations from research and industry. J. Syst. Softw. 91 (1), 3–23. Catal, C., Cagatay, 2009. Barriers to the adoption of software product line engineering. ACM SIGSOFT Softw. Eng. Notes 34 (6), 1. 
Classen, A., Hubaux, A., Sanen, F., Truyen, E., Vallejos, J., Costanza, P., De Meuter, W., Heymans, P., Joosen, W., 2008. Modelling variability in self-adaptive systems: towards a research agenda. In: Proceedings of International Workshop on Modularization, Composition and Generative Techniques for Product-Line Engineering, 1, pp. 19–26.
Clements, P., Krueger, C., 2002. Eliminating the adoption barrier. IEEE Softw. 19, 29–31.
Clements, P., Northrop, L., 2001. Software Product Lines: Practices and Patterns. Addison-Wesley Professional.
Clements, P.C., Jones, L.G., McGregor, J.D., Northrop, L.M., 2006. Getting there from here: a roadmap for software product line adoption. Commun. ACM 49 (12), 33.
Deerwester, S.C., Dumais, S.T., Landauer, T.K., Furnas, G.W., Harshman, R.A., 1990. Indexing by latent semantic analysis. J. Am. Soc. Inf. Sci. 41 (6), 391–407.
Eyal-Salman, H., Seriai, A.D., Dony, C., 2013. Feature-to-code traceability in a collection of software variants: combining formal concept analysis and information retrieval. In: Proceedings of the 2013 IEEE 14th International Conference on Information Reuse and Integration, IEEE IRI 2013, pp. 209–216.
Eyal-Salman, H., Seriai, A.-D., Dony, C., Al-msie’deen, R., 2012. Recovering traceability links between feature models and source code of product variants. In: Proceedings of the VARiability for You Workshop on Variability Modeling Made Useful for Everyone - VARY ’12. ACM Press, New York, New York, USA, pp. 21–25.
Falessi, D., Cantone, G., Canfora, G., 2010. A comprehensive characterization of NLP techniques for identifying equivalent requirements. In: Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement - ESEM ’10. ACM Press, New York, New York, USA, p. 1.
Fischer, S., Linsbauer, L., Lopez-Herrejon, R.E., Egyed, A., 2014. Enhancing clone-and-own with systematic reuse for developing software variants. In: 2014 IEEE International Conference on Software Maintenance and Evolution. IEEE, pp. 391–400.
Fortunato, S., 2010. Community detection in graphs. Phys. Rep. 486 (3), 75–174.

Perrouin, G., Sen, S., Klein, J., Baudry, B., le Traon, Y., 2010. Automatic and scalable t-wise test case generation strategies for software product lines. In: Proc. of Intl. Conf. on Software Testing, 215483.
Gamez, N., Fuentes, L., Troya, J.M., 2015. Creating self-adapting mobile systems with dynamic software product lines. IEEE Softw. 32 (2), 105–112.
Hamilton, J., Danicic, S., 2012. Dependence communities in source code. In: Software Maintenance (ICSM), 2012 28th IEEE International Conference on. IEEE, pp. 579–582.
Harrison, J.V., Lim, W.M., 1998. Automated reverse engineering of legacy 4GL information system applications using the ITOC workbench. In: 10th International Conference on Advanced Information Systems Engineering. Springer-Verlag, pp. 41–57.
Haslinger, E.N., Lopez-Herrejon, R.E., Egyed, A., 2011. Reverse engineering feature models from programs’ feature sets. In: 18th Working Conference on Reverse Engineering. IEEE, pp. 308–312.
Henard, C., Papadakis, M., Perrouin, G., Klein, J., Heymans, P., Le Traon, Y., 2014. Bypassing the combinatorial explosion: using similarity to generate and prioritize t-wise test configurations for software product lines. IEEE Trans. Softw. Eng. 40 (7), 650–670.
Hill, E., Pollock, L., Vijay-Shanker, K., 2007. Exploring the neighborhood with Dora to expedite software maintenance. In: Proceedings of the Twenty-Second IEEE/ACM International Conference on Automated Software Engineering - ASE ’07. ACM Press, New York, New York, USA, p. 14.
Hinchey, M., Park, S., Schmid, K., 2012. Building dynamic software product lines. IEEE Comput. Soc. 45 (10), 22–26.
Horcas, J.-M., Pinto, M., Fuentes, L., 2014. Runtime Enforcement of Dynamic Security Policies. Springer, Cham, pp. 340–356.
Kästner, C., Dreiling, A., Ostermann, K., 2014. Variability mining: consistent semi-automatic detection of product-line features. IEEE Trans. Softw. Eng. 40 (1), 67–82.
Kelly, M.B., Alexander, J.S., Adams, B., Hassan, A.E., 2011. Recovering a Balanced Overview of Topics in a Software Domain.
Kicsi, A., Csuvik, V., Vidács, L., Beszédes, Á., Gyimóthy, T., 2018. Feature level complexity and coupling analysis in 4GL systems. In: ICCSA 2018: Computational Science and Its Applications – ICCSA 2018. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 10964. Springer, Cham, pp. 438–453.
Kicsi, A., Vidács, L., Beszédes, A., Kocsis, F., Kovács, I., 2017. Information retrieval based feature analysis for product line adoption in 4GL systems. In: Proceedings of the 17th International Conference on Computational Science and Its Applications – ICCSA 2017. IEEE, pp. 1–6.
Klatt, B., Küster, M., 2013. A Graph-Based Analysis Concept to Derive a Variation Point Design from Product Copies. Technical Report.
Krueger, C., 2002. Easing the Transition to Software Mass Customization. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 282–293.
Krüger, J., Fenske, W., Meinicke, J., Leich, T., Saake, G., 2016. Extracting software product lines: a cost estimation perspective. In: Proceedings of the 20th International Systems and Software Product Line Conference on - SPLC ’16. ACM Press, New York, New York, USA, pp. 354–361.
Lee, J., 2006. A feature-oriented approach to developing dynamically reconfigurable products in product line engineering. In: 10th International Software Product Line Conference, pp. 131–140.
Lee, K., Kang, K.C., Lee, J., 2002. Concepts and guidelines of feature modeling for product line software engineering. In: Software Reuse Methods Techniques and Tools, vol. 2319. Springer, Berlin, Heidelberg, pp. 62–77.
Lima, C., Chavez, C., de Almeida, E.S., 2017. Investigating the Recovery of Product Line Architectures: An Approach Proposal. Springer, Cham, pp. 201–207.
Lohar, S., Amornborvornwong, S., Zisman, A., Cleland-Huang, J., 2013.
Improving trace accuracy through data-driven configuration and composition of tracing features. In: Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering - ESEC/FSE 2013. ACM Press, New York, New York, USA, p. 378.
M2J Software LLC, Homepage of M2J. http://www.magic2java.com (last visited May 2017).
Magic Software Enterprises Ltd., Magic Software Enterprises. http://www.magicsoftware.com (last visited May 2017).
Maia, M.A., Sobreira, V., Paixão, K., Amo, S.A., Silva, I.R., 2008. Using a Sequence Alignment Algorithm to Identify Commonalities and Variabilities from Execution Traces. Technical Report.
Marcus, A., Maletic, J., 2003. Recovering documentation-to-source-code traceability links using latent semantic indexing. In: 25th International Conference on Software Engineering, 2003. Proceedings. IEEE, pp. 125–135.
Marcus, A., Sergeyev, A., Rajlich, V., Maletic, J.I., 2004. An information retrieval approach to concept location in source code. In: Proceedings - Working Conference on Reverse Engineering, WCRE. IEEE Comput. Soc, pp. 214–223.
Mitchell, B.S., Mancoridis, S., 2006. On the automatic modularization of software systems using the bunch tool. IEEE Trans. Softw. Eng. 32 (3), 193–208.
Nagy, C., Vidács, L., Ferenc, R., Gyimóthy, T., Kocsis, F., Kovács, I., 2010. MAGISTER: quality assurance of magic applications for software developers and end users. In: 26th IEEE International Conference on Software Maintenance. IEEE Computer Society, pp. 1–6.
Nagy, C., Vidács, L., Ferenc, R., Gyimóthy, T., Kocsis, F., Kovács, I., 2011a. Complexity measures in 4GL environment. In: Computational Science and Its Applications - ICCSA 2011, Lecture Notes in Computer Science. Springer Berlin / Heidelberg, pp. 293–309.
Nagy, C., Vidács, L., Ferenc, R., Gyimóthy, T., Kocsis, F., Kovács, I., 2011b. Solutions for reverse engineering 4GL applications, recovering the design of a logistical wholesale system. In: Proceedings of CSMR 2011 (15th European Conference on Software Maintenance and Reengineering). IEEE Computer Society, pp. 343–346.
Ocean Software Solutions, Homepage of Magic Optimizer. http://www.magic-optimizer.com (last visited May 2017).
Olszak, A., Jørgensen, B.N., 2009. Remodularizing Java programs for comprehension of features. In: Proceedings of the First International Workshop on Feature-Oriented Software Development FOSD 09, pp. 19–26.
Paškevičius, P., Damaševičius, R., Karčiauskas, E., Marcinkevičius, R., 2012. Automatic extraction of features and generation of feature models from Java programs. Inf. Technol. Control 41 (4), 376–384.
Peng, X., Xing, Z., Tan, X., Yu, Y., Zhao, W., 2013. Improving feature location using structural similarity and iterative graph mapping. J. Syst. Softw. 86, 664–676.
Poshyvanyk, D., Guéhéneuc, Y.G., Marcus, A., Antoniol, G., Rajlich, V., 2006. Combining probabilistic ranking and latent semantic indexing for feature identification. In: IEEE International Conference on Program Comprehension. IEEE, pp. 137–146.
Poshyvanyk, D., Marcus, A., 2007. Combining formal concept analysis with information retrieval for concept location in source code. In: IEEE International Conference on Program Comprehension. IEEE, pp. 37–46. doi:10.1017/S0024282913000583.
She, S., Lotufo, R., Berger, T., Wąsowski, A., Czarnecki, K., 2011. Reverse engineering feature models. In: Proceeding of the 33rd international conference on Software engineering - ICSE ’11. ACM Press, New York, New York, USA, p. 461.
Siegmund, N., Rosenmüller, M., Kuhlemann, M., Kästner, C., Apel, S., Saake, G., 2012. SPL Conqueror: toward optimization of non-functional properties in software product lines. Softw. Qual. J. 20 (3–4), 487–517.
Thüm, T., Apel, S., Kästner, C., Schaefer, I., Saake, G., 2014. A classification and survey of analysis strategies for software product lines. ACM Comput. Surv. 47 (1), 1–45.
Uchôa, A.G., Bezerra, C.I.M., Machado, I.C., Monteiro, J.M., Andrade, R.M.C., 2017. ReMINDER: An Approach to Modeling Non-Functional Properties in Dynamic Software Product Lines. Springer, Cham, pp. 65–73.
Ullah, M.I., Ruhe, G., Garousi, V., 2010. Decision support for moving from a single product to a product portfolio in evolving software systems. J. Syst. Softw. 83, 2496–2512.
Valente, M.T., Borges, V., Passos, L., 2012. A semi-automatic approach for extracting software product lines. IEEE Trans. Softw. Eng. 38 (4), 737–754.
Valincius, K., Stuikys, V., Damasevicius, R., 2013. Understanding of e-commerce IS through feature models and their metrics. In: Proceedings of the IADIS International Conference Information Systems 2013, IS 2013, 8, pp. 55–62.
Verner, J., Tate, G., 1988. Estimating size and effort in fourth-generation development. IEEE Softw. 5, 15–22.
Wei Zhao, Lu Zhang, Yin Liu, Jiasu Sun, Fuqing Yang, 2004. SNIAFL: towards a static non-interactive approach to feature location. In: Proceedings. 26th International Conference on Software Engineering. IEEE Comput. Soc, pp. 293–303. doi:10.1109/ICSE.2004.1317452.
Witting, G., Finnie, G., 1994. Using artificial neural networks and function points to estimate 4GL software development effort. Austr. J. Inf. Syst. 1 (2), 87–94.
Xue, Y., Xing, Z., Jarzabek, S., 2010. Understanding feature evolution in a family of product variants. In: Proceedings - Working Conference on Reverse Engineering, WCRE, pp. 109–118.
Xue, Y., Xing, Z., Jarzabek, S., 2012. Feature location in a collection of product variants. In: Proceedings - Working Conference on Reverse Engineering, WCRE, pp. 145–154.

András Kicsi is a Ph.D. student of computer science at the University of Szeged. He has authored multiple papers on software product line adoption and information retrieval; his main research interests lie in software engineering and artificial intelligence. His current research topics also involve natural language processing, machine learning and the aiding of test-to-code traceability. He also works for the University of Szeged as a software developer, where he takes part in industrial R&D projects.

Viktor Csuvik received his BSc degree in computer science from the University of Szeged, where he is currently pursuing his master's studies. His main research interests lie in software metrics and software product lines. He is also interested in the topics of data visualization and automatic software repair. Besides his academic activities, Csuvik also works as a software developer at the Department of Software Engineering of the University of Szeged.

Dr. László Vidács is a senior researcher and deputy head of the MTA-SZTE Research Group on Artificial Intelligence. His research is strongly connected to the Department of Software Engineering, University of Szeged. He received his Ph.D. in 2010 from the University of Szeged. He continuously takes part in R&D&I projects based on academia/industry collaborations with leading national and international companies. He has recently been working at the intersection of artificial intelligence and software engineering. He has published over 35 scientific papers, including 2 award-winning publications.

Ferenc Horváth is a Ph.D. student at the Doctoral School of Computer Science, University of Szeged. His main research topics are the analysis and the improvement of software testing processes. He is also interested in fault detection and fault localization, usability testing and software metrics in general. He is a co-author of several publications on these topics. Besides his academic activities, Horváth works as a software developer in R&D projects of the university.

Dr. Árpád Beszédes obtained his Ph.D. degree in computer science from the University of Szeged in 2005. His active research area is static and dynamic program analysis including program slicing, but he is involved in other topics as well, such as compiler optimization, software testing, reverse engineering and software quality assurance. He actively participates in several R&D projects at the university, often as a leader. Dr. Beszédes has over 60 publications in the mentioned fields, and is regularly invited to serve in the Program Committees of various software engineering conferences, and as a reviewer for software engineering and computer science journals. He was the Program Co-chair of the SCAM’11 conference. Dr. Beszédes is a founding member of the Hungarian Testing Board, the official member of the International Software Testing Qualifications Board.

Professor Tibor Gyimóthy is the head of the Software Engineering Department at the University of Szeged in Hungary. His areas of research interest include software evolution, software maintenance and compiler optimization. He was the leader of more than 50 software engineering R&D projects. He has published over 100 articles in refereed conferences and journals. He has served on over 80 program committees, including those of the ICSE (4 times), ICSM, CC, CSMR, COMPSAC. He was the conference chair of ESEC/FSE in 2011. He was the chair of the IEEE Computer Society Central and Eastern European Initiatives Committee.

Ferenc Kocsis received his BSc in computer science at the University of Szeged in 1982. Until 1994 he was a researcher at the MTA Research Group on Theory of Automata. His main research interests are formal languages, program analyzers and compilers. Until 2001 he worked at an American-Hungarian company developing pharmaceutical systems. His work involved the control of endotoxin measuring appliances and evaluation of gathered data. Since 2001 he has been an employee of SZEGED Software Ltd. The profile of the company is the development of logistics software for wholesaler clients, mainly in the pharmaceutical industry. For the last ten years, his work has actively involved the management of R&D projects.