Engineering Applications of Artificial Intelligence 59 (2017) 103–121
Optimization of rule-based systems in mHealth applications
Aniello Minutolo⁎, Massimo Esposito, Giuseppe De Pietro
National Research Council of Italy - Institute for High Performance Computing and Networking (ICAR), Via P. Castellino 111, 80131 Naples, Italy
ARTICLE INFO
Keywords: Rule-based systems; Ontologies; Optimization; mHealth; Reasoning
ABSTRACT
mHealth applications are becoming increasingly advanced, delivering innovative health services that improve the individual's comfort, enhance the quality of life, promote wellness and healthy lifestyles, or improve the adherence to therapy of remotely monitored patients. One of the most relevant components of such applications is the rule-based system, which can both reproduce deductive reasoning mechanisms and explain how its outcomes have been achieved. Unfortunately, the efficiency of rule-based systems, especially on resource-limited mobile devices, rapidly decreases with the amount of data satisfying their rules as well as with the size and complexity of the whole rule base. Starting from these considerations, this paper proposes an optimization approach that revises the structure of the ontologies, and of the rules built on top of them, contained in a rule-based system, with the goal of reducing the cost of evaluating all its rules by operating directly at the knowledge level. A general cost model is also presented to estimate the cost of the research and identification of the rule instances available for execution. This model is used to assess the impact and benefits of applying the proposed approach to a case study concerning an mHealth app devised to evaluate the eating habits of users in order to keep their lifestyles under control and, thus, preserve their wellness. This theoretical evaluation is then transposed to a practical scenario, where the rule-based system embedded in the considered mHealth app is evaluated on a real smartphone in terms of memory usage and overall response time. Moreover, a further study has been arranged to evaluate the impact of different rule conditions on the cost of evaluating a knowledge base, and the possible benefits drawn from their optimization. All the evaluation results show that the proposed approach offers an innovative and efficient solution to drastically reduce the cost of evaluating the rule instances to execute and, thus, to build mHealth apps able to meet both real-time performance and computation-intensive demands.
1. Introduction
Rule-based systems are able to reproduce deductive reasoning mechanisms and to explain how their outcomes have been achieved, by means of logic production rules made of a conjunction of conditions to verify and a set of actions to execute. Thanks to these capabilities, they have been profitably used in medical settings to encode the knowledge underlying the therapies and/or treatments to follow, thus enabling the development of a new generation of healthcare applications able to continuously support individuals everywhere and at any time. In fact, the growing penetration of mobile devices, coupled with telecommunication infrastructures, has deeply influenced the delivery of healthcare services (Malvey and Slovensky, 2014) and defined wider horizons for health through mobile technologies (WHO, 2011). The cheap and widespread availability of mobile phones and wearable devices has enabled the development of new mobile health (mHealth) systems, deployed on smartphones provided to individuals, able to deliver innovative health services that improve the individual's comfort, enhance the quality of life (Akter et al., 2013), promote wellness and healthy lifestyles (Knight et al., 2014), or improve the adherence to therapy of remotely monitored patients (Hamine et al., 2015). In this context, the authors worked on the design and development of mobile rule-based components in the Italian project “Smart Health 2.0”, a research and development (R&D) project in which several mHealth applications were designed to support healthy users in preserving their wellness and keeping it under control. The final goal was to provide individuals with tailored recommendations about their lifestyle with respect to the estimated risk of contracting a disease. Several deductive health recommendations were identified in accordance with the specific domain of interest on which each mHealth application was focused. In order to achieve the project goals, a hybrid approach was used to model and reason on recommendations about healthy lifestyle and
⁎ Corresponding author. E-mail address: [email protected] (A. Minutolo).
http://dx.doi.org/10.1016/j.engappai.2016.12.007 Received 24 November 2015; Received in revised form 26 July 2016; Accepted 5 December 2016 0952-1976/ © 2016 Elsevier Ltd. All rights reserved.
wellness, consisting in the representation of declarative knowledge (i.e. the structure of the domain knowledge) in the form of ontology-based models, so as to facilitate data integration between the heterogeneous data sources characterizing the scenario of interest, and in the representation of procedural knowledge (i.e. the knowledge about the decision-making process) as a set of if-then rules built on top of such models.
Unfortunately, the efficiency of rule-based systems rapidly decreases when they are deployed on resource-limited mobile devices. Indeed, since the cost of research and identification of rules to execute, which is the most computation-intensive and time-consuming task of the whole reasoning process (Forgy, 1979), is directly influenced by the square of the number of assertions satisfying a set of rule conditions, a large amount of data has to be handled and processed as the size and complexity of the rule base grow. Moreover, the memory required to maintain the generated reasoning results also increases dramatically, limiting the overall efficiency on mobile devices.
Several optimization approaches have been proposed in the literature for reducing the cost of research and identification of rules to execute in rule-based systems. Most of them have been conceived for desktop-oriented applications and are based on the optimization of the pattern-matching algorithm used to repeatedly compare available assertions with the conditions of the rules. The most famous pattern-matching algorithm is RETE (Forgy, 1982), which reduces the number of comparisons needed to evaluate the satisfaction of a rule by maintaining a cache of intermediate results in memory. More recently, the authors have proposed a mobile pattern-matching algorithm, based on a lazy reasoning approach (Miranker, 1990; Weert, 2010), specifically designed to be efficiently embedded in mobile devices by granting lower memory requirements and real-time responsiveness (Minutolo et al., 2015).
However, optimizing the pattern-matching algorithm may not be enough to enable an efficient reasoning procedure on mobile devices, in particular when complex and highly structured knowledge bases are involved. In this respect, in the project “Smart Health 2.0”, the authors observed that the way the knowledge was formalized also affected the cost of research and identification of rules to execute and, accordingly, computational resources and memory consumption. In fact, formalizing known assertions with ontology-models leads to a structured knowledge base describing many rich semantic relations among data. Production rules built on top of such models can contain many conditions in order to increase rule interpretability, i.e. by mentioning and evaluating the different properties characterizing the semantic data of interest. However, the more intelligible a rule becomes, with many conditions describing the existing relations among the assertions of interest, the more the cost required for processing and evaluating the rule increases. Moreover, since ontology-models are usually composed of hierarchically structured classes and properties, when rule conditions operate on the roots of such hierarchies, the number of assertions satisfying them can grow rapidly. For instance, when a rule condition is aimed at verifying that a generic individual is an instance of a class, it will be satisfied by all individuals that are instances of that class and, possibly, by all individuals that are instances of the classes inheriting from it. Thus, rule interpretability may often reduce the efficiency of rule evaluation, since weakly restrictive conditions increase both the number of results matching them and the number of intermediate results to process when reasoning tests are evaluated.
For this reason, when the focus is the optimization of performance, the interpretability of rules should be traded for more efficient rules, so as to reduce the number of reasoning tests to evaluate at each reasoning cycle and, thus, the amount of computational and memory resources required. Starting from these considerations, this paper proposes an optimization approach aimed at revising the structure of the knowledge base of a rule-based system, with the goal of reducing the cost of evaluation for all its rules.
In detail, a set of optimization procedures has been defined to operate in a pre-processing phase and revise the way ontology-models and production rules are formalized. Since the way the knowledge is formalized can drastically change the cost of research and identification of rules to execute, these procedures have been devised to reduce the data involved in the evaluation of rules and to manage intermediate reasoning results directly at the knowledge level, without introducing an excessive overhead due to ad-hoc and complex memory structures. Even though the need to optimize the way the knowledge is formalized arose from operating in mHealth scenarios, the procedures have been formulated on a general basis, so that they can also be applied to any scenario characterized by complex production rules built on top of highly structured ontology-models. Indeed, these procedures are based on three wide-ranging criteria: (i) diminishing the portion of the knowledge base on which rules operate can highly reduce the amount of data involved in the research and identification of rules to execute; (ii) maintaining the intermediate results matching pairs of conditions, when a relation exists among them, can avoid re-inspecting combinations of data already classified as invalid for a given pair; (iii) changing the evaluation order of the rule conditions can highly influence the cost of the research and identification of rules to execute. A general cost model has also been proposed to estimate the cost of the research and identification of the rule instances available for execution. This model has been used to evaluate the impact and benefits of applying the proposed optimization procedures to a case study concerning an mHealth app developed in the context of the Italian project “Smart Health 2.0” and devised to evaluate the eating habits of users in order to keep their lifestyles under control and, thus, preserve their wellness.
Finally, this theoretical evaluation has also been transposed to a practical scenario, where the rule-based system embedded in the considered mHealth app has been evaluated on a real smartphone in terms of memory usage and overall response time. The remainder of the paper is structured as follows. Section 2 introduces some preliminary notions about rule-based systems, reports an overview of the state-of-the-art solutions for their optimization and introduces a general cost model to estimate the cost of the research and identification of the rule instances available for execution. Section 3 presents the proposed optimization approach. Section 4 describes a theoretical and practical evaluation of the approach on a case study represented by an mHealth app devised to evaluate the eating habits of users. Section 5 reports a further performance evaluation arranged according to Taguchi's experimental design to investigate the general applicability of the proposed approach. Finally, Section 6 concludes the work.
2. Background and preliminaries
Rule-based systems are typically based on a rule engine in charge of repeatedly comparing the conditions of the rules with a set of assertions (known facts), stored in the Working Memory (WM), which describes the current state of the system during the reasoning process. The flow of execution of a rule-based system consists in the research (match phase), identification (select phase) and execution (act phase) of available rule instances, also referred to as rule activations (Forgy, 1982; Miranker, 1987). Each rule activation consists of a pair made of a rule and the set of available WM elements (WMEs), representing the current system state, which satisfies the conditions of that rule (Brant et al., 1993). When a rule activation is identified, the corresponding actions of the considered rule are produced, the WM is consequently updated, and the rule engine is restarted. More formally, denoting with R the set of all production rules stored in a rule base (RB), a rule r ∈ R is defined as the pair r = (LHSr, RHSr), where LHSr and RHSr are respectively the left-hand side and the right-hand side of the rule. Both LHSr and RHSr are lists of rule atoms, interconnected by means of logical operators. Three types of rule atoms are commonly admitted as conditions in
the LHS: positive pattern (pP) and negative pattern (nP), for evaluating respectively the presence and absence of WMEs matching a given condition (pattern); and function call (FC), for invoking internal procedures able to evaluate logical conditions or to compute arithmetic expressions. In contrast, two types of rule atoms are commonly admitted as actions in the RHS, i.e. pP and FC atoms, in order to assert and retract facts in the WM. Rule atoms can contain both constant and variable values, with references to variables indicated using the common convention of prefixing them with a question mark (e.g., ?x). For instance, using Apache Jena (Carroll et al., 2004), a widely adopted Java framework for encoding rules on top of ontology models, a rule asserting that the composition of parent and brother properties implies the uncle property can be written with three pP atoms, as follows:

[rule1: (?f father ?a),      pP1
        (?u brother ?f)      pP2
        -> (?u uncle ?a)]    pP3

Given a WM composed of ontology-based assertions expressed as RDF (Graham et al., 2006) triples in the form (subject, predicate, object), and denoting with U the set of constants related to the underlying ontology model, the WM can be denoted as the set {t | t = (s, p, o) ∧ s, p, o ∈ U}. Moreover, denoting with V the set of variable identifiers used within the rules, a triple-based pattern consists of a quadruple (a, stp, ptp, otp), where a ∈ {−1, 1} and stp, ptp, otp ∈ U ∪ V. When a = 1, the pattern defines a pP rule atom; when a = −1, it defines an nP rule atom. It is important to note that, using the syntax of Apache Jena, a positive pattern (1, stp, ptp, otp) can be formalized as (stp ptp otp), whereas a negated pattern (−1, stp, ptp, otp) is expressed as noValue(stp ptp otp). The match phase typically involves two types of tests, as reported in Fig. 1. The intra-condition tests are applied to a single condition c ∈ P with the goal of determining the WMEs satisfying that condition, where P is the set of positive and negative pattern conditions defined in the LHSs of the rules. In general, such tests are independent comparisons applied to a single condition at a time and, for this reason, they can also be performed in parallel without any dependence on other conditions. In contrast, the inter-condition tests involve multiple conditions and evaluate, for each rule, whether a variable occurring in a condition atom of its LHS and valued by an intra-condition test is compatible with further assignments of the same variable occurring in other condition atoms of the same LHS. The resources and effort required for the evaluation of both intra-condition and inter-condition tests are directly influenced by the size of the WM and by the number of WMEs matching each single condition in the LHS of a rule. In this respect, in the following, a general cost model is described, which estimates the operations and the number of tests involved in the research and identification of available rule instances to execute.

Fig. 1. Research and identification of available rule activations.

2.1. A general cost model for estimating the complexity of reasoning tests
In more detail, denoting with Pp ⊆ P the set of pP atoms and with Pn ⊆ P the set of nP atoms, the set of all WMEs matching a condition c can be computed as follows:

$$\text{if } c \in P_p:\quad \delta_1(c) = \{\, w \in WM \mid w = (s, p, o) \wedge c = (1, c_s, c_p, c_o) \wedge ((s = c_s) \vee (c_s \in V)) \wedge ((p = c_p) \vee (c_p \in V)) \wedge ((o = c_o) \vee (c_o \in V)) \,\}$$

$$\text{if } c_1 \in P_n:\quad \delta_1(c_1) = \{\, w \in WM \setminus \delta_1(c_2) \mid w = (s, p, o) \wedge c_1 = (-1, c_s, c_p, c_o) \wedge c_2 = (1, c_s, c_p, c_o) \in P_p \,\}$$

For example, with respect to the WM reported in Fig. 1, the intra-condition tests regarding the condition pP1 will be satisfied only if the set of all WMEs matching that condition is not empty, i.e. ∅ ≠ δ1(pP1) = {w1 ∈ WM | w1 = (s1, p1, o1) ∧ p1 = father ∧ s1, o1 ∈ U} = {(Mark, father, John), (Mark, father, Jim)}. Intra-condition tests are commonly performed for evaluating the active status of production rules. A rule is defined as active when each pattern condition in its LHS is satisfied by at least one WME. In other words, given a rule r = (LHSr, RHSr) ∈ R,

$$r \text{ is active} \iff \forall\, c \in LHS_r \subseteq P,\ \mathrm{card}(\delta_1(c)) > 0$$

Thus, in order to evaluate the cost associated with the evaluation of intra-condition tests regarding a pattern condition c ∈ P, the number ξ1(c) of WME comparisons required for composing the set δ1(c) has to be estimated. Generally, at the startup of the rule-based system, or in the case of inefficient algorithms that do not store any previous result, the number ξ1 of WME comparisons will be equal to the cardinality of the working memory, i.e. ξ1(c) = card(WM). As a consequence, the cost Cactive(r) associated with the evaluation of the active status of a rule r ∈ R can be estimated, in the worst case, as follows:
$$C_{active}(r \in R) \propto \sum_{c \in LHS_r} \xi_1(c) \propto \mathrm{card}(WM) \cdot \mathrm{card}(LHS_r)$$

Once active rules have been identified, they can be processed in order to find possible admissible rule activations, with the goal of executing them so as to produce fresh knowledge in the WM. The cost of research and identification of admissible rule activations is directly influenced by the structure and the number of rule conditions as well as by the number of variables used in them or shared among them. In detail, given an active rule r, potential rule activations for r will be composed by any set of WMEs that is able to satisfy each single condition of r, without invalidating the other ones contained in its LHS. In other words, given a rule r = (LHSr, RHSr) ∈ R and a set $\{WME_{c_1}, \ldots, WME_{c_n}\}$ with $WME_{c_i} \in \delta_1(c_i)\ \forall c_i \in LHS_r \cap P_p$, and denoting with

$$V_{ass}(WME_{c_1}, \ldots, WME_{c_n}) = \bigcup_{i=1}^{n} \delta_2(c_i, WME_{c_i})$$

the binding environment containing an ordered list of both variables and values (x1, y1), (x2, y2), …, (xm, ym), established by means of the considered WMEs matching the positive pattern conditions, the set $\{WME_{c_1}, \ldots, WME_{c_n}\}$ is a potential rule activation for r if and only if

$$\nexists\, (x_1, y_1), (x_2, y_2) \in V_{ass}(WME_{c_1}, \ldots, WME_{c_n}) \mid x_1 = x_2 \wedge y_1 \neq y_2$$

As a result, each rule activation act(r) can be identified by a pair (r, Vass) where the binding environment Vass must be consistent, i.e. a variable belonging to Vass cannot assume different values. It is important to highlight that, if negated patterns are present in the LHS of a rule, they are not able to add information to the partial set Vass of an activation under investigation, since negated conditions are only able to block or invalidate potential rule instances. On the contrary, positive patterns and/or function calls, when satisfied, are used to validate rule instances and, possibly, add new variable assignments to the partial set Vass. In fact, each positive pattern contained in the LHS of a rule r is potentially able to produce a set of variable assignments for each WME matching it. For instance, with respect to the example depicted in Fig. 1, since the condition pP1 contains two references to variables (i.e., ?f and ?a), for each element w = (s, p, o) ∈ δ1(pP1), a corresponding variable assignment can be determined (i.e., ?f = s and ?a = o). In particular, the set δ2(pP1, (Mark, father, John)) of all variable assignments that can be determined by the condition pP1 and its matching fact (Mark, father, John) is δ2(pP1, (Mark, father, John)) = {(?f, Mark), (?a, John)}. In the same way, δ2(pP1, (Mark, father, Jim)) = {(?f, Mark), (?a, Jim)}. In general, a condition c ∈ Pp containing at least one variable reference can generate a number card(δ1(c)) of sets of variable assignments and, thus, card(δ1(c)) alternative rule activations to evaluate, one for each WME satisfying the condition c. Instead, when no variable is present in c, just one potential activation exists, since c can be satisfied by only one WME. Thus, the total number of potential rule activations depends on the number of WMEs satisfying each single positive pattern in the LHS of the rule.
As a consequence, the number ξ2(r) of potential activations that can be admissible for execution can be estimated as follows:

$$\xi_2(r \in R) = \prod_{c \in (LHS_r \cap P_p)} \mathrm{card}(\delta_1(c))$$

For instance, with respect to the example in Fig. 1, ξ2(rule1) = card(δ1(pP1))*card(δ1(pP2)) = 2*4 = 8.
Once the set of potential binding environments has been identified, the inter-condition tests are required in order to check the consistency of the available variable bindings and, thus, to properly calculate the valid rule activations. In more detail, when no variable is present in the LHS of r, Vass is empty, ξ2(r) = 1 and, thus, only one potential rule activation exists for r, which is immediately validated and composed with no further investigation. In fact, on the one hand, for each positive pattern condition, only one WME matching it exists and, thus, the data to collect for composing the rule activation are directly represented by the set of specific WMEs matching the positive pattern conditions present in the LHS of r. On the other hand, since no relationship exists among the conditions of r, the data satisfying each single condition do not need to be evaluated for testing the violation of the other conditions of r and, thus, the unique available rule activation of r is immediately composed and validated. When each variable occurs only once in the LHS of a rule r, the consistency of the available variable bindings among the other positive pattern conditions is immediately verified, since no relationship exists among them. Thus, denoting with Ar the set of valid activations for a rule r, in case no other rule atom is able to generate variable assignments, all the potential rule activations are also valid, i.e. card(Ar) = ξ2(r), where the set Ar is composed by combining all the WMEs matching the pP atoms as follows:

$$A_{r \in R} = \left\{ (r, V_{ass}(WME_{c_1}, \ldots, WME_{c_n})) \;\middle|\; n = \mathrm{card}(\{c_i \mid c_i \in LHS_r \cap P_p\}) \wedge \nexists\, (x_1, y_1), (x_2, y_2) \in V_{ass}(WME_{c_1}, \ldots, WME_{c_n}) \mid x_1 = x_2 \wedge y_1 \neq y_2 \right\}$$

Finally, when one or more variables are present multiple times in different positive pattern conditions, their occurrences have to be consistently joined in order to assume values coherently with the variable assignments already evaluated. In fact, different pP conditions sharing the same variable may produce discordant variable assignments for a given variable, when a discordant set of WMEs is considered for composing a potential rule activation. Thus, all the potential rule activations built using such a discordant set will be invalidated by the pairwise consistency check performed during the inter-condition tests. For instance, with respect to the example presented in Fig. 1, the set Vass((Mark, father, John)pP1, (John, brother, Jim)pP2) = {(?f, Mark), (?a, John), (?u, John), (?f, Jim)} is not valid since, for the variable ?f, two discordant assignments are present. In detail, as reported in Fig. 1, even if eight potential activations can be built by combining the WMEs matching the conditions of rule1, only two valid activations exist, i.e. card(Arule1) = 2 < ξ2(rule1) = 8. Given a rule r, for each available potential activation of it, a number ξ3(r) of pairwise consistency checks has to be performed in order to determine the validity of its activations. ξ3(r) depends on the number of repetitions of variable references already bound that appear multiple times in different condition atoms able to assign variable values. Thus, ξ3(r) is influenced by the number of available variable assignments δ3(c ∈ C), where C is the set of all conditions defined in the LHSs of the rules, and by the number δ4(r ∈ R) of final variable assignments that characterize a valid rule activation for r, i.e. given an activation act(r) = (r, Vass), δ4(r) = card(Vass). For instance, with respect to the example in Fig. 1, δ3(pP1) = 2, δ3(pP2) = 2, and δ4(rule1) = 3.
It is relevant to note that, if a variable reference appears multiple times in a single condition c, each reference may produce a distinct variable assignment. For instance, δ3((?x, brother, ?x)) = 2, and δ3((?x, ?x, ?y)) = 3. Moreover, δ3(c) = 0 ∀ c ∈ Pn, since nP conditions are not able to generate variable assignments, although they can use those already defined. In detail, ξ3(r) can be computed as follows:
$$\xi_3(r \in R) = \sum_{c \in LHS_r} [\delta_3(c)] - \delta_4(r)$$

When no variable is present in the LHS of r, or when each variable occurs only once in it, the number of required consistency checks becomes zero and the cost of evaluating the activations is reduced to Cactivations(r) = card(LHSr). As a consequence, the cost Cactivations(r ∈ R) pertaining to the evaluation and identification of valid rule activations for a rule r ∈ R can be estimated, in the worst case, as follows:

$$C_{activations}(r \in R) \propto C_{active}(r) + \xi_2(r) \cdot [1 + \xi_3(r)]$$

The efficiency of rule-based systems is highly impacted by both Cactive(R) and Cactivations(R). The first one rapidly increases as the number of WMEs and the cardinality of the set P increase, because they have to be compared to each other. The second one is directly influenced by the quality of the rule base, since the more the rules are intelligible and contain conditions with multiple variable references, the more the cost of the join operations required for evaluating the potential rule activations increases. Table 1 summarizes the set of supporting functions the authors have introduced to better describe the impacts of intra-condition and inter-condition tests. Some existing pattern-matching optimization approaches proposed in the literature aimed at reducing both Cactive(r) and Cactivations(r) by means of dedicated cache memories that keep track of partial rule matches during the inter-condition tests, with the goal of computing them only once and storing them for later reuse. However, the overhead introduced for handling such memories may rapidly slow down the system execution, especially when rule-based systems are deployed on resource-limited devices. Indeed, even if the latest high-end models of mobile devices are characterized by a total memory size of 1 GB or more, the quantity of memory available to user applications is nonetheless limited. A study reported in (D′Aquin et al., 2010) has shown that, when declarative knowledge is modelled via ontology-models describing many rich semantic relations among data, semantic-enabled reasoning can take several tens or hundreds of KB of memory per elementary data item. Moreover, production rules built on top of ontology-models can contain many conditions with different properties characterizing the semantic data of interest, in order to increase rule interpretability.
In such cases, the factor ξ2(r) can rapidly become intractable and predominant with respect to the other contributions to Cactivations(r), i.e. [1 + ξ3(r)]*ξ2(r) = k*ξ2(r) >> Cactive(r), and, thus, Cactivations(r) ⋍ k*ξ2(r), where k > 1. For instance, in order to model a rule to be satisfied by a complex ontological concept characterized by several semantic properties, a corresponding number of positive pattern conditions has to be formulated, each one dedicated to one of those properties. As an example, a rule for assigning the skin phototype to a person can be formulated as follows:

[rule2: (?a rdf:type Person),
        (?a skin_color black),
        (?a eyes_color dark_brown),
        (?a hair_color black),
        (?a sunburn never),
        (?a skin_freckles none)
        -> (?a phototype VI)]

Thus, the number of both conditions containing variable references and repeated references involved rapidly increases, implying an increase of Cactivations(r) ⋍ k*ξ2(r) for the modelled rule. In such cases, in addition to the possibility of maintaining a complex cache of intermediate results, other existing optimization approaches proposed dynamic reorganizations of rule conditions, based on heuristic strategies, in order to reduce the number of comparisons required to detect invalid rule activations. However, these approaches also introduce a high overhead that, in dynamic and non-monotonic scenarios, may rapidly make the rule-based system unable to produce timely outcomes. In summary, when production rules are built on top of ontology-models and involve many conditions with different properties characterizing the semantic data of interest, in order to increase rule interpretability, the most important issue that emerges is the need to reduce the factor ξ2(r) for all the rules of a production system without introducing additional overhead during its execution, especially when it must be deployed on mobile devices, characterized by resource-limited configurations, and applied to mHealth scenarios where prompt responses should be granted. This implies that some optimization and rule reformulation strategies should be conceived in order to improve the overall knowledge base quality with respect to ξ2(r) and, at the same time, reduce the costs Cactivations(r) of evaluation and identification of rule instances, even when complex and highly structured ontological knowledge bases are used. All these considerations represent the rationale for the approach proposed in this work.
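The quantities of the cost model can be sketched in a few lines of Python. The following is only an illustrative sketch, not the authors' implementation: it handles positive patterns only, the function names simply mirror the symbols used in the text, and the working memory `WM` is a hypothetical one (it is not the WM of Fig. 1), chosen so that the counts quoted for rule1 hold (two `father` facts, four `brother` facts).

```python
from itertools import product

# Hypothetical working memory of (subject, predicate, object) triples.
# Assumption: NOT the WM of Fig. 1, only consistent with the quoted counts.
WM = [
    ("Mark", "father", "John"),
    ("Mark", "father", "Jim"),
    ("John", "brother", "Jim"),
    ("Jim", "brother", "John"),
    ("Bob", "brother", "Mark"),
    ("Mark", "brother", "Bob"),
]

def is_var(term):
    """Variables follow the '?x' convention used in the rules."""
    return term.startswith("?")

def delta1(cond, wm):
    """Intra-condition test: WMEs matching a positive pattern."""
    return [w for w in wm if all(is_var(c) or c == v for c, v in zip(cond, w))]

def delta2(cond, wme):
    """Variable assignments produced by matching cond against one WME."""
    return [(c, v) for c, v in zip(cond, wme) if is_var(c)]

def delta3(cond):
    """Number of variable assignments a positive pattern can generate."""
    return sum(1 for c in cond if is_var(c))

def delta4(lhs):
    """Number of assignments in a valid activation (distinct variables)."""
    return len({c for cond in lhs for c in cond if is_var(c)})

def xi2(lhs, wm):
    """Potential activations: product of card(delta1(c)) over the LHS."""
    total = 1
    for cond in lhs:
        total *= len(delta1(cond, wm))
    return total

def xi3(lhs):
    """Pairwise consistency checks needed per potential activation."""
    return sum(delta3(c) for c in lhs) - delta4(lhs)

def valid_activations(lhs, wm):
    """Inter-condition test: keep only consistent binding environments."""
    valid = []
    for combo in product(*(delta1(c, wm) for c in lhs)):
        env, consistent = {}, True
        for cond, wme in zip(lhs, combo):
            for var, val in delta2(cond, wme):
                if env.setdefault(var, val) != val:
                    consistent = False
        if consistent:
            valid.append(env)
    return valid

RULE1_LHS = [("?f", "father", "?a"), ("?u", "brother", "?f")]
RULE2_LHS = [
    ("?a", "rdf:type", "Person"), ("?a", "skin_color", "black"),
    ("?a", "eyes_color", "dark_brown"), ("?a", "hair_color", "black"),
    ("?a", "sunburn", "never"), ("?a", "skin_freckles", "none"),
]

print(xi2(RULE1_LHS, WM))                     # 8 potential activations
print(xi3(RULE1_LHS))                         # 1 consistency check each
print(len(valid_activations(RULE1_LHS, WM)))  # 2 valid activations
print(xi3(RULE2_LHS))                         # 5 checks: ?a is repeated 6 times
```

Under these assumptions the sketch reproduces the values quoted for rule1, and it shows that rule2, whose six conditions all share the variable ?a, already requires five consistency checks for every potential activation.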
Table 1
Functions defined for evaluating the impacts of intra-condition and inter-condition tests.

| Definition | Description | Example |
|---|---|---|
| δ1:P→WM | δ1(c∈P) denotes the set of all WMEs matching the condition c | c = (?f father ?a); δ1(c) = {(Mark, father, John), (Mark, father, Jim)} |
| δ2:(c ∈ P, w ∈ δ1(c))→{(v ∈ V, u ∈ U)} | δ2(c,w) denotes the set of all variable assignments that can be determined by means of w | c = (?f father ?a); δ2(c,(Mark, father, John)) = {(?f, Mark), (?a, John)} |
| δ3:C→N | δ3(c) denotes the number of variable assignments that can be generated by means of c | c = (?f father ?a); δ3(c) = 2 |
| δ4:R→N | δ4(r) denotes the number of variable assignments that characterize each valid rule activation of r | [r: (?f father ?a), (?u brother ?f) -> (?u uncle ?a)]; δ4(r) = 3 |
| ξ1:P→N | ξ1(c∈P) denotes the number of WME comparisons required for composing the set δ1(c) | ξ1(pP1) = card(WM) = 10 |
| ξ2:R→N | ξ2(r) denotes the total amount of potential rule activations to inspect since admissible for the execution | [r: (?f father ?a), (?u brother ?f) -> (?u uncle ?a)]; ξ2(r) = card(δ1((?f father ?a))) * card(δ1((?u brother ?f))) = 2*4 = 8 |
| ξ3:R→N | ξ3(r) denotes the number of checks required for evaluating the consistency of a potential activation of r | [r: (?f father ?a), (?u brother ?f) -> (?u uncle ?a)]; ξ3(r) = 1 |
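The functions of Table 1 can be made concrete on a toy working memory. The following Python sketch is only an illustration, not the authors' implementation: the two father facts come from the table, while the brother and age facts, the helper names (`delta1`, `delta2`, `delta3`, `delta4`, `xi2`, `valid_activations`) and the triple encoding are assumptions introduced here.

```python
from itertools import product

# Toy working memory (WM) of 10 triples. Only the two "father" facts come
# from Table 1; every other fact is a hypothetical filler.
WM = [
    ("Mark", "father", "John"),
    ("Mark", "father", "Jim"),
    ("Paul", "brother", "Mark"),
    ("Luke", "brother", "Mark"),
    ("Paul", "brother", "Luke"),
    ("Luke", "brother", "Paul"),
    ("John", "age", "10"),
    ("Jim", "age", "12"),
    ("Mark", "age", "40"),
    ("Paul", "age", "38"),
]

FATHER = ("?f", "father", "?a")
BROTHER = ("?u", "brother", "?f")
UNCLE_LHS = [FATHER, BROTHER]  # LHS of [r: (?f father ?a), (?u brother ?f) -> (?u uncle ?a)]


def is_var(term):
    return term.startswith("?")


def delta1(cond, wm):
    """delta1(c): WMEs matching condition c (constants must coincide)."""
    return [w for w in wm if all(is_var(t) or t == v for t, v in zip(cond, w))]


def delta2(cond, wme):
    """delta2(c, w): variable assignments determined by the WME w."""
    return {(t, v) for t, v in zip(cond, wme) if is_var(t)}


def delta3(cond):
    """delta3(c): number of variable assignments generated by c."""
    return len({t for t in cond if is_var(t)})


def delta4(lhs):
    """delta4(r): number of distinct variables assigned in an activation of r."""
    return len({t for cond in lhs for t in cond if is_var(t)})


def xi2(lhs, wm):
    """xi2(r): potential activations, i.e. product of the delta1 cardinalities."""
    total = 1
    for cond in lhs:
        total *= len(delta1(cond, wm))
    return total


def valid_activations(lhs, wm):
    """Keep only the WME combinations whose variable bindings are consistent."""
    result = []
    for combo in product(*(delta1(c, wm) for c in lhs)):
        bindings, consistent = {}, True
        for cond, wme in zip(lhs, combo):
            for var, val in delta2(cond, wme):
                if bindings.setdefault(var, val) != val:
                    consistent = False
        if consistent:
            result.append(bindings)
    return result
```

On this toy memory `xi2(UNCLE_LHS, WM)` reproduces the value 2*4 = 8 of Table 1, while `valid_activations` shows how many of those potential activations survive the inter-condition test on ?f.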
3. The proposed optimization approach

The proposed approach relies on a set of optimization strategies aimed at revising the structure of the knowledge base of a rule-based system, with the goal of reducing Cactivations(r) for all its rules when they are evaluated. In case of complex and highly structured ontological knowledge bases, where production rules are built on top of ontology models and involve many conditions to increase rule interpretability, this goal can be achieved by reducing ξ2(r), since Cactivations(r) ⋍ k*ξ2(r) with k > 1.

As a preliminary remark, it is important to highlight that the global NP-hard complexity of the match phase cannot be reduced. Indeed, given a rule r, any existing combination of WMEs matching it has to be executed as long as the rule holds. Thus, the impact of the term ξ2(r) cannot be eliminated, since searching for and evaluating all the available rule activations of r cannot be avoided if the validity of the reasoning process is to be assured. However, the way the knowledge is formalized can drastically change the cost of reasoning tests, since a proper knowledge formalization can avoid inspecting all potential WME combinations when it is not necessary. In order to achieve this goal, three main criteria have been applied.

The first one is based on the idea that diminishing the portion of the knowledge base on which rules operate can highly reduce the WMEs involved in reasoning tests. In fact, the more rules are specialized with respect to the ontological classes and properties used in their conditions, the less they are matched by fresh WMEs inferred at runtime. In this way, a smaller number of new potential activations has to be inspected.

The second criterion is based on the idea that intermediate results matching pairs of conditions, when a relation exists among them, should be maintained in order to avoid re-inspecting WME combinations already classified as invalid for a given pair and, thus, to reduce the cost related to the evaluation of rule activations. Since ad-hoc and complex memory structures would introduce an excessive overhead, not adequate for mobile and dynamic scenarios, intermediate results should be maintained and handled directly at the knowledge base level. Indeed, for each invalid intermediate activation act(rsub | LHSrsub ⊆ LHSr) generated by combining only the WMEs matching a subset of conditions, any complete activation act(r) containing act(rsub) will be invalidated as well, whatever other condition is present in the set {LHSr-LHSrsub}, due to the pairwise consistency checks computed on LHSrsub. In other words, any subset of invalid intermediate results will be responsible for invalidating any complete activation containing it. For instance, given four positive pattern conditions pP1, pP2, pP3, pP4, and denoting with n the number of intermediate activations that have to be discarded due to pP1 and pP2, any further condition included in the LHS of the rule increases the number of activations that will be invalidated, i.e. the amount of discarded activations is, at least, n*card(δ1(pP3))*card(δ1(pP4)). Thus, the more positive pattern conditions are present in the LHS of a rule r, the more resources are potentially wasted in the evaluation of activations containing previously invalidated WME combinations.

Finally, the last criterion is based on the idea that the evaluation order of the rule conditions can also highly influence the cost of the reasoning tests. Changing the evaluation order of the conditions does not alter the portion of WMEs involved during the reasoning, i.e. the factor ξ2(r) remains the same, but inspecting some condition types before others has been shown in the literature to highly reduce the time required for recognizing invalid activations. In particular, in the literature, classification and reordering of rule conditions are usually performed by means of heuristics that are applied at runtime and evaluate the WMEs satisfying the rule conditions. Differently, in order to preserve runtime resources, especially on resource-limited mobile devices, rule conditions should be ordered by means of static criteria applied in a pre-processing phase, before the reasoning system is executed. Indeed, since computing runtime metrics is not efficient due to the additional overhead for the system execution, rule conditions could be classified and reordered by operating on the structural properties characterizing the conditions. In the following, for each of these criteria, some optimization procedures will be introduced and described in detail.
3.1. Specializing rules operating on the top of ontology-models

The first optimization procedure discussed in the following is essentially aimed at elaborating both rules and the ontological models on which they are built, by specializing high-level relations or simplifying sets of verbose rule conditions without altering the overall meaning. In other words, new rules can be defined, each one operating on a determined subset of the existing knowledge and, thus, on a different subset of the initial portion of WMEs.

3.1.1. Optimization procedure O1

This procedure specializes rules containing atoms with properties that make use, in their domain or range, of high-level classes from which at least two subclasses are inherited and represented in the ontology model. More formally, consider an ontology-model O, the set Cl of classes defined in O, the set I of individuals of classes Cl, and the set P of properties defined in O. A property p ∈ P involving two classes Cd ∈ Cl and Cr ∈ Cl in its domain and range, respectively, defines restrictions on the kinds of individuals that can be used for its instantiation. In particular, the axiom ∃p.⊤⊑Cd restricts the domain of p to individuals of Cd, whereas the axiom ⊤⊑∀p.Cr restricts the range of p to individuals of Cr. The property p can be instantiated on couples of individuals in the form p(id, ir), where id is an individual of Cd or of a subclass of it, and ir is an individual of Cr or of a subclass of it. According to the syntax of the Jena language, a typical structure of a rule whose LHS contains the property p that makes use of the class Cd (Cr) in its domain (range) is reported as follows:

[rlhsd: (?a rdf:type Cd), (?a p ?b), … -> …]
([rlhsr: (?b rdf:type Cr), (?a p ?b), … -> …])

where ?a and ?b are references to variables, and the condition (?a rdf:type Cd) ((?b rdf:type Cr)) is named type-condition (hereafter, tc), since the property rdf:type is used to verify that a generic individual is an instance of the class Cd (Cr).
On the other hand, a typical structure of a rule whose RHS includes the property p that uses the class Cd (Cr) in its domain (range) is reported as follows:

[rrhsd: (?a rdf:type Cd), … -> (?a p ?b), …]
([rrhsr: (?b rdf:type Cr), … -> (?a p ?b), …])

When the property p occurs in at least one rule of the rule base considered and a set of at least two classes is subsumed, in the ontology O, by the class Cd (Cr), where every individual of Cd (Cr) is an instance of at least one specialized class and no individual can be an instance of two specialized classes at the same time, the procedure O1 specializes p, defined on individuals of Cd (Cr), in terms of subproperties whose domain (range) is restricted to individuals of the specialized classes of Cd (Cr). More formally, denoting with c and a, respectively, a condition and an action occurring in the LHS and the RHS of a rule, the procedure O1 operates on a property p with respect to its domain as follows:

∃r ∈ R | ( c=(?a p ?b) ⊆ LHSr ∨ a=(?a p ?b) ⊆ RHSr ) ∧ ∃Cd1, Cd2, …, Cdn ∈ Cl, with n > 1 | (Cd1, Cd2, …, Cdn ⊑ Cd) ∧ (Cd ⊑ Cd1 ⊔ Cd2 ⊔ … ⊔ Cdn) ∧ (Cdi ⊓ Cdj ⊑ ⊥ ∀i, j = 1,…,n ∧ i ≠ j)
⇒ create (pd1, pd2, …, pdn ⊑ p | ∃pd1.⊤⊑Cd1, ∃pd2.⊤⊑Cd2, …, ∃pdn.⊤⊑Cdn) ∧ create (pdk(id, ir), ∀pdk ⊑ p ∧ ∀id ∈ I | Cdk(id) ∧ p(id, ir), with k = 1,…,n)

On the other hand, it operates on the property p with respect to its range as follows:

∃r ∈ R | ( c=(?a p ?b) ⊆ LHSr ∨ a=(?a p ?b) ⊆ RHSr ) ∧ ∃Cr1, Cr2, …, Crm ∈ Cl, with m > 1 | (Cr1, Cr2, …, Crm ⊑ Cr) ∧ (Cr ⊑ Cr1 ⊔ Cr2 ⊔ … ⊔ Crm) ∧ (Cri ⊓ Crj ⊑ ⊥ ∀i, j = 1,…,m ∧ i ≠ j)
⇒ create (pr1, pr2, …, prm ⊑ p | ∃pr1.⊤⊑Cr1, ∃pr2.⊤⊑Cr2, …, ∃prm.⊤⊑Crm) ∧ create (prk(id, ir), ∀prk ⊑ p ∧ ∀ir ∈ I | Crk(ir) ∧ p(id, ir), with k = 1,…,m)

More generally, the procedure O1 first determines all the properties defined in the ontology O and occurring in at least one rule, whose domains are represented by one or more classes subsuming other ones. Then, for each property identified, a set of subproperties is generated as described above by taking into account separately every possible class indicated in its domain. Subsequently, the procedure O1 determines all the properties defined in the new version of the ontology O, including the new specialized ones, whose ranges are represented by one or more classes subsuming other ones. Also in this case, each property identified is specialized into a set of new subproperties by considering every class indicated in its range separately. In both cases, for each couple of individuals related by means of a high-level property, a specific subproperty is instantiated according to the domain (range) class to which each of the two individuals belongs. After restructuring the ontology O, the procedure examines the whole rule base and, for each rule considered, the lists of atoms contained in both its LHS and RHS are evaluated. For each atom containing a property that has been previously specialized by the procedure, the rule considered is split into a set of new rules, each of them replacing the old property with one of its new specializations. More formally, when the property is present in the LHS of a rule and its specializations operate with respect to the classes of its domain, the procedure works as follows:

∀r ∈ R | (∃c=(?a p ?b) ∈ LHSr) ∧ (∃tc=(?a rdf:type Cd) ∈ LHSr) ∧ ∃p ⊒ pd1, pd2, …, pdn
⇒ create (rdk, ∀pdk ⊑ p | LHSrdk = LHSr – {c, tc} ∪ {cdk=(?a pdk ?b), tcdk=(?a rdf:type Cdk)} ∧ RHSrdk = RHSr, with k = 1,…,n) ∧ remove r

whereas, when its specializations operate with respect to the classes of its range:

∀r ∈ R | (∃c=(?a p ?b) ∈ LHSr) ∧ (∃tc=(?b rdf:type Cr) ∈ LHSr) ∧ ∃p ⊒ pr1, pr2, …, prm
⇒ create (rrk, ∀prk ⊑ p | LHSrrk = LHSr – {c, tc} ∪ {crk=(?a prk ?b), tcrk=(?b rdf:type Crk)} ∧ RHSrrk = RHSr, with k = 1,…,m) ∧ remove r

On the other hand, when the property is present in the RHS of a rule and its specializations operate with respect to the classes of its domain, the procedure works as follows:

∀r ∈ R | (∃a=(?a p ?b) ∈ RHSr) ∧ (∃tc=(?a rdf:type Cd) ∈ LHSr) ∧ p ⊒ pd1, pd2, …, pdn
⇒ create (rdk, ∀pdk ⊑ p | LHSrdk = LHSr – {tc} ∪ {tcdk=(?a rdf:type Cdk)} ∧ RHSrdk = RHSr – {a} ∪ {cdk=(?a pdk ?b)}, with k = 1,…,n) ∧ remove r

whereas, when its specializations operate with respect to the classes of its range:

∀r ∈ R | (∃a=(?a p ?b) ∈ RHSr) ∧ (∃tc=(?b rdf:type Cr) ∈ LHSr) ∧ p ⊒ pr1, pr2, …, prm
⇒ create (rrk, ∀prk ⊑ p | LHSrrk = LHSr – {tc} ∪ {tcrk=(?b rdf:type Crk)} ∧ RHSrrk = RHSr – {a} ∪ {crk=(?a prk ?b)}, with k = 1,…,m) ∧ remove r

The application of the procedure O1 implies a reduction of the cost of reasoning tests when it impacts on Cactivations and, in particular, on the factor ξ2. In more detail, the specialization of a rule in which a set of subsumed properties replaces a high-level one occurring in its RHS does not imply any change of Cactivations, since it does not alter the set of potential activations to be evaluated. This kind of specialization is, however, performed when the high-level property considered also occurs in the LHS of another rule and, thus, is replaced with a set of subsumed properties. In this way, the coherency of the whole rule base is guaranteed and the chaining of rules with respect to atoms contained in both their RHS and LHS is still possible, since high-level properties are replaced everywhere by their specialized properties. On the other hand, the specialization of a rule in which a set of subsumed properties replaces a high-level one occurring in its LHS impacts on Cactivations, since it modifies the set of potential activations to be evaluated. In more detail, since each rule considered is split into a set of new rules, each of them replacing a high-level property with one of its new specializations and a type-condition on a high-level domain (range) class with one on its subclasses, the function ξ2 can be calculated as follows:

∀r ∈ R | (∃c=(?a p ?b) ∈ LHSr) ∧ (∃tc=(?a rdf:type Cd) ∈ LHSr) ∧ (∃p ⊒ pd1, pd2, …, pdn) ∧ (∃Cd1, Cd2, …, Cdn ⊑ Cd) ⇒

$$\xi_2(r) = \mathrm{card}(\delta_1(c)) \cdot \mathrm{card}(\delta_1(tc)) \cdot \xi_2(LHS_r - \{c\} - \{tc\})$$

$$\xi_2\Big(\sum_{k=1}^{n} r_{dk}\Big) = \sum_{k=1}^{n} \big[\mathrm{card}(\delta_1(c_{dk})) \cdot \mathrm{card}(\delta_1(tc_{dk})) \cdot \xi_2(LHS_{r_{dk}} - \{c_{dk}\} - \{tc_{dk}\})\big] = \xi_2(LHS_r - \{c\} - \{tc\}) \cdot \sum_{k=1}^{n} \big[\mathrm{card}(\delta_1(c_{dk})) \cdot \mathrm{card}(\delta_1(tc_{dk}))\big]$$

with ξ2(LHSrdk − {cdk} − {tcdk}) = ξ2(LHSr − {c} − {tc}) ∧ ξ2({Ø}) = 1
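The effect of the two expressions for ξ2 can be checked numerically. The following Python sketch is a minimal illustration under stated assumptions: the function names `xi2_before` and `xi2_after` are hypothetical, and the cardinalities are illustrative values chosen so that the specialized conditions partition the WMEs matched by the original ones, as required by the disjointness and covering assumptions of O1.

```python
def xi2_before(card_c, card_tc, xi2_rest):
    """xi2(r) = card(d1(c)) * card(d1(tc)) * xi2(LHS_r - {c} - {tc})."""
    return card_c * card_tc * xi2_rest


def xi2_after(specialized, xi2_rest):
    """xi2 of the specialized rules r_d1..r_dn.

    `specialized` lists, for each k, the pair
    (card(d1(c_dk)), card(d1(tc_dk))) of the k-th specialized rule.
    """
    return xi2_rest * sum(card_c * card_tc for card_c, card_tc in specialized)


# Illustrative values (assumptions): 5 WMEs match c, 10 match tc, and the
# remaining conditions contribute a factor 2 * 3 = 6.
card_c, card_tc, xi2_rest = 5, 10, 6

# The two specializations partition the matches: 3 + 2 = 5 and 7 + 3 = 10.
specialized = [(3, 7), (2, 3)]
assert sum(c for c, _ in specialized) == card_c
assert sum(tc for _, tc in specialized) == card_tc

before = xi2_before(card_c, card_tc, xi2_rest)   # 5 * 10 * 6
after = xi2_after(specialized, xi2_rest)         # 6 * (3*7 + 2*3)
```

With these values `before` is 300 and `after` is 162: the sum of the per-subclass products is smaller than the product of the sums whenever the matches are spread over more than one subclass.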
Since, in accordance with the assumptions that (i) for each couple of individuals related by means of a high-level property p, a single subproperty pdk (prk) is instantiated according to the domain (range) class Cd (Cr) to which each of the two individuals belongs, and (ii) every individual of Cd (Cr) is an instance of at least one specialized class and no individual can be an instance of two specialized classes at the same time, then ∀k = 1,…,n, card(δ1(cdk)) = αdk*card(δ1(c)), with ∑k=1,…,n αdk = 1, and card(δ1(tcdk)) = βdk*card(δ1(tc)), with ∑k=1,…,n βdk = 1. As a result, the function ξ2 calculated after the application of the procedure O1 can be expressed as follows:

$$\xi_2\Big(\sum_{k=1}^{n} r_{dk}\Big) = \sum_{k=1}^{n} (\alpha_{dk}\,\beta_{dk}) \cdot \mathrm{card}(\delta_1(c)) \cdot \mathrm{card}(\delta_1(tc)) \cdot \xi_2(LHS_r - \{c\} - \{tc\}), \quad \text{with } \sum_{k=1}^{n} \alpha_{dk} = 1 \wedge \sum_{k=1}^{n} \beta_{dk} = 1$$

The function ξ2 calculated before the application of the procedure O1 can also be expressed depending on αdk and βdk as reported below, since ∑k=1,…,n αdk = 1 ∧ ∑k=1,…,n βdk = 1:

$$\begin{aligned}
\xi_2(r) &= \sum_{k=1}^{n} \alpha_{dk}\,\mathrm{card}(\delta_1(c)) \cdot \sum_{k=1}^{n} \beta_{dk}\,\mathrm{card}(\delta_1(tc)) \cdot \xi_2(LHS_r - \{c\} - \{tc\}) \\
&= \sum_{k=1}^{n} \alpha_{dk} \cdot \sum_{k=1}^{n} \beta_{dk} \cdot \mathrm{card}(\delta_1(c)) \cdot \mathrm{card}(\delta_1(tc)) \cdot \xi_2(LHS_r - \{c\} - \{tc\}) \\
&= \frac{\sum_{k=1}^{n} \alpha_{dk} \cdot \sum_{k=1}^{n} \beta_{dk}}{\sum_{k=1}^{n} (\alpha_{dk}\,\beta_{dk})} \cdot \xi_2\Big(\sum_{k=1}^{n} r_{dk}\Big) \\
&\Rightarrow\ \xi_2\Big(\sum_{k=1}^{n} r_{dk}\Big) < \xi_2(r)
\end{aligned}$$

Since ∑k=1,…,n (αdk*βdk) < ∑k=1,…,n αdk * ∑k=1,…,n βdk, the relation between the function ξ2 calculated before and after the application of the procedure O1 implies a decrease of the function ξ2 and, consequently, of Cactivations.

As an example, consider an ontology containing the property assumesDrug, whose domain and range include the classes Patient and Drug, respectively. The class Drug is further specialized into two subclasses, namely Antiviral and Antibiotic. Moreover, consider the following rule r1:

[r1: (?a rdf:type Patient),      pP1
     (?c rdf:type Drug),         pP2
     (?a assumesDrug ?c),        pP3
     (?c isEssential true)       pP4
     -> (?a canAskForRefund ?c)]

With respect to this rule r1, the total amount of potential activations can be estimated as:

ξ2(r1) = card(δ1(pP1)) * card(δ1(pP2)) * card(δ1(pP3)) * card(δ1(pP4)).

When, for instance, card(δ1(pP1)) = 2, card(δ1(pP2)) = 10, card(δ1(pP3)) = 5, and card(δ1(pP4)) = 3, then ξ2(r1) = 300. The application of the procedure O1 to reduce ξ2(r1) first restructures the ontology and, in particular, examines the property assumesDrug with respect to its domain and range. Since its domain includes only the class Patient, which is not further specialized, no subproperty is generated. On the contrary, since its range contains the class Drug, which is further specialized into the subclasses Antiviral and Antibiotic, the property assumesDrug is specialized into the subproperties assumesAntiviral and assumesAntibiotic. Subsequently, the procedure splits the rule r1 into two new rules rr1 and rr2, where the property assumesDrug is replaced by the properties assumesAntibiotic and assumesAntiviral, respectively, and the type condition pP2 is also modified accordingly. As a consequence, the new rules rr1 and rr2 are:

[rr1: (?a rdf:type Patient),       pP5
      (?c rdf:type Antibiotic),    pP6
      (?a assumesAntibiotic ?c),   pP7
      (?c isEssential true)        pP8
      -> (?a canAskForRefund ?c)]

[rr2: (?a rdf:type Patient),       pP9
      (?c rdf:type Antiviral),     pP10
      (?a assumesAntiviral ?c),    pP11
      (?c isEssential true)        pP12
      -> (?a canAskForRefund ?c)]

With respect to these new rules rr1 and rr2, the total amount of potential activations can be estimated as

ξ2(rr1∪rr2) = card(δ1(pP5)) * card(δ1(pP6)) * card(δ1(pP7)) * card(δ1(pP8)) + card(δ1(pP9)) * card(δ1(pP10)) * card(δ1(pP11)) * card(δ1(pP12))
= card(δ1(pP1)) * card(δ1(pP4)) * [card(δ1(pP6))*card(δ1(pP7)) + card(δ1(pP10))*card(δ1(pP11))]
= card(δ1(pP1)) * card(δ1(pP4)) * [αr1*card(δ1(pP2)) * βr1*card(δ1(pP3)) + αr2*card(δ1(pP2)) * βr2*card(δ1(pP3))]

since card(δ1(pP5)) = card(δ1(pP9)) = card(δ1(pP1)) and card(δ1(pP8)) = card(δ1(pP12)) = card(δ1(pP4)). When, for instance, the individuals of the classes Antibiotic and Antiviral are equal to 70% and 30% of the number of individuals of the class Drug (i.e. αr1 = 0.7 and αr2 = 0.3), whereas the couples of individuals verifying the conditions pP7 and pP11 are equal to 60% and 40% of the number of couples of individuals verifying the condition pP3 (i.e. βr1 = 0.6 and βr2 = 0.4), ξ2(rr1∪rr2) assumes the following value:

ξ2(rr1∪rr2) = 2 * 3 * [(0.7*10) * (0.6*5) + (0.3*10) * (0.4*5)] = 162 < 300 = ξ2(r1)

As a result, Cactivations(rr1∪rr2) is actually lower than Cactivations(r1).

3.2. Extracting rules for handling intermediate knowledge

The second and third optimization procedures discussed here essentially analyze the initial rules in order to recognize intermediate knowledge to be handled in new dedicated rules, so as to reuse their outcomes for reducing the number of potential activations to inspect. In other words, for each rule, the LHS is inspected in order to identify pairs of positive pattern conditions that are closely related to each other and that can be brought outside of the rule, in order to create a new dedicated production rule in charge of asserting the intermediate results produced by the inter-condition tests applied to such conditions. The procedures applying this strategy are the following.

3.2.1. Optimization procedure O2

This procedure exploits the direct relation established by special variable references, here defined as direct control variables, in the rules, and generates further ad-hoc rules where the relation is processed and the resulting intermediate results are asserted in the knowledge base. Direct control variables are variable references that are assigned a value and shared between two positive pattern conditions of the LHS, but are not present in the RHS. More formally, given a variable reference x∈V, and denoting with x∈c the situation when x is contained in a condition c∈P, i.e. in its subject, predicate or object, x is defined as a direct control variable for the rule r applied to two conditions c1, c2, i.e. directControl(x, r, c1, c2), when the following requirements are satisfied:
directControl(x, r, c1, c2) ⇔ x∈V ∧ r=(LHSr, RHSr)∈R ∧ c1, c2 ∈ LHSr ∩ Pp ∧ x∈c1 ∧ x∈c2 ∧ [ (x∈RHSr ∧ ∃c∈{LHSr-{c1}-{c2}} | x∈c ∧ c∈Pp) ∨ (x∉RHSr ∧ ∄c3∈{LHSr-{c1}-{c2}} | x∈c3 ∧ c3∉Pp) ∨ (x∉RHSr ∧ ∃c3∈{LHSr-{c1}-{c2}} | x∈c3 ∧ c3∉Pp ∧ ∃c4∈{LHSr-{c1}-{c2}-{c3}} | x∈c4 ∧ c4∈Pp) ].

In this way, direct control variables are able to filter among the potential rule activations to inspect. The procedure O2 searches for pairs of positive pattern conditions of the LHS which are related by a direct control variable. When a direct control variable is identified for a pair of conditions, the latter can be removed from the LHS of the rule without altering the consistency of the other conditions. As a consequence, since the revised rules operate on the intermediate results asserted in the knowledge base, they examine only valid results of the considered relation. In more detail, the procedure O2 operates as follows:

∀r ∈ R | (∃ p1, p2 ∈ P ∧ [ (∃ c1=(?x p1 ?y)∈LHSr ∧ c2=(?z p2 ?y)∈LHSr) ∨ (∃ c1=(?y p1 ?x)∈LHSr ∧ c2=(?y p2 ?z)∈LHSr) ∨ (∃ c1=(?x p1 ?y)∈LHSr ∧ c2=(?y p2 ?z)∈LHSr) ∨ (∃ c1=(?y p1 ?x)∈LHSr ∧ c2=(?z p2 ?y)∈LHSr) ] ∧ directControl(?y, r, c1, c2) ∧ card{LHSr-{c1}-{c2}} > 0 ∧ ξ2(LHSr-{c1}-{c2}) > 1)
⇒ create (rc1c2 | LHSc1c2 = {c1, c2} ∧ RHSc1c2 = (?x p1_p2 ?z)) ∧ create (rr−c1c2 | LHSr−c1c2 = (?x p1_p2 ?z) ∪ LHSr – {c1, c2} ∧ RHSr−c1c2 = RHSr ∪ remove(?x p1_p2 ?z)) ∧ remove r

It is important to note that, on the one hand, the proposed approach is only feasible if the rule language supports primitives for non-monotonic reasoning. On the other hand, the results produced by the execution of the rule rc1c2 must be removed in order to preserve the consistency of the knowledge base, i.e. the rule rr−c1c2 first processes the intermediate results and then eliminates them from the knowledge base, since they are meaningful only locally for this couple of rules.

The application of the procedure O2 implies a reduction of the cost of reasoning tests, since it usually causes a reduction of ξ2. In more detail, since each rule considered is split into a set of new rules, each of them exploiting the available relation established by the control variables of the rule, the function ξ2 can be calculated as follows:

$$\begin{aligned}
\xi_2(r) &= \xi_2(r_{c1c2}) \cdot \xi_2(LHS_r - \{c1\} - \{c2\}) \\
\xi_2(r_{c1c2}) &= \mathrm{card}(\delta_1(c1)) \cdot \mathrm{card}(\delta_1(c2)) \\
\xi_2(r_{r-c1c2}) &= \xi_2(LHS_r - \{c1\} - \{c2\}) \cdot \mathrm{card}(\delta_1(c)) \quad \text{with } c = RHS_{c1c2}
\end{aligned}$$

$$\begin{aligned}
& \xi_2(r_{c1c2} \cup r_{r-c1c2}) = \xi_2(r_{c1c2}) + \xi_2(r_{r-c1c2}) < \xi_2(r) \\
\Leftrightarrow\ & \xi_2(r_{c1c2}) + \xi_2(LHS_r - \{c1\} - \{c2\}) \cdot \mathrm{card}(\delta_1(c)) < \xi_2(r_{c1c2}) \cdot \xi_2(LHS_r - \{c1\} - \{c2\}) \\
\Leftrightarrow\ & \xi_2(LHS_r - \{c1\} - \{c2\}) \cdot \mathrm{card}(\delta_1(c)) < \xi_2(r_{c1c2}) \cdot \big[\xi_2(LHS_r - \{c1\} - \{c2\}) - 1\big] \\
\Leftrightarrow\ & \mathrm{card}(\delta_1(c)) < \xi_2(r_{c1c2}) \cdot \Big(1 - \frac{1}{\xi_2(LHS_r - \{c1\} - \{c2\})}\Big) \\
\Leftrightarrow\ & \mathrm{card}(\delta_1(c)) < k \cdot \xi_2(r_{c1c2}) \quad \text{with } k = 1 - \frac{1}{\xi_2(LHS_r - \{c1\} - \{c2\})}
\end{aligned}$$

where 0.5 ≤ k < 1 since ξ2(LHSr-{c1}-{c2}) > 1, and k→1 when ξ2(LHSr-{c1}-{c2}) >> 1.

Generally speaking, the effective reduction of ξ2 obtained by applying the procedure O2 depends on the ratio between the potential rule activations to inspect for the rule rc1c2, i.e. ξ2(rc1c2), and the actual valid ones, which will generate, when executed, an equal number of facts to be asserted in the WM and able to satisfy the condition c of the rule rr−c1c2, i.e. card(δ1(c)). In addition, the constraint card(δ1(c)) < k*ξ2(rc1c2) depends on the potential rule activations to evaluate due to the other conditions involved in the rule. Considering the worst case scenario, i.e. k = 0.5, the application of the procedure O2 implies a decrease of the factor ξ2 as soon as the number of actual valid rule activations for the rule rc1c2, i.e. card(δ1(c)), becomes lower than half of all the potential rule activations to inspect for the rule rc1c2. More formally, a sufficient condition for effectively reducing ξ2 by applying the procedure O2 is the following:

if card(δ1(c)) < 0.5*ξ2(rc1c2) ⇒ ξ2(rc1c2 ∪ rr−c1c2) < ξ2(r)

Since card(δ1(c)) ≤ ξ2(rc1c2) is satisfied in general, the constraint card(δ1(c)) < k*ξ2(rc1c2) becomes less restrictive as the term k increases. In the best case scenario, i.e. k ⋍ 1 and card(δ1(c)) < k*ξ2(rc1c2) ⋍ ξ2(rc1c2), the procedure O2 is able to reduce Cactivations as soon as at least one potential activation of the rule rc1c2 is recognized as invalid. This situation often happens when production rules are built on top of ontology models and involve many conditions containing variable references, which make such conditions satisfiable by many facts in the WM. In such a scenario, since the factor ξ2(LHSr-{c1}-{c2}) is usually much greater than 1 when card(LHSr-{c1}-{c2}) > 0, the abovementioned constraint can be approximated as follows:

ξ2(rc1c2 ∪ rr−c1c2) < ξ2(r) ⇔ card(δ1(c)) < ξ2(rc1c2)

This constraint is often satisfied since, in such a scenario, the number of potential rule activations to inspect is usually much greater than the number of actual valid ones and, thus, the procedure O2 can be effectively used for decreasing the function ξ2 and, consequently, Cactivations. As an example, consider the following rule r1:

[r1: (?a rdf:type Patient),          pP1
     (?a affectedByIllness ?b),      pP2
     (?c rdf:type Drug),             pP3
     (?c indicatedForIllness ?b)     pP4
     -> (?a canAssumeDrug ?c)]

When, for instance, card(δ1(pP1)) = 10, card(δ1(pP2)) = 10, card(δ1(pP3)) = 100, and card(δ1(pP4)) = 100, the cost of the evaluation and identification of valid rule activations in the rule r1 can be estimated as ξ2(r1) = card(δ1(pP1)) * card(δ1(pP2)) * card(δ1(pP3)) * card(δ1(pP4)) = 10^6. The variable ?b constitutes a direct control variable for the rule r1 applied to pP2 and pP4, i.e. directControl(?b, r1, pP2, pP4) holds. Applying the procedure O2 to the rule r1 via the control variable directControl(?b, r1, pP2, pP4), the rule r1 can be accordingly split into the following rules r1a and r1b:

[r1a: (?a affectedByIllness ?b),     pP2
      (?c indicatedForIllness ?b)    pP4
      -> (?a affectedByIllness_indicatedForIllness ?c)]

[r1b: (?a rdf:type Patient),         pP1
      (?c rdf:type Drug),            pP3
      (?a affectedByIllness_indicatedForIllness ?c)    pP5
      -> (?a canAssumeDrug ?c)]

Now, denoting with n the number of valid activations for the rule r1a, card(δ1(pP5)) will be at most equal to n and, thus, the cost of the evaluation and identification of valid rule activations for the rules r1a and r1b can be estimated as ξ2(r1a∪r1b) = card(δ1(pP2))*card(δ1(pP4)) + card(δ1(pP1))*card(δ1(pP3))*n = 10^3 + n*10^3. Thus, the procedure O2 is able to effectively reduce ξ2 only if 10^3 + n*10^3 < 10^6 ⇔ n < 10^3 - 1. As discussed above, this condition is typically satisfied when rules contain many variable references, as in the considered example, and the lower the value of n, the greater the reduction obtained in ξ2. For instance, when just a portion of all available drugs is indicated for the specific illness affecting a patient, for instance 30% of the available ones, the number of valid activations for the rule r1a will be n = 0.3 * card(δ1(pP2)) * card(δ1(pP4)) = 300 and, thus, the condition n = 300 < 10^3 – 1 is satisfied.
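The arithmetic of this example, together with the bound k = 1 − 1/ξ2(LHSr-{c1}-{c2}), can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function names `xi2_unsplit`, `xi2_split` and `k_bound` are hypothetical, and only the cardinalities are taken from the example above.

```python
from fractions import Fraction


def xi2_unsplit(card_c1, card_c2, xi2_rest):
    """xi2(r) = card(d1(c1)) * card(d1(c2)) * xi2(LHS_r - {c1} - {c2})."""
    return card_c1 * card_c2 * xi2_rest


def xi2_split(card_c1, card_c2, xi2_rest, n_valid):
    """xi2(r_c1c2) + xi2(r_{r-c1c2}), with n_valid intermediate facts asserted."""
    return card_c1 * card_c2 + xi2_rest * n_valid


def k_bound(xi2_rest):
    """k = 1 - 1 / xi2(LHS_r - {c1} - {c2}), kept exact with Fraction."""
    return 1 - Fraction(1, xi2_rest)


# Cardinalities of the example: c1 = pP2 (10), c2 = pP4 (100),
# remaining conditions pP1 and pP3 give xi2_rest = 10 * 100.
card_c1, card_c2, xi2_rest = 10, 100, 10 * 100

cost_r1 = xi2_unsplit(card_c1, card_c2, xi2_rest)           # 10**6
cost_split = xi2_split(card_c1, card_c2, xi2_rest, 300)     # 10**3 + 300 * 10**3
threshold = k_bound(xi2_rest) * card_c1 * card_c2           # n must stay below this
```

Here `threshold` evaluates to 999, matching the condition n < 10^3 − 1: with n = 300 the split rules inspect 301,000 potential activations instead of 10^6, while at n = 999 the two costs coincide and the split brings no benefit.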
removed conditions can be managed in new ad-hoc rules where their intermediated results are generated, and the rule r1 can be accordingly revised in order to examine only valid results of the removed indirect relation. As a result, the following general optimization procedures can be defined. For each rule, the procedure O3 searches for triples of conditions in its LHS, which are indirectly related via an indirect control variable. When an indirect control variable is identified for a triple of conditions, these latter can be removed from the LHS of the rule without altering the consistency of the other conditions. In detail, the procedure O3 operates as follows:
3.2.2. Optimization procedure O3 This procedure exploits the indirect relation established by pairs of special variable references, here defined as indirect control variables, in the rules and generates further ad-hoc rules where the relation is processed and eventual intermediate results are asserted in the knowledge base. Indirect control variables are pairs of variable references contained in a logic comparator of LHS, and shared between the logic comparator and two positive pattern conditions of LHS, but are not present also in RHS. More formally, denoted with FCLC ={LE(lessEqual), GE(greaterEqual), Equal, greaterThan, lessThan} the set of logic comparators supported by the rule language, two variable references (y∈V, w∈V) identify an indirect control variable for the rule r applied to the conditions c1, c2 via the function condition c3, denoted with indirectControl(y, w, r, c1, c2, c3), when the following requirements are satisfied:
∀r ∈ R | (∃ p1, p2 ∈P ∧ [(∃ c1=(?x p1 ?y)∈LHSr ∧ c2=(?z p2 ?w)∈LHSr ) ∨ (∃ c1=(?y p1 ? x)∈LHSr ∧ c2=(?w p2 ?z)∈LHSr ) ∨ (∃ c1=(?x p1 ?y)∈LHSr ∧ c2=(?w p2 ?z)∈LHSr ) ∨ (∃ c1=(?y p1 ? x)∈LHSr ∧ c2=(?z p2 ?w)∈LHSr )] ∧ [(∃ c3=f(?y,?w)∈LHSr∩FCLC) ∨ (∃ c3=f(?w,?y)∈LHSr∩FCLC)] ∧ indirectControl(?y, ?w, r, c1, c2, c3) ∧ card{LHSr-{c1}-{c2}-{c3}} > 0 ∧ ξ2(LHSr-{c1}-{c2}-{c3}) > 1 ⇒ create (rc1c2c3| LHSc1c2c3 = {c1, c2, c3} ∧ RHSc1c2c3= (?x p1_p2_f ?z) ) ∧ create (rr−c1c2c3 | LHSr−c1c2c3 = (?x p1_p2_f ?z) ∪ LHSr – {c1, c2, c3} ∧ RHSr−c1c2c3= RHSr ∪ remove(?x p1_p2_f ?z) ) ∧ remove r
indirectControl(y, w, r, c1, c2, c3) ⇔ y∈V ∧ w∈V ∧ r=(LHSr, RHSr)∈R ∧ c1, c2 ∈ LHSr ∩ Pp ∧ y∈c1 ∧ w∈c2 ∧ c3=f(y,w) ∈ LHSr∩FCLC ∧ [ (y∈RHSr ∧ ∃c∈{LHSr-{c1}-{c2}-{c3}} | y∈c ∧ c∈Pp) ∨ (y∉RHSr ∧ ∄c4∈{LHSr-{c1}-{c2}-{c3}} | y∈c4 ∧ c4∉Pp) ∨ (y∉RHSr ∧ ∃c4∈{LHSr-{c1}-{c2}-{c3}} | y∈c4∧c4∉Pp ∧ ∃c5∈{LHSr-{c1}-{c2}-{c3}} | y∈c5 ∧ c5∈Pp) ] ∧ [ (w∈RHSr ∧ ∃c∈{LHSr-{c1}-{c2}-{c3}} | w∈c ∧ c∈Pp) ∨ (w∉RHSr ∧ ∄c4∈{LHSr-{c1}-{c2}-{c3}} | w∈c4 ∧ c4∉Pp) ∨ (w∉RHSr ∧ ∃c4∈{LHSr-{c1}-{c2}-{c3}} | w∈c4 ∧ c4∉Pp ∧ ∃c5∈{LHSr-{c1}-{c2}-{c3}} | w∈c5∧c5∈Pp) ].
As discussed for the rules generated by the procedure O2, in order to preserve the consistency of the knowledge base, i.e. the rule rr−c1c2c3 processes the intermediate results generated by the rule rc1c2c3, and, then, eliminates them from the knowledge base. Moreover, similarly to the procedure O2, the appliance of the procedure O3 implies a reduction of the cost of reasoning tests since it usually causes a reduction of ξ2 in semantic scenarios where productions rules are built on top of ontological models and contain many conditions characterized by distinct variable references. In more detail, since each rule considered is split into two new rules, each of them exploiting the relation established by the indirect control variables of the rule, the function ξ2 can be calculated as follows:
As an example, consider the following rule r1:
equal(? (?a related (?c (?c rdf: (?a [r1: (?a rdf: indCaloric b, ?d) - Indication type caloric type ?c)] > Patient), Need Indication), Need ?d), ?b), pP1 pP2 pP3 pP4 fc1
ξ2(r ) = ξ2(rc1c2c3)*ξ2(LHSr-{c1}-{c2}-{c3}) ξ2(rc1c2c3) = card(δ1(c1))*card(δ1(c2)) ξ2(rr−c1c2c3) = ξ2(LHSr-{c1}-{c2}-{c3})*card(δ1(c)) with c = RHSc1c2c3 ξ2(rc1c2c3∪rr−c1c2c3) = ξ2(rc1c2c3)+ξ2(rr−c1c2c3) < ξ2(r )
The two conditions pP2 and pP4 do not share any variable references and thus, it seems that the potential rule activations generated by each condition do not directly influence the ones generated by the other conditions. However, the variable references ? b and ?d are logically compared via the function fc1=equal(?b, ?d), which is satisfied only when the value assumed by ?b is equal to the value assumed by ?d, and, thus, the function fc1 establishes an indirect relation between the two conditions pP2 and pP4. Now, since ?b and ?d are not used anymore in the rule rr1 and, in particular, they are not used to assert new knowledge in its RHS, the indirectControl(?b, ?d, r1, pP2, pP4, fc1) holds, and the pair (?b, ?d) filters all available combinations of WMEs by selecting only the ones generating a coherent variable assignments for the pair (?b, ?d). Thus, indirect control variables have the characteristic of being able to filter among the potential rule activations to inspect. For this reason, the function fc1, and the two conditions pP1 and pP2 can be removed from the LHS of the rule r1 without altering the consistency of the other conditions, the
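$$
\begin{aligned}
&\Leftrightarrow\ \xi_2(r_{c_1c_2c_3}) + \xi_2(LHS_r - \{c_1\} - \{c_2\} - \{c_3\}) \cdot \mathrm{card}(\delta_1(c)) < \xi_2(r_{c_1c_2c_3}) \cdot \xi_2(LHS_r - \{c_1\} - \{c_2\} - \{c_3\}) \\
&\Leftrightarrow\ \xi_2(LHS_r - \{c_1\} - \{c_2\} - \{c_3\}) \cdot \mathrm{card}(\delta_1(c)) < \xi_2(r_{c_1c_2c_3}) \cdot \left[\xi_2(LHS_r - \{c_1\} - \{c_2\} - \{c_3\}) - 1\right] \\
&\Leftrightarrow\ \mathrm{card}(\delta_1(c)) < \xi_2(r_{c_1c_2c_3}) \cdot \left(1 - \frac{1}{\xi_2(LHS_r - \{c_1\} - \{c_2\} - \{c_3\})}\right) \\
&\Leftrightarrow\ \mathrm{card}(\delta_1(c)) < k \cdot \xi_2(r_{c_1c_2c_3}) \quad \text{with } k = 1 - \frac{1}{\xi_2(LHS_r - \{c_1\} - \{c_2\} - \{c_3\})}
\end{aligned}
$$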
where 0.5 ≤ k < 1 since ξ2(LHSr-{c1}-{c2}-{c3}) > 1, and k→1 when ξ2(LHSr-{c1}-{c2}-{c3}) >> 1. Similarly to the procedure O2, a sufficient condition for the application of the procedure O3 to effectively reduce ξ2 is the following:
if card(δ1(c)) < k*ξ2(rc1c2c3) ⇒ ξ2(rc1c2c3 ∪ rr−c1c2c3) < ξ2(r) with k = 0.5
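The bound above can be checked numerically. The following Python sketch only restates the inequalities of this derivation; the function names (`k_factor`, `o3_reduces_cost`, `sufficient_condition`) and the argument names are hypothetical and introduced here for illustration.

```python
def k_factor(xi2_rest: int) -> float:
    """k = 1 - 1/xi2(LHS_r - {c1} - {c2} - {c3}); the derivation assumes xi2 > 1."""
    if xi2_rest <= 1:
        raise ValueError("xi2 of the remaining LHS must be greater than 1")
    return 1.0 - 1.0 / xi2_rest


def o3_reduces_cost(card_c: int, xi2_split: int, xi2_rest: int) -> bool:
    """Exact test: xi2(r_c1c2c3) + xi2(rest) * card(delta1(c)) < xi2(r_c1c2c3) * xi2(rest)."""
    return xi2_split + xi2_rest * card_c < xi2_split * xi2_rest


def sufficient_condition(card_c: int, xi2_split: int) -> bool:
    """Worst-case bound with k = 0.5: card(delta1(c)) < 0.5 * xi2(r_c1c2c3)."""
    return 2 * card_c < xi2_split
```

Since k never drops below 0.5 when the remaining LHS has ξ2 > 1, any case accepted by `sufficient_condition` is also accepted by `o3_reduces_cost`; the converse does not hold in general.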
Since card(δ1(c)) ≤ ξ2(rc1c2c3) is satisfied in general, the above constraint becomes less restrictive as the term k tends to 1. In the best-case scenario, i.e. k ⋍ 1 and card(δ1(c)) < k*ξ2(rc1c2c3) ⋍ ξ2(rc1c2c3), the procedure O3 is able to reduce Cactivations as soon as at least one potential activation of the rule rc1c2c3 is recognized as invalid. This situation often happens when production rules are built on top of ontology models and involve many conditions whose variable references allow them to be satisfied by many facts in the WM. In such a scenario, since the factor ξ2(LHSr-{c1}-{c2}-{c3}) is usually much greater than 1 when card(LHSr-{c1}-{c2}-{c3}) > 0, the constraint card(δ1(c)) < k*ξ2(rc1c2c3) can be approximated as follows:
ate instantiations decreases as well.
3.3.1. Optimization procedure O4
This procedure is based on the idea that structural properties characterizing the positive pattern conditions of rules can be used for determining the most restrictive conditions, i.e. the ones likely to be satisfied, at runtime, by the fewest WMEs and, thus, the ones able to reduce the time required for recognizing invalid activations. As a consequence, as a first step, the procedure O4 classifies the positive pattern conditions of rules as restrictive or volatile conditions. Then, as a second step, the procedure O4 orders the conditions in the LHSs of the rules according to the determined classification. First of all, as a classification index, this procedure introduces the restrictive factor (RF), with the aim of estimating the level of restriction applied by conditions to their satisfying WMEs. More formally, given a positive pattern condition c, and an element x∈{U ⋃ V} of c, the element x is a structure-based restriction for c, hereafter denoted with sbRestriction(x, c), when the following requirements are satisfied:
ξ2(rc1c2c3 ∪ rr−c1c2c3) < ξ2(r) ⇔ card(δ1(c)) < ξ2(rc1c2c3)
This constraint is often satisfied since, in such a scenario, the number of potential rule activations to inspect is usually much greater than the number of valid ones; thus, the procedure O3 can be effectively used to reduce the function ξ2 and, consequently, Cactivations. As an example, consider the rule r1 reported above with, for instance, card(δ1(pP1)) = 10, card(δ1(pP2)) = 10, card(δ1(pP3)) = 200, and card(δ1(pP4)) = 200. The cost of the evaluation and identification of valid rule activations in the rule r1 can then be estimated as ξ2(r1) = card(δ1(pP1)) * card(δ1(pP2)) * card(δ1(pP3)) * card(δ1(pP4)) = 4*10^6. The pair of variable references ?b and ?d constitutes an indirect control variable via the function fc1 for the rule r1 applied to pP2 and pP4, i.e. indirectControl(?b, ?d, r1, pP2, pP4, fc1) holds. Applying the procedure O3 to the rule r1 via this control variable, the rule r1 can be split into the following rules r1a and r1b:
[r1a: (?a caloricNeed ?b),                    pP2
      (?c indCaloricNeed ?d),                 pP4
      equal(?b, ?d) ->                        fc1
      (?a caloricNeed_indCaloricNeed ?c)]
[r1b: (?a rdf:type Patient),                  pP1
      (?c rdf:type Indication),               pP3
      (?a caloricNeed_indCaloricNeed ?c) ->   pP5
      (?a relatedIndication ?c)]
sbRestriction(x, c) ⇔ c=(stp,ptp,otp) ∈ Pp | x∈c ∧ [ x∈U ∨ (x∈V ∧ ( x=stp=ptp ∨ x=ptp=otp ∨ x=stp=otp ))]
In other words, an element x of a positive pattern condition c is able to restrict the set of WMEs satisfying c only if x is a constant value, or x is a variable reference appearing more than once in c. Indeed, as described in the previous sections, the most restrictive condition that can be defined uses no variable reference, i.e. stp,ptp,otp ∈ U, since card(δ1(stp,ptp,otp))≤1. On the contrary, the most volatile condition, i.e. the condition with the greatest number of satisfying elements and the highest probability of being updated, is defined by using only distinct variable references, i.e. stp∈V, ptp∈{V- stp}, otp∈{V- stp, ptp}, and, thus, δ1(stp,ptp,otp)= WM. As a consequence, while each constant value contained in a pP condition surely establishes a restriction on the WMEs matching that condition, a single variable reference does not establish any restriction, since any data is able to satisfy it. However, when a variable reference is used more than once in the same condition, it is able to restrict the WMEs matching the whole condition. For instance, if a pP condition contains a variable reference ?x twice, even if the first occurrence of ?x could potentially be satisfied by any data in the WM, the second occurrence of ?x establishes a restriction on the WMEs satisfying pP, since they have to match both occurrences of ?x simultaneously. As an example, if c=(?x,pc,?x), then δ1(c) ={w∈WM | w=(s,p,o) ∧ (s=o) ∧ (p=pc) }. Generalizing such considerations, the number nR(c) of structure-based restrictions contained inside a condition c can be formally defined in terms of its structural properties as follows:
Now, denoting with n the number of valid activations of the rule r1a, card(δ1(pP5)) is at most equal to n and, thus, the cost of the evaluation and identification of valid rule activations for the rules r1a and r1b can be estimated as ξ2(r1a∪r1b) = card(δ1(pP2))*card(δ1(pP4)) + card(δ1(pP1))*card(δ1(pP3))*n = 2*10^3 + 2n*10^3. Thus, the procedure O3 effectively reduces ξ2 only if 2*10^3 + 2n*10^3 < 4*10^6 ⇔ n < 2*10^3 − 1. As discussed above, this constraint is typically satisfied when rules contain many variable references, as in the considered example, and the reduction obtained in ξ2 grows as the value of n decreases. For instance, when just a portion of all available indications is formulated for the same caloric need of the considered patient, for instance 20% of the available ones, the number of valid activations of the rule r1a is n = 0.2 * card(δ1(pP2)) * card(δ1(pP4)) = 400 and, thus, n = 400 < 2*10^3 − 1 is satisfied.
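The arithmetic of this example can be reproduced with a few lines of Python. This is a minimal sketch using only the cardinalities stated in the text; the variable and function names (`card`, `xi2_split`) are illustrative.

```python
from math import prod

# Cardinalities assumed in the example for the conditions of r1.
card = {"pP1": 10, "pP2": 10, "pP3": 200, "pP4": 200}

# Cost of the original rule r1: product over all its pattern conditions.
xi2_r1 = prod(card.values())


def xi2_split(n: int) -> int:
    """Cost of r1a plus r1b when r1a has n valid activations (card(delta1(pP5)) = n)."""
    return card["pP2"] * card["pP4"] + card["pP1"] * card["pP3"] * n


# 20% of the pP2 x pP4 combinations are valid activations of r1a.
n = card["pP2"] * card["pP4"] * 20 // 100

print(xi2_r1, n, xi2_split(n))
```

With these values the split rules cost 802000 potential activations against 4000000 for r1, and the split stops paying off only once n reaches 1999.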
$$
\forall c \in P_p, \quad n_R(c) \triangleq \mathrm{card}(\{x \in c \mid x \in U\}) + \left[\delta_3(c) - \mathrm{card}(\{x \in c \mid x \in V\})\right]
$$
where the term card({x∈c | x∈U}) represents the number of constant values contained in a condition c, and the term [δ3(c)-card({x∈c | x∈V})] represents the number of repeated variable references in c. As a result, given a positive pattern condition c, the RF of the condition c can be defined as follows:
3.3. Ordering rule conditions
The last optimization procedure proposed here mainly analyzes structural properties of rule conditions in order to determine the ones able to reduce the time required for recognizing invalid activations. For instance, when some restrictive conditions are evaluated before others, the amount of intermediate data involved in the inspection of the first conditions is reduced and, thus, the size of the successive intermedi-
$$
\forall c \in P_p, \quad RF(c) \triangleq \frac{n_R(c)}{n_R(c) + \delta_3(c)}
$$
The restrictive factor defined here provides a structure-based indicator of the level of restriction applied by a positive pattern condition to its satisfying WMEs. Thus, positive pattern conditions can be classified according to their RF, by evaluating their structural properties. Starting from the proposed structure-based classification, the
Fig. 2. Restrictive factor of pP conditions according to their structural properties.
domain experts during the project. Such recommendations have been formulated to describe the portion of daily food that is allowed to meet the average daily caloric need, detect abnormal food portions with respect to the specific user necessities, and, contextually, generate a dietary alert as soon as these anomalies are recognized. Such recommendations have been codified in terms of production rules and loaded on the mobile device supplied to each user. In order to interpret and execute such rules, the mHealth app has been equipped with a mobile rule-based system, which has been designed and realized as presented in (Minutolo et al., 2015). In order to formally structure the recommendations as rules, an ontology has been designed, including all the main concepts, relations and attributes pertaining to the considered scenario, as shown in Fig. 3. In detail, the concept UserState includes the information pertaining to the current state of the user, in terms of his/her daily caloric need, the gathered measures, the collection of indications for evaluating the user's habits, and the eventual collection of alerts generated as a response to an improper habit detected. Measure models a generic observation about the current user state, and it is characterized by its time of acquisition. FoodDiary, as a specialization of Measure, describes the gathered observations about the daily portions of consumed food, which are modelled via the concept Portion. Each portion describes the quantity consumed daily for a given aliment, modelled via the concept Aliment. Indication models a generic recommendation about the user's habit. In particular, FoodIndication describes a diet recommendation about the daily portion of a given aliment for a specific caloric need. AllowedPortion models a range of food amount that is admitted for consumption. It is worth noting that each instance of FoodIndication has to be related to an instance of AllowedPortion.
Finally, Alert describes an abnormal situation that has to be notified to the user, and that influences the current user state. In particular, FoodAlert models an alert generated when a dietary recommendation is applied. As a consequence, each instance of FoodAlert is generated by composing the food portion violating a given recommendation, the recommendation itself, and the portion describing the admitted amount of food that has been violated. On top of such an ontology, three production rules have been formulated with the goal of evaluating the current user state and the gathered food diaries, so as to determine whether existing dietary recommendations are focused on the caloric need of the user, and whether they act on the aliments consumed by the user. In detail, the formulated rules have been coded according to the Jena rule language, as reported in the following:
procedure O4 changes the evaluation order of pattern conditions in LHSs of the rules as reported below.
$$
\begin{aligned}
&\forall r = (LHS_r, RHS_r) \in R \mid \exists c \in LHS_r \cap P_p \ \wedge\ LHS_r = \{c_1, c_2, \ldots, c_n\} \ \text{with } n = \mathrm{card}(LHS_r) \ \Rightarrow \\
&\quad \forall c \in LHS_r \cap P_p, \quad RF(c) \triangleq \frac{n_R(c)}{n_R(c) + \delta_3(c)} \\
&\quad \mathrm{create}\,\big(r_{RF} \mid LHS_{RF} = \{c'_1, c'_2, \ldots, c'_n\} \ \wedge\ c'_i \in LHS_r \ \wedge\ (RF(c'_i) > RF(c'_{i+1}))\ \forall (c'_i, c'_{i+1}) \in P_p \ \wedge\ i = 1, 2, \ldots, n-1 \ \wedge\ RHS_{RF} = RHS_r\big) \ \wedge\ \mathrm{remove}\ r
\end{aligned}
$$
Roughly speaking, for each rule r ∈ R, the procedure O4 reorders the conditions c ∈ LHSr ∩ Pp, by classifying the considered conditions according to the computed RF(c), so as to inspect the most restrictive ones first, i.e. the conditions with highest RF and, thus, reduce the intermediate data involved in the composition of first partial rule matches. As an example, Fig. 2 reports the restrictive factors computed for some positive pattern conditions. As a further example, consider the following rule r:
[r: (?a rdf:type Patient),           pP1
    (?a affectedByIllness ?b),       pP2
    (?c rdf:type Drug),              pP3
    (?c indicatedForIllness ?b) ->   pP4
    (?a canAssumeDrug ?c)]
The restrictive factor of the conditions pP1 and pP3 is equal to 0.67, since they contain two constant values and only one variable reference, i.e. RF(pP1)=RF(pP3)=0.67. On the contrary, the restrictive factor of the conditions pP2 and pP4 is equal to 0.33, since they contain only one constant value and two variable references that are not repeated in the condition and, thus, do not establish any restriction on the satisfying WMEs, i.e. RF(pP2)=RF(pP4)=0.33. As a consequence, the application of the procedure O4 to the rule r produces a restructuring of the rule base as follows:
[rRF: (?a rdf:type Patient), (?c rdf:type Drug), (?a affectedByIllness ?b), (?c indicatedForIllness ?b) -> (?a canAssumeDrug ?c)]
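The computation of RF and the resulting reordering can be sketched in Python as follows. This is an illustrative sketch, not the authors' implementation: it assumes that a term starting with `?` is a variable reference, that δ3(c) counts the variable-reference occurrences in c (an interpretation consistent with the RF values 0.67 and 0.33 reported for this example), and that ties in RF keep their original relative order. The names `delta3`, `n_r`, `rf` and `reorder` are hypothetical.

```python
from typing import List, Tuple

Pattern = Tuple[str, str, str]  # (subject, predicate, object)


def is_var(term: str) -> bool:
    """Assumption: variable references are written with a leading '?'."""
    return term.startswith("?")


def delta3(c: Pattern) -> int:
    """Assumed meaning of delta3(c): number of variable-reference occurrences in c."""
    return sum(1 for t in c if is_var(t))


def n_r(c: Pattern) -> int:
    """nR(c): constants plus repeated variable references."""
    constants = sum(1 for t in c if not is_var(t))
    distinct_vars = len({t for t in c if is_var(t)})
    return constants + (delta3(c) - distinct_vars)


def rf(c: Pattern) -> float:
    """Restrictive factor RF(c) = nR(c) / (nR(c) + delta3(c))."""
    return n_r(c) / (n_r(c) + delta3(c))


def reorder(lhs: List[Pattern]) -> List[Pattern]:
    """Order conditions by decreasing RF; sorted() is stable, so ties keep their order."""
    return sorted(lhs, key=rf, reverse=True)


lhs_r = [
    ("?a", "rdf:type", "Patient"),          # pP1
    ("?a", "affectedByIllness", "?b"),      # pP2
    ("?c", "rdf:type", "Drug"),             # pP3
    ("?c", "indicatedForIllness", "?b"),    # pP4
]
for c in reorder(lhs_r):
    print(c, round(rf(c), 2))
```

On the rule r above, the sketch yields the order pP1, pP3, pP2, pP4, which matches the LHS of rRF.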
[r1: (?y rdf:type base:UserState),
     (?y base:userCaloricNeed ?w),
     (?ind rdf:type base:FoodIndication),
     (?ind base:referenceCaloricNeed ?fabb),
     equal(?w,?fabb)
     -> (?y base:impactedByIndication ?ind)]
[r2: (?y rdf:type base:UserState),
     (?y base:impactedByIndication ?a2),
     (?a2 rdf:type base:FoodIndication),
     (?a2 base:referencePortion ?pa),
     (?pa rdf:type base:AllowedPortion),
     (?y base:containsMeasure ?x),
     (?x rdf:type base:FoodDiary),
     (?x base:consumedPortion ?z),
     (?pa base:referenceFood ?paAlim),
4. Case of study
In this section, the optimization procedures, presented and validated theoretically above, are applied to a case study pertaining to an mHealth app developed in the context of the Italian project “Smart Health 2.0”, in order to demonstrate how they can reduce the cost of evaluation of a knowledge base in a real scenario. This mobile app is devised to evaluate the eating habits of users in order to keep their lifestyles under control and, thus, preserve their wellness. It relies on a set of dietary recommendations, elicited in cooperation with a group of
that are able to satisfy it. However, even if just a few tens of facts satisfy a single condition, the cost Cactivations(r1∪ r2∪ r3) for the evaluation and identification of valid rule activations rapidly grows for the whole rule base, since many variable references are involved and connected in each rule. As an example, the following initial configuration for the knowledge base is considered. First, 31 food indications have been formulated, each one applicable for one or more daily caloric needs, from 1600 to 2600 cal. Each food indication declares a corresponding portion of allowed food, expressed as a pair of lower and higher food amounts, in grams, admitted for the associated aliment. Moreover, the user state has been modelled by specifying a caloric need equal to 2600 cal, and associating 2 daily portions consumed for 2 distinct aliments. Among the 31 modelled food indications, 19 are applicable for the same caloric need of the considered user. As a consequence, the starting knowledge base is composed of 434 triple assertions, i.e. card(WM)=434, and, in detail, the number of facts matching the rules' conditions is reported in Table 2. Considering this initial configuration of the knowledge base, the cost of the evaluation of the active status for the whole rule base can be estimated as Cactive(r1∪r2∪r3) =12152. With respect to the computation
     (?z base:consumedFood ?paAlim),
     (?z base:amount ?zAmount),
     (?pa base:lowerAllowedAmount ?paLowerAmount),
     lessThan(?zAmount,?paLowerAmount)
     -> makeInstance(?y, base:containsAlert, base:LowDietAlert, ?newAl),
        (?newAl base:involvedMeasure ?x),
        (?newAl base:involvedPortion ?z),
        (?newAl base:involvedIndication ?a2),
        (?newAl base:involvedAllowedPortion ?pa)]
[r3: (?y rdf:type base:UserState),
     (?y base:impactedByIndication ?a2),
     (?a2 rdf:type base:FoodIndication),
     (?a2 base:referencePortion ?pa),
     (?pa rdf:type base:AllowedPortion),
     (?y base:containsMeasure ?x),
     (?x rdf:type base:FoodDiary),
     (?x base:consumedPortion ?z),
     (?pa base:referenceFood ?paAlim),
     (?z base:consumedFood ?paAlim),
     (?z base:amount ?zAmount),
     (?pa base:higherAllowedAmount ?paHigherAmount),
     greaterThan(?zAmount,?paHigherAmount)
     -> makeInstance(?y, base:containsAlert, base:OvereatingAlert, ?newAl),
        (?newAl base:involvedMeasure ?x),
        (?newAl base:involvedPortion ?z),
        (?newAl base:involvedIndication ?a2),
        (?newAl base:involvedAllowedPortion ?pa)]
Table 2 The amount of initial facts satisfying the rules formulated for the considered case of study.
The rule r1 aims at adding to the user state all the existing recommendations, if any, matching the specific caloric need of the user. The rules r2 and r3, instead, are in charge of evaluating the portions of food consumed by comparing them with the quantities admitted daily, as stated in the corresponding recommendations. If a daily portion violates the recommended amount, a corresponding FoodAlert is generated by relating to it all the information characterizing the recognized abnormal consumption. As shown in Fig. 3, the ontology has been conceived in order to initially include specific properties whose domain (range) is restricted to individuals of specialized classes, such as the property involvedIndication, whose domain and range classes are FoodAlert and FoodIndication, respectively. Moreover, all the rules built on top of this ontology operate directly on these specialized properties, so they are natively codified coherently with the criteria the optimization procedure O1 is based on. The use of conditions with many variable references in the same rule increases the interpretability of the rule base since, on the one hand, a small number of rules is formulated and, on the other hand, each rule completely describes the different properties characterizing data
| Cost Function | Function Argument | Function Value |
|---|---|---|
| δ1(c) | (?y rdf:type base:UserState) | 1 |
| δ1(c) | (?y base:userCaloricNeed ?w) | 1 |
| δ1(c) | (?ind rdf:type base:FoodIndication) | 31 |
| δ1(c) | (?ind base:referenceCaloricNeed ?fabb) | 57 |
| δ1(c) | (?y base:impactedByIndication ?a2) | 0 |
| δ1(c) | (?a2 base:referencePortion ?pa) | 31 |
| δ1(c) | (?pa rdf:type base:AllowedPortion) | 31 |
| δ1(c) | (?y base:containsMeasure ?x) | 1 |
| δ1(c) | (?x rdf:type base:FoodDiary) | 1 |
| δ1(c) | (?x base:consumedPortion ?z) | 2 |
| δ1(c) | (?pa base:referenceFood ?paAlim) | 31 |
| δ1(c) | (?z base:consumedFood ?paAlim) | 2 |
| δ1(c) | (?z base:amount ?zAmount) | 2 |
| δ1(c) | (?pa base:lowerAllowedAmount ?paLowerAmount) | 31 |
| δ1(c) | (?pa base:higherAllowedAmount ?paHigherAmount) | 31 |
Fig. 3. The ontology devised to model the domain knowledge pertaining the scenario considered.
of Cactivations(r1∪r2∪r3), it is possible to note that only the rule r1 is initially active, since the other ones become active when r1 is fired and some food indications are, correspondingly, associated to the user state. Thus, while initially Cactivations(r2∪r3) is almost negligible, it rapidly grows when the r1 activations are executed. Denoting with k the number of facts able to satisfy the condition (?y base:impactedByIndication ?a2) after the execution of the rule r1, the impact of the factor ξ2(r) rapidly becomes preponderant, as reported in Table 3. In more detail, k is initially equal to 0, but it becomes equal to 19 as soon as all the r1 activations are executed, since there are 19 food indications having the same caloric need of the considered user state. As a consequence, the final cost becomes Cactivations(r1∪r2∪r3) = 113126*10^6. All the terms involved in calculating this cost are reported in Table 3. It is possible to note that Cactivations(r1∪r2∪r3) is much greater than Cactive(r1∪r2∪r3), since the term ξ2(r) grows exponentially with the number of facts matching the condition elements of the rules. As discussed in previous sections, this aspect is particularly true when rules' conditions contain many variable references. Since r2 and r3 contain several high-level conditions suggesting the proper food indication with many variable references linked among them, such conditions are satisfied by a number of facts equal to all the existing food indications. In order to reduce the impact of ξ2(r), a first optimization can be achieved by analyzing the formulated rules with the goal of reducing the number of variable references used in each rule. In this respect, the proposed optimization procedure O2 has been used in order to recognize the existence of control variables in the rules, and extract them so as to delegate their evaluation to new ad-hoc rules.
In detail, r2 and r3 contain ?paAlim that constitutes a direct control variable between the positive conditions (?pa base:referenceFood ?paAlim) and (?z base:consumedFood ?paAlim). Thus, the procedure O2 can be applied by introducing the following rule:
zAmount) and (?pa base:lowerAllowedAmount ?paLowerAmount), via the function lessThan(?zAmount,?paLowerAmount). In the same way, the pair (?zAmount,?paHigherAmount) constitutes an indirect control variable for the rule r3 applied to the positive conditions (?z base:amount ?zAmount) and (?pa base:higherAllowedAmount ?paHigherAmount), via the function greaterThan(?zAmount,?paHigherAmount). Thus, applying the proposed optimizations O2 and O3, the rule base of the case of study can be revised as reported in the following:
[r1a: (?y base:userCaloricNeed ?w),
      (?ind base:referenceCaloricNeed ?fabb),
      equal(?w,?fabb)
      -> (?y base:userCaloricNeed_referenceCaloricNeed_equal ?ind)]
[r1b: (?y rdf:type base:UserState),
      (?y base:userCaloricNeed_referenceCaloricNeed_equal ?ind),
      (?ind rdf:type base:FoodIndication)
      -> (?y base:impactedByIndication ?ind)]
[r2a: (?pa base:referenceFood ?paAlim),
      (?z base:consumedFood ?paAlim)
      -> (?pa, base:referenceFood_consumedFood ?z)]
[r2b: (?z base:amount ?zAmount),
      (?pa base:lowerAllowedAmount ?paLowerAmount),
      lessThan(?zAmount,?paLowerAmount)
      -> (?z, base:amount_lowerAllowedAmount_lessThan ?pa)]
[r2c: (?y rdf:type base:UserState),
      (?y base:impactedByIndication ?a2),
      (?a2 rdf:type base:FoodIndication),
      (?a2 base:referencePortion ?pa),
      (?pa rdf:type base:AllowedPortion),
      (?y base:containsMeasure ?x),
      (?x rdf:type base:FoodDiary),
      (?x base:consumedPortion ?z),
      (?pa, base:referenceFood_consumedFood ?z),
      (?z, base:amount_lowerAllowedAmount_lessThan ?pa)
      -> makeInstance(?y, base:containsAlert, base:LowDietAlert, ?newAl),
         (?newAl base:involvedMeasure ?x),
         (?newAl base:involvedPortion ?z),
         (?newAl base:involvedIndication ?a2),
         (?newAl base:involvedAllowedPortion ?pa)]
[r3a: (?z base:amount ?zAmount),
      (?pa base:higherAllowedAmount ?paHigherAmount),
      greaterThan(?zAmount,?paHigherAmount)
      -> (?z, base:amount_higherAllowedAmount_greaterThan ?pa)]
[r3b: (?y rdf:type base:UserState),
      (?y base:impactedByIndication ?a2),
      (?a2 rdf:type base:FoodIndication),
      (?a2 base:referencePortion ?pa),
      (?pa rdf:type base:AllowedPortion),
      (?y base:containsMeasure ?x),
      (?x rdf:type base:FoodDiary),
      (?x base:consumedPortion ?z),
      (?pa, base:referenceFood_consumedFood ?z),
      (?z, base:amount_higherAllowedAmount_greaterThan ?pa)
      -> makeInstance(?y, base:containsAlert, base:OvereatingAlert, ?newAl),
         (?newAl base:involvedMeasure ?x),
         (?newAl base:involvedPortion ?z),
         (?newAl base:involvedIndication ?a2),
         (?newAl base:involvedAllowedPortion ?pa)]
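As a rough illustration of how an ad-hoc rule such as r2a materializes the relation established by a direct control variable, the following Python sketch evaluates r2a over a toy working memory. It is not the rule engine used in the paper: the function name `apply_r2a`, the `ex:` individuals and the index-based join are assumptions made only for this example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


def apply_r2a(wm: Set[Triple]) -> Set[Triple]:
    """(?pa base:referenceFood ?f), (?z base:consumedFood ?f)
       -> (?pa base:referenceFood_consumedFood ?z)"""
    # Index the consumed portions by aliment, so each allowed portion is
    # only compared with portions sharing the control variable's value.
    consumed_by_food: Dict[str, List[str]] = defaultdict(list)
    for s, p, o in wm:
        if p == "base:consumedFood":
            consumed_by_food[o].append(s)

    derived: Set[Triple] = set()
    for s, p, o in wm:
        if p == "base:referenceFood":
            for z in consumed_by_food.get(o, []):
                derived.add((s, "base:referenceFood_consumedFood", z))
    return derived


# Hypothetical toy facts: two allowed portions, one consumed portion.
wm = {
    ("ex:ap1", "base:referenceFood", "ex:pasta"),
    ("ex:ap2", "base:referenceFood", "ex:apple"),
    ("ex:p1", "base:consumedFood", "ex:pasta"),
}
print(apply_r2a(wm))
```

Only the pair sharing the same aliment produces an intermediate fact, which is what rules r2c and r3b then match instead of the two original conditions.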
[r2a: (?pa base:referenceFood ?paAlim), (?z base:consumedFood ?paAlim) -> (?pa base:referenceFood_consumedFood ?z)]
It is worth highlighting that the application of the procedure O2 to the rule r3 generates the rule r2a again and, thus, it is not inserted into the rule base a second time. Moreover, two further optimizations can be performed by applying the procedure O3 to the rule base. In fact, on the one hand, with respect to the rule r1, the pair (?w,?fabb) constitutes an indirect control variable for the conditions (?y base:userCaloricNeed ?w) and (?ind base:referenceCaloricNeed ?fabb), via the function equal(?w,?fabb). On the other hand, the pair (?zAmount,?paLowerAmount) constitutes an indirect control variable for the rule r2 applied to (?z base:amount ?
Table 3 The terms involved in the computation of Cactivations(r1∪r2∪r3), for the considered case of study.
| Cost Function | Function Argument | Function Value |
|---|---|---|
| Cactive(r) | r1 | card(LHSr1∩P)*card(WM) = 1736 |
| Cactive(r) | r2 | 5208 |
| Cactive(r) | r3 | 5208 |
| ξ2(r) | r1 | 1717 |
| ξ2(r) | r2 | k*229*10^6 |
| ξ2(r) | r3 | k*229*10^6 |
| ξ3(r) | r1 | 2 |
| ξ3(r) | r2 | 12 |
| ξ3(r) | r3 | 12 |
| Cactivations(r) | r1 | 1736+1717*(1+2) |
| Cactivations(r) | r2 | 5208+(k*229*10^6)*(1+12) |
| Cactivations(r) | r3 | 5208+(k*229*10^6)*(1+12) |
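The figures of this table can be retraced with a short Python sketch. It assumes the composition Cactivations(r) = Cactive(r) + ξ2(r)*(1+ξ3(r)) that the last rows of the table exhibit, takes the per-condition fact counts from Table 2, and uses the terms of r1 exactly as reported; the names `c_activations` and `r2_counts` are illustrative.

```python
from math import prod

CARD_WM = 434  # triples in the starting knowledge base


def c_activations(c_active: int, xi2: int, xi3: int) -> int:
    """Composition shown in the table rows: Cactive + xi2 * (1 + xi3)."""
    return c_active + xi2 * (1 + xi3)


# Cactive(r) = number of positive pattern conditions * card(WM).
c_active_r1 = 4 * CARD_WM
c_active_r2 = 12 * CARD_WM  # r3 has the same number of conditions

# Facts matching each pattern condition of r2 (Table 2), leaving out
# (?y base:impactedByIndication ?a2), which is matched by k facts.
r2_counts = [1, 31, 31, 31, 1, 1, 2, 31, 2, 2, 31]
xi2_r2_per_k = prod(r2_counts)  # exact value; reported rounded as 229*10^6

k = 19
xi2_r2_rounded = k * 229 * 10**6  # rounded coefficient, as in the table

total = (
    c_activations(c_active_r1, 1717, 2)                  # r1, terms as reported
    + 2 * c_activations(c_active_r2, xi2_r2_rounded, 12)  # r2 and r3
)
print(xi2_r2_per_k, total)
```

With the rounded coefficient, the total is about 113126*10^6, matching the value stated for k = 19; the exact product of the counts gives a slightly larger figure.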
As a consequence of the application of the proposed optimization procedures, the new rules operate on ad-hoc conditions that are not initially satisfied by any fact in the WM. When such rules are executed, their consequents assert facts in the WM that are able to match such ad-hoc conditions, which are initially inactive (see Table 4) and, later, become satisfied by as many facts as there are activations executed. In order to evaluate the impact of the applied optimizations, the revised rule base can be analyzed with respect to the proposed cost model, as discussed in detail in the previous sections. In this respect, Table 4 reports the initial number of facts matching the rules' conditions of the revised rule base. In detail, denoting with x, y, z and w the number of facts which, after the execution of the ad-hoc rules, will be able to satisfy, respectively, the introduced ad-hoc positive pattern conditions (?y base:userCaloricNeed_referenceCaloricNeed_equal ?ind), (?pa,
the amount of allowed food that is not respected by the consumed portions (i.e. z=6), 44 of them express a higher bound for the amount of allowed food that is not respected by the consumed portions (i.e. w=44), and 19 of them present the same caloric need of the user (i.e. x=k=19). As a consequence, the final cost involved in the computation of the revised rule base can be estimated as Cactivations(r1a∪r1b∪r2a∪r2b∪r2c∪r3a∪r3b) = 2038*10^6. Thus, the application of the proposed optimization procedures to the initial rule base has been able to drastically reduce the cost of the evaluation of the rule activations. In fact, with respect to the proposed cost model, the cost associated to the evaluation of existing rule activations is reduced by almost two orders of magnitude compared to the cost calculated for the initial rule base. Finally, in order to analyze how this theoretical cost reduction can be effectively transposed to a real execution scenario, the mHealth app developed in the project “Smart Health 2.0” has been deployed on an Android smartphone and its embedded rule-based system has been configured with the knowledge base defined above. In detail, the Android smartphone chosen is reported as follows, together with its hardware and software features:
Table 4 The amount of initial facts satisfying the revised rule base.
| Cost Function | Function Argument | Function Value |
|---|---|---|
| δ1(c) | (?y base:userCaloricNeed ?w) | 1 |
| δ1(c) | (?ind base:referenceCaloricNeed ?fabb) | 57 |
| δ1(c) | (?y base:userCaloricNeed_referenceCaloricNeed_equal ?ind) | 0 |
| δ1(c) | (?pa base:referenceFood ?paAlim) | 31 |
| δ1(c) | (?z base:consumedFood ?paAlim) | 2 |
| δ1(c) | (?z base:amount ?zAmount) | 2 |
| δ1(c) | (?pa base:lowerAllowedAmount ?paLowerAmount) | 31 |
| δ1(c) | (?pa base:higherAllowedAmount ?paHigherAmount) | 31 |
| δ1(c) | (?y base:impactedByIndication ?a2) | 0 |
| δ1(c) | (?a2 base:referencePortion ?pa) | 31 |
| δ1(c) | (?y base:containsMeasure ?x) | 1 |
| δ1(c) | (?x base:consumedPortion ?z) | 2 |
| δ1(c) | (?pa, base:referenceFood_consumedFood ?z) | 0 |
| δ1(c) | (?z, base:amount_lowerAllowedAmount_lessThan ?pa) | 0 |
| δ1(c) | (?z, base:amount_higherAllowedAmount_greaterThan ?pa) | 0 |
■ Xiaomi Hongmi Redmi 1S, Android 4.3, 1 GB RAM, CPU QuadCore Snapdragon-400 1.6 GHz;
base:referenceFood_consumedFood ?z), (?z, base:amount_lowerAllowedAmount_lessThan ?pa), and (?z, base:amount_higherAllowedAmount_greaterThan ?pa), impacts and benefits of the applied optimizations are shown in Table 5, and they are greater the lower the number of activations of the newly introduced rules is. In detail, with respect to the considered example, there are 31 food indications in the WM: 3 of them act on the same aliments consumed by the user (i.e. y=3), 6 of them express a lower bound for
The rule-based system embedded in the mHealth app has been evaluated on this real smartphone, in terms of memory usage and overall response time, with respect to three distinct situations: the initial knowledge base with no optimization; the knowledge base optimized using the procedures O2 and O3; the knowledge base optimized using the procedure O4 with conditions ordered according to their restrictive factor. In this respect, Table 6 reports the restrictive factors of the conditions of the considered case study, which have been used for reordering the rule conditions. The upper area of Fig. 4 shows the system response time obtained with the different configurations of the considered knowledge base. It is worth noting that the response time is drastically reduced when the rules are revised by applying the proposed optimization procedures O2 and O3, which are able to reduce the impact of the ξ2(r) term in the search for and evaluation of available rule activations. Moreover, the application of the procedure O4, which orders the conditions according to their restrictive factor, further reduces the response time, but its benefits are less evident than those of the extraction of ad-hoc rules for handling intermediate knowledge. Furthermore, as shown in the lower area of Fig. 4, the application of the optimization procedures O2 and O3 causes an increment in the memory usage, due to the new ad-hoc rules and conditions inserted in the knowledge base. However, such an increment does not seem excessive when compared to the improvement achieved in the response time of the system. In this respect, it can be useful to evaluate, in the real scenario of application, whether this trade-off between an improvement in the overall response time and a possible worsening in the memory usage can be accepted or tolerated.
Table 5 The terms involved in the computation of Cactivations(r1a∪r1b∪r2a∪r2b∪r2c∪r3a∪r3b), for the revised rule base.
| Cost Function | Function Argument | Function Value |
|---|---|---|
| Cactive(r) | r1a | card(LHSr1a∩P)*card(WM) = 868 |
| Cactive(r) | r1b | 1302 |
| Cactive(r) | r2a | 868 |
| Cactive(r) | r2b | 868 |
| Cactive(r) | r2c | 4340 |
| Cactive(r) | r3a | 868 |
| Cactive(r) | r3b | 4340 |
| ξ2(r) | r1a | 57 |
| ξ2(r) | r1b | x*31 |
| ξ2(r) | r2a | 62 |
| ξ2(r) | r2b | 62 |
| ξ2(r) | r2c | k*59582*y*z |
| ξ2(r) | r3a | 62 |
| ξ2(r) | r3b | k*59582*y*w |
| ξ3(r) | r1a | 0 |
| ξ3(r) | r1b | 2 |
| ξ3(r) | r2a | 1 |
| ξ3(r) | r2b | 0 |
| ξ3(r) | r2c | 11 |
| ξ3(r) | r3a | 0 |
| ξ3(r) | r3b | 11 |
| Cactivations(r) | r1a | 868+57*(1+0) |
| Cactivations(r) | r1b | 868+(x*31)*(1+2) |
| Cactivations(r) | r2a | 868+62*(1+1) |
| Cactivations(r) | r2b | 868+62*(1+0) |
| Cactivations(r) | r2c | 2604+(k*59582*y*z)*(1+11) |
| Cactivations(r) | r3a | 868+62*(1+0) |
| Cactivations(r) | r3b | 2604+(k*59582*y*w)*(1+11) |
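The total cost of the revised rule base can be retraced in the same way. The sketch below evaluates the Cactivations(r) expressions exactly as they are printed in the last rows of the table, with the values k = x = 19, y = 3, z = 6 and w = 44 given in the text; the names `c_activations` and `rows` are illustrative.

```python
def c_activations(c_active: int, xi2: int, xi3: int) -> int:
    """Composition shown in the table rows: Cactive + xi2 * (1 + xi3)."""
    return c_active + xi2 * (1 + xi3)


k, x, y, z, w = 19, 19, 3, 6, 44

# (first term, xi2, xi3) per rule, copied from the Cactivations(r) rows as printed.
rows = {
    "r1a": (868, 57, 0),
    "r1b": (868, x * 31, 2),
    "r2a": (868, 62, 1),
    "r2b": (868, 62, 0),
    "r2c": (2604, k * 59582 * y * z, 11),
    "r3a": (868, 62, 0),
    "r3b": (2604, k * 59582 * y * w, 11),
}

total = sum(c_activations(*terms) for terms in rows.values())
ratio = 113126 * 10**6 / total  # initial cost over revised cost
print(total, round(ratio, 1))
```

The total is about 2038*10^6, dominated by r2c and r3b, and the initial cost of 113126*10^6 is roughly 55 times larger.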
5. Performance evaluation
The case study discussed in the previous section showed the adequacy of the proposed optimization procedures when applied to an mHealth scenario. In this section, in order to investigate the general applicability of the proposed approach, a further study has been arranged with the goal of evaluating the impact of different rule conditions on the cost of evaluation of a knowledge base, and the eventual benefits drawn by their optimization. To this aim, three factors, namely the type of conditions to optimize, the number of rules, and their valid activations, have been identified as relevant for being evaluated in order to assess the performance of the reasoning
Table 6 The restrictive factors of the pP conditions for the considered case of study.
| pP condition | nC | δ3(c) | nRV | nR = nC + nRV | RF(c) = nR / (nR + δ3(c)) |
|---|---|---|---|---|---|
| (?y rdf:type base:UserState) | 2 | 1 | 0 | 2 | 0.67 |
| (?ind rdf:type base:FoodIndication) | 2 | 1 | 0 | 2 | 0.67 |
| (?pa rdf:type base:AllowedPortion) | 2 | 1 | 0 | 2 | 0.67 |
| (?x rdf:type base:FoodDiary) | 2 | 1 | 0 | 2 | 0.67 |
| (?y base:userCaloricNeed ?w) | 1 | 2 | 0 | 1 | 0.33 |
| (?ind base:referenceCaloricNeed ?fabb) | 1 | 2 | 0 | 1 | 0.33 |
| (?y base:userCaloricNeed_referenceCaloricNeed_equal ?ind) | 1 | 2 | 0 | 1 | 0.33 |
| (?pa base:referenceFood ?paAlim) | 1 | 2 | 0 | 1 | 0.33 |
| (?z base:consumedFood ?paAlim) | 1 | 2 | 0 | 1 | 0.33 |
| (?z base:amount ?zAmount) | 1 | 2 | 0 | 1 | 0.33 |
| (?pa base:lowerAllowedAmount ?paLowerAmount) | 1 | 2 | 0 | 1 | 0.33 |
| (?y base:impactedByIndication ?a2) | 1 | 2 | 0 | 1 | 0.33 |
| (?a2 base:referencePortion ?pa) | 1 | 2 | 0 | 1 | 0.33 |
| (?y base:containsMeasure ?x) | 1 | 2 | 0 | 1 | 0.33 |
| (?x base:consumedPortion ?z) | 1 | 2 | 0 | 1 | 0.33 |
| (?pa, base:referenceFood_consumedFood ?z) | 1 | 2 | 0 | 1 | 0.33 |
| (?z, base:amount_lowerAllowedAmount_lessThan ?pa) | 1 | 2 | 0 | 1 | 0.33 |
| (?z, base:amount_higherAllowedAmount_greaterThan ?pa) | 1 | 2 | 0 | 1 | 0.33 |
Fig. 4. The comparison of memory usage and response time when reasoning on different configurations of the considered knowledge base.
system in the processing of the underlying knowledge base. This choice is motivated by the following considerations. Firstly, as discussed in depth in the previous section, the structure of the knowledge base and the kind of conditions composing the rules directly influence the amount of information to process when a knowledge base is evaluated to detect eligible rule activations, since the potential rule activations to inspect are determined by combining the individuals satisfying each single condition with the ones satisfying the other conditions of the same rule. In this respect, since conditions operating on high-level domain classes and/or direct control variables in the rules drastically affect the number of potential activations to inspect, the conditions targeted by the optimization also influence the performance of the reasoning system. Secondly, the number of rules to be evaluated affects the search for a single activation of each rule and, thus, the overall performance of the reasoning system. Finally, the number of valid rule activations also influences the cost of evaluating the knowledge base, since it can change the set of active rules to evaluate: the knowledge inferred by a rule is chained to other rules and may make them active or inactive. As discussed in the previous sections, other factors also influence the cost of evaluating a knowledge base, such as the number of conditions contained in the LHS of the rules and the number of individuals in the WM satisfying each single condition of the rules. In order to take these factors into account without drastically increasing the complexity of the study, the
Table 7. Experimental design factors and their levels.
Factor | Level 1 | Level 2 | Level 3
Optimization Target | High-level conditions | Direct control variables | Both
Rules | 10 | 20 | 30
Valid Activations | 2 | 6 | 10
experiments have been designed on the basis of a common rule structure, i.e. each rule is composed of six positive pattern conditions, two of which are high-level conditions, while another two contain a direct control variable reference for that rule. In accordance with this rule structure, an ontology model composed of about 2000 RDF triples has been arranged so that, on the one hand, each rule is active and has the same number of valid activations and, on the other hand, the number of individuals satisfying each single condition equals the chosen number of valid activations. According to this strategy, given a number x of valid activations contained in each rule, and considering that for each rule the term ξ3(r∈R) is equal to 6, the cost of evaluation of each rule can be determined as follows:
$$C_{activations}(r \in R) \propto C_{active}(r) + \xi_2(r)\,[1 + \xi_3(r)] = 6 \cdot 2000 + x^{6} \cdot 7 \;\xrightarrow{\;x > 4\;}\; x^{6} \cdot 7$$
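As a rough numerical illustration of this cost estimate, the following minimal sketch evaluates the two terms for the three levels of valid activations used in the experiments. It assumes that ξ2(r) is the number of combinations obtained from six conditions matched by x individuals each, i.e. x to the sixth power; the function name `activation_cost` and its parameter names are hypothetical and are not taken from the paper.

```python
def activation_cost(x, n_conditions=6, n_triples=2000, xi3=6):
    """Return the two terms of the estimated cost of one rule.

    Assumption: xi2(r) = x ** n_conditions, i.e. every condition of the
    rule is satisfied by x individuals and all combinations are inspected.
    """
    c_active = n_conditions * n_triples      # 6 * 2000 in the experiments
    xi2 = x ** n_conditions                  # potential activations to inspect
    c_combinations = xi2 * (1 + xi3)         # xi2(r) * [1 + xi3(r)]
    return c_active, c_combinations


if __name__ == "__main__":
    # Levels of the factor "valid activations" used in the experiments
    for x in (2, 6, 10):
        c_active, c_combinations = activation_cost(x)
        share = c_combinations / (c_active + c_combinations)
        print(f"x={x:2d}  C_active={c_active}  xi2*(1+xi3)={c_combinations}  "
              f"share of combinatorial term={share:.3f}")
```

Under these assumptions the constant term prevails only for very small x, while for larger x the combinatorial term accounts for almost the whole estimate, which is the approximation the formula expresses.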
Moreover, the ontology model used in the experiments has been
Table 8. The L9 orthogonal array used for testing the factors' effects and the observed values associated to the presented reasoning system.
L9 | Optimization Target | Rules | Valid Activations | Response time SN | Response time Mean (ms) | Memory usage SN | Memory usage Mean (KB)
T1 | high-level conditions | 10 | 2 | −50,99 | 354,33 | −27,84 | 24,67
T2 | high-level conditions | 20 | 6 | −68,53 | 2671,33 | −41,66 | 121,00
T3 | high-level conditions | 30 | 10 | −95,98 | 62978,00 | −49,40 | 295,00
T4 | direct control variables | 10 | 6 | −58,55 | 846,00 | −40,90 | 101,00
T5 | direct control variables | 20 | 10 | −79,65 | 9604,67 | −49,60 | 302,00
T6 | direct control variables | 30 | 2 | −53,50 | 473,00 | −40,51 | 106,00
T7 | both | 10 | 10 | −68,79 | 2751,33 | −44,47 | 167,33
T8 | both | 20 | 2 | −53,51 | 473,00 | −36,35 | 65,67
T9 | both | 10 | 6 | −56,53 | 670,67 | −40,40 | 104,67
defined so that, on the one hand, the high-level class used in the high-level conditions is specialized into two subclasses and, on the other hand, fifty percent of the individuals belonging to the high-level class also belong to one of its subclasses, while the remaining fifty percent belong to the other subclass. Starting from these rule and ontology model structures, two metrics have been considered to assess the performance of the reasoning system in the evaluation of the knowledge base, namely response time and memory usage, since the reasoning system is required to provide answers within a prescribed time and without exhausting the available runtime memory. These two metrics have been evaluated with respect to the three factors mentioned above, by identifying a set of repeated experiments according to Taguchi's experimental design. In detail, well-balanced experiments, with a reduced variance and optimum settings, have been organized by combining the factors affecting the performance and the levels at which they should be analysed, by adopting Taguchi's orthogonal arrays (Taguchi, 1986). In particular, three levels have been considered for each factor, as reported in Table 7. The experimental design factors and their levels have been combined according to Taguchi's L9 array to arrange nine different experiments. The impact of these factors on the performance of the reasoning system has been evaluated with respect to the signal-to-noise ratio (hereafter, SN), computed by conducting three trials for each experiment in the L9 array. In detail, the “Smaller-The-Better” SN ratio has been evaluated, since it is typically used when the goal is to minimize the performance characteristic (Phadke, 1989). Experiments have been performed on the mobile device mentioned above.
The experiments' layout, the SN ratio and the mean value computed in each experiment with respect to the response time and memory usage of the reasoning system are reported in Table 8. Once the SN ratios have been computed for each run, the average SN values for every factor and for each of its levels have been calculated in order to determine the range Δ of the effects generated by the factor on the performance, which is calculated as the difference between the maximum and the minimum of such average SN values. These ranges, reported in Table 9, can be used to rank the three considered factors and, thus, to determine the ones with the largest effect on the performance (i.e. the maximum alteration induced in the associated SN ratio).
$$SN_{\text{Smaller-the-better}} = -10 \log_{10}\left(\frac{1}{n}\sum_{i=1}^{n} y_i^{2}\right)$$
where n is the number of trials conducted for each experiment T_j, and y_i is the measured value in trial i.
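The Smaller-the-Better SN ratio can be sketched in a few lines of code. The function name `sn_smaller_the_better` is hypothetical, and since the individual trial measurements are not reported, the usage example feeds the mean response time of experiment T1 as three identical trials, which is an assumption made only to check the order of magnitude.

```python
import math


def sn_smaller_the_better(values):
    """Smaller-the-Better signal-to-noise ratio of a list of trial values.

    Computes -10 * log10 of the mean of the squared measurements.
    """
    n = len(values)
    mean_square = sum(v * v for v in values) / n
    return -10.0 * math.log10(mean_square)


if __name__ == "__main__":
    # Assumption: three identical trials equal to the T1 mean response time (ms)
    t1_trials = [354.33, 354.33, 354.33]
    print(round(sn_smaller_the_better(t1_trials), 2))
```

With identical trials the ratio reduces to −20·log10 of the measured value, which is close to the SN value listed for T1 in Table 8; with real trials any spread among the measurements makes the ratio slightly more negative.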
As shown in Table 9, the number of valid activations and, thus, the number of individuals in the WM satisfying each single condition of the rules, is the most relevant factor for the response time of the reasoning system. In other words, when evaluating the knowledge base, the system is highly influenced by the cost of the term ξ2(r∈R) described in the previous sections. In detail, the response time of the reasoning system rapidly worsens as the number of valid activations increases. The number of rules also negatively affects the response time. On the contrary, the optimization target reduces the response time, which improves drastically when high-level conditions or direct control variable references are optimized, and even more when both are. Moreover, the optimization of direct control variable references reduced the response time more than the optimization of high-level conditions, despite the generation of further rules for extracting and manipulating the intermediate knowledge involved in direct control variable references. With reference to memory usage, the number of valid activations is again the factor with the largest impact on the performance of the reasoning system. The factors optimization target and number of rules showed a comparable influence but, while a growing number of rules increases memory consumption, the impact of the optimization target differs according to the particular kind of conditions that have been optimized.
In this respect, the optimization of high-level conditions had a smaller impact on memory consumption than the optimization of direct control variable references, since the latter generates a further set of rules for extracting and manipulating the intermediate knowledge involved in direct control variable references. However, even if the optimization of direct control variable references increases the memory consumption, this increase is acceptable given the great reduction of the response time of the
Table 9. The range of the factors' effects on the SN ratio, associated to the presented reasoning system.
SN - Response Time
Factors' levels | Target | Rules | Activations
1 | −71,84 | −58,72 | −52,67
2 | −63,90 | −67,23 | −61,21
3 | −59,61 | −74,74 | −81,48
Δ | 12,23 | 16,03 | 28,81
Rank | 3 | 2 | 1
SN - Memory Usage
Factors' levels | Target | Rules | Activations
1 | −39,63 | −38,20 | −34,90
2 | −43,40 | −42,54 | −40,71
3 | −40,41 | −44,95 | −47,82
Δ | 3,77 | 6,75 | 12,92
Rank | 3 | 2 | 1
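The ranking procedure described for Table 9 (average the SN values per factor level, then take the difference between the largest and the smallest average) can be sketched as follows for the response time. The data are the SN values of Table 8, grouped by the factor values as printed there (T9 is listed with 10 rules); the names `RUNS`, `FACTORS` and `effect_ranges` are hypothetical.

```python
from collections import defaultdict

# (optimization target, rules, valid activations, SN of response time),
# copied from Table 8 in the order T1..T9
RUNS = [
    ("high-level", 10, 2, -50.99),
    ("high-level", 20, 6, -68.53),
    ("high-level", 30, 10, -95.98),
    ("direct", 10, 6, -58.55),
    ("direct", 20, 10, -79.65),
    ("direct", 30, 2, -53.50),
    ("both", 10, 10, -68.79),
    ("both", 20, 2, -53.51),
    ("both", 10, 6, -56.53),
]

# Column index of each factor inside a run tuple
FACTORS = {"Target": 0, "Rules": 1, "Activations": 2}


def effect_ranges(runs):
    """Range (max minus min) of the per-level average SN for each factor."""
    ranges = {}
    for name, column in FACTORS.items():
        by_level = defaultdict(list)
        for run in runs:
            by_level[run[column]].append(run[3])
        means = [sum(values) / len(values) for values in by_level.values()]
        ranges[name] = max(means) - min(means)
    return ranges


ranges = effect_ranges(RUNS)
ranking = sorted(ranges, key=ranges.get, reverse=True)

if __name__ == "__main__":
    for position, name in enumerate(ranking, start=1):
        print(position, name, round(ranges[name], 2))
```

Up to rounding, the ranges obtained this way agree with the response-time values of Table 9, and the resulting order places the valid activations first, the number of rules second and the optimization target third.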
reasoning system that can be obtained through this kind of optimization. In conclusion, the experiments showed that the term ξ2(r∈R) has a high impact on the performance of the reasoning system, providing a significant validation of the cost model theoretically discussed in the previous sections. Consequently, this result also showed that the proposed optimization procedures, which are aimed at reducing the term ξ2(r∈R), positively affect the performance of a reasoning system and have a general basis, since they can be applied to the optimization of generic knowledge bases.
6. Discussion and conclusion
This paper presented an optimization approach aimed at revising the structure of the ontologies and of the rules built on top of them that are contained in a rule-based system, with the goal of reducing the cost of evaluation for all its rules by operating directly at the knowledge level. A general cost model has also been proposed and applied to evaluate the impacts and benefits of the proposed optimization approach on a case study pertaining to an mHealth app developed in the context of the Italian project “Smart Health 2.0” and devised to evaluate the eating habits of users in order to keep their lifestyles under control and, thus, preserve their wellness. This theoretical evaluation has also been transposed to a practical scenario, where the rule-based system embedded in the considered mHealth app has been evaluated on a real smartphone in terms of memory usage and overall response time. Moreover, a further study has been performed in order to investigate the general applicability of the proposed approach by evaluating the impact of different rule conditions on the cost of evaluation of a generic knowledge base, and the potential benefits drawn from their optimization. This performance evaluation, arranged according to Taguchi's experimental design, showed that the proposed optimization procedures positively affect the performance of a reasoning system and have a general basis, since they can be applied to the optimization of generic knowledge bases.
At present, many approaches to the optimization of rule-based systems have been published that describe and evaluate how to dynamically revise the structure of the knowledge base in order to reduce the complexity involved in its evaluation without altering its meaning (Ishida, 1988; Özacar et al., 2007; Ünalır et al., 2005; Mustafa, 2003). These approaches are predominantly devised to dynamically reorder the condition elements of a rule and change the portion of the knowledge base involved in its reasoning tests, so as to reduce both the size of the intermediate results to instantiate and the number of reasoning tests to perform before invalidating an intermediate result. Most of them apply well-known heuristics for reordering the condition elements of a rule according to criteria calculated and employed at runtime, during the rule evaluation. In detail, a first typical heuristic consists in sorting the conditions according to their restrictive power and examining the most restrictive ones first, since evaluating conditions with fewer matching facts decreases the size of the subsequent rule instantiations (Ishida, 1988; Özacar et al., 2007). A second heuristic described in the literature consists in dynamically identifying conditions with rarely used predicates and evaluating them first, since their matching facts are less likely to change (Ünalır et al., 2005). Finally, some heuristics have been proposed to order condition elements in the rules. Since the identification of such an ordering can be very difficult due to the large number of possible ways to order the condition elements in the rules, genetic algorithms have also been proposed to achieve this goal and obtain a (near) optimal production system with respect to execution time (Mustafa, 2003). In summary, all these heuristics are employed during the rule execution, they can be applied simultaneously to automatically order condition elements and, in many cases, they may also conflict with each other (Arman, 2013).
The strength of the proposed approach with respect to the solutions mentioned above lies in its ability to operate in a pre-processing phase and to revise the way ontology models and production rules are formalized, without introducing dynamic procedures that are employed during the system execution and can negatively affect its performance. This is due to the fact that the proposed approach has been designed for scenarios where computationally expensive procedures can compromise the correct working of the applications and, in particular, their timely responsiveness. For this reason, it focuses on improving the quality of the knowledge base before the system starts its execution, without incurring additional overhead during the rule evaluation. To the best of our knowledge, no static optimization approach has been proposed in the literature to achieve a similar goal. The key elements of such an optimization approach are static procedures essentially based on three wide-ranging criteria: (i) diminishing the portion of the knowledge base on which rules operate can greatly reduce the amount of data involved in reasoning tests; (ii) maintaining intermediate results matching pairs of conditions, when a relation exists among them, can avoid re-inspecting combinations of data already classified as invalid for a given pair; (iii) changing the evaluation order of the rule conditions can greatly influence the cost of the reasoning tests. In conclusion, the presented approach offers an innovative and valuable way to significantly reduce the cost of the evaluation of rule instances to execute. The encouraging evaluation results suggest that the approach could be profitably used to build mHealth apps able to deliver innovative health services for promoting wellness and a healthy lifestyle.
Finally, thanks to its general basis, the present approach is also applicable to all the scenarios characterized by complex production rules built on top of highly structured ontology models and where real-time performance and computation-intensive demands have to be met.
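The first heuristic recalled above from the literature (examining the most restrictive conditions first) can be illustrated with a minimal sketch. It is not the procedure proposed in this paper nor the implementation of any cited system: the function `intermediate_sizes` and the match counts are hypothetical, and the sizes are worst-case bounds that ignore the filtering performed by join tests.

```python
from itertools import accumulate
from operator import mul


def intermediate_sizes(match_counts):
    """Worst-case sizes of the partial instantiations built condition by condition.

    match_counts[i] is the number of facts matching the i-th condition in the
    chosen evaluation order; each partial result is bounded by the running product.
    """
    return list(accumulate(match_counts, mul))


if __name__ == "__main__":
    as_written = [50, 2, 10]                 # hypothetical matches per condition
    restrictive_first = sorted(as_written)   # most restrictive condition first

    print(intermediate_sizes(as_written))
    print(intermediate_sizes(restrictive_first))
```

The final bound is the same for both orders, but placing the most restrictive conditions first keeps the earlier partial results smaller, which is the effect these runtime heuristics aim for.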
Acknowledgements
This work has been partially supported by the Italian project “Smart Health 2.0” funded by the Italian Ministry of Education, University, and Research (MIUR).
References
Akter, S., et al., 2013. Modelling the impact of mHealth service quality on satisfaction, continuance and quality of life. Behav. Inf. Technol. 32 (12), 1225–1241.
Arman, N.A., 2013. Improving rule base quality to enhance production systems performance. Int. J. Intell. Sci. 3 (1), 1–4.
Brant, D.A., Miranker, D.P., 1993. Index support for rule activation. In ACM SIGMOD Record, vol. 22, no. 2, pp. 42–48, June 1993.
Carroll, J.J., Dickinson, I., Dollin, C., Reynolds, D., Seaborne, A., Wilkinson, K., 2004. Jena: implementing the semantic web recommendations. In Proc. of the 13th Int. World Wide Web conference on Alternate track, New York, USA, pp. 74–83.
D’Aquin, M., Nikolov, A., Motta, E., 2010. How much semantic data on small devices? In Proceedings of the International Conference on Knowledge Engineering and Management by the Masses, pp. 565–575.
Forgy, C.L., 1979. On the Efficient Implementation of Production Systems. PhD thesis, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, February 1979.
Forgy, C.L., 1982. Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial Intelligence, vol. 19, pp. 17–37.
Graham, K., Carroll, J.J., 2006. Resource description framework (RDF): Concepts and abstract syntax.
Hamine, S., et al., 2015. Impact of mHealth chronic disease management on treatment adherence and patient outcomes: a systematic review. J. Med. Internet Res. 17 (2).
Knight, E., Stuckey, M.I., Petrella, R.J., 2014. Health promotion through primary care: enhancing self-management with activity prescription and mHealth. Physician Sportsmed. 42 (3), 90–99.
Ishida, T., 1988. Optimizing Rules in Production System Programs. In AAAI, pp. 699–704.
Malvey, D.M., Slovensky, D.J., 2014. MHealth: Transforming Healthcare. Springer.
Minutolo, A., Esposito, M., De Pietro, G., 2015. Design and validation of a light-weight reasoning system to support remote health monitoring applications. Engineering Applications of Artificial Intelligence, 41, 232–248.
Miranker, D.P., 1987. TREAT: A New and Efficient Match Algorithm for AI Production Systems. PhD dissertation, Columbia Univ.
Miranker, D.P., Brant, D.A., Lofaso, B.J., Gadbois, D., 1990. On the Performance of Lazy Matching in Production Systems. In AAAI, vol. 90, pp. 685–692.
Mustafa, W., 2003. Optimization of production systems using genetic algorithms. Int. J. Comput. Intell. Appl. 3 (03), 233–248.
Özacar, T., Öztürk, Ö., Ünalir, M.O., 2007. Optimizing a rete-based inference engine using a hybrid heuristic and pyramid based indexes on ontological data. J. Comput. 2 (4), 41–48.
Phadke, M.S., 1989. Quality Engineering Using Robust Design. Prentice Hall, Englewood Cliffs, NJ.
Taguchi, G., 1986. Introduction to Quality Engineering. Asian Productivity Organization, Tokyo, Japan.
Ünalır, M.O., Özacar, T., Öztürk, Ö., 2005. Reordering query and rule patterns for query answering in a rete-based inference engine. In International Conference on Web Information Systems Engineering, pp. 255–265, Springer Berlin Heidelberg.
Weert, P.V., 2010. Efficient Lazy Evaluation of Rule-Based Programs. IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 11, pp. 1521–1534, Nov. 2010.
World Health Organization (WHO), 2011. mHealth: New horizons for health through mobile technologies. Global Observatory for eHealth series, volume 3, Geneva, Switzerland.