Establishment of rule dictionary for efficient XACML policy management


Knowledge-Based Systems xxx (xxxx) xxx

Contents lists available at ScienceDirect

Knowledge-Based Systems journal homepage: www.elsevier.com/locate/knosys


Fan Deng a,∗, Liyong Zhang b, Changyu Zhang b, Hao Ban b, Chang Wan b, Minghao Shi b, Chao Chen b, Enti Zhang b

a School of Computer Science and Technology, Xi’an University of Science and Technology, Xi’an 710054, China
b School of Software, Xidian University, Xi’an 710071, China

Article info

Article history: Received 14 November 2018; received in revised form 11 January 2019; accepted 17 March 2019; available online xxxx

Keywords: HashMap; Rule dictionary; Bitmap storage; Policy Decision Point (PDP); Evaluation performance

Abstract

To improve the evaluation efficiency of XACML policies, we analyze the storage principle of the rule dictionary and propose the XACML policy evaluation engine XDPMOE, a new XACML policy management optimization scheme based on bitmap storage and HashMap. First, we numeralize the policy set, establish the rule dictionary on an array-based sequential storage structure, and use the rule dictionary to index policy rules quickly, which improves the efficiency of policy evaluation. Second, we store the policy set in bitmaps, which reduces the space complexity of the engine. Experiments that simulate the arrival of access requests show that (1) reordering the policy set greatly reduces the time needed to store the policy set in the bitmap, and (2) the average evaluation efficiency of XDPMOE is significantly better than that of the Sun PDP, HPEngine and XEngine. The hash matching algorithm based on bitmap storage not only takes up less storage space but also improves matching efficiency to a great extent. © 2019 Elsevier B.V. All rights reserved.

∗ Corresponding author. E-mail address: [email protected] (F. Deng).

1. Introduction

Service Oriented Architecture (SOA) has been introduced to increase software reusability, reduce the difficulty of software development, and improve the reliability of software systems in the existing development process [1]. In the SOA environment, access control is an important part of the security requirements [2]. Access control means that when a user performs an access operation on a network or an information system, the system controls or protects access to published resources through technologies such as identity authentication and dynamic authorization. In research on authorized access control, policies are increasingly used to describe the security needs of networks and information systems.

The Policy Decision Point (PDP) is an important part of the access control model. The operational efficiency of an authorization service system is closely related to the evaluation performance of the PDP [3]. As the scale of the loaded policy sets increases, the evaluation performance of the PDP continues to decline. The major bottleneck in improving the evaluation performance of the PDP is the flexible structure of large-scale complex policy sets in the authorization service system. Recently, efficient evaluation methods for large-scale complex policy sets in the PDP have become a hot research issue in the construction of efficient authorization service systems.

To meet the management needs of large-scale information systems, many organizations implement access control by managing their networks and distributed systems through policies. The Extensible Access Control Markup Language (XACML), which has been widely accepted and recognized for describing policies, is generally used in distributed application systems in SOA environments [4]. However, the core specification of XACML assumes that all policies are trustworthy, so XACML cannot effectively detect conflicts and redundancy within the policy set itself [5]. Consequently, XACML defines rule-combining algorithms and policy-combining algorithms to resolve conflicts and redundancy. On the one hand, conflicts and redundancy reduce the speed of matching within the policy set. On the other hand, the matching method within the policy set limits the matching speed of a new request. Much work has been done to eliminate conflicts and redundancy within the policy set, and we have discussed conflict and redundancy issues in two earlier articles [6,7]. We therefore turn to new matching methods to improve matching efficiency. In this paper, we propose a hash matching algorithm based on bitmap storage to improve matching efficiency [8]. This paper makes three major contributions as follows.

https://doi.org/10.1016/j.knosys.2019.03.015 0950-7051/© 2019 Elsevier B.V. All rights reserved.

Please cite this article as: F. Deng, L. Zhang, C. Zhang et al., Establishment of rule dictionary for efficient XACML policy management, Knowledge-Based Systems (2019), https://doi.org/10.1016/j.knosys.2019.03.015.


F. Deng, L. Zhang, C. Zhang et al. / Knowledge-Based Systems xxx (xxxx) xxx

(1) For each attribute in the policy set, we store its values as key–value pairs in a HashMap, which increases access speed during numeralization and further reduces the time cost of preprocessing.

(2) We use a rule dictionary to simulate a four-dimensional array and look up the attributes placed in the dictionary through the dictionary directory. By summing over the values of the four attributes, XDPMOE (XiDian Policy Matching Optimization Engine), the engine that combines all the methods in this paper, can uniquely determine the location in the policy set of the policy that corresponds to a request. This reduces the time overhead of accessing the effect.

(3) We use bitmap structures to store policy sets. If i is the relative position of policy R after the content index, the position of R in the bit space is $B_{2i}B_{2i+1}$. Each policy in the policy set takes up only two bits, which reduces the average storage space for effects.

The remainder of this paper is organized as follows. Section 2 reviews related research on PDPs and XACML. Section 3 proposes a novel policy matching optimization engine. Section 4 gives an overview of the methods used in this paper. Section 5 explains the method with an illustrative example. Section 6 details the concepts and definitions of each innovation. Section 7 presents experimental results on the evaluation performance improvement of PDPs. Finally, Section 8 presents conclusions and directions for future work.

2. Related work

Existing efforts have mainly been devoted to three directions. The first aims at eliminating redundancy and conflicts in large-scale complex policy sets.

2.1. Elimination of policy redundancy and conflict

In the field of policy conflict research, St-Martin et al. [9] proposed a policy conflict detection algorithm that converts the XACML policy into Coq code, a language for an interactive theorem prover, and then into OCaml, a general-purpose programming language with faster compilation and execution. The result is run on the analyzer, and the correctness of the algorithm is finally proved in the Coq verification management system. Sarkis et al. [10] pointed out that direct policy conflicts can be detected by analyzing the coverage of the attribute elements of the policy, and that indirect policy conflicts can be detected by analyzing the implicit relationships between the Subject, Resource, and Action attributes of two policies. Stepien et al. [11] proposed a method based on a logic-programming expert system. The method uses constraint logic programming, adapts effectively to the hierarchical logic of XACML policies, avoids repeated rule comparisons, and realizes both static and dynamic conflict detection.

In the field of policy redundancy research, Shaikh et al. [12] pointed out that some rules may contain variable-length Boolean expressions. They therefore use a data classification algorithm to sort the attribute information of the policy, formalize the Boolean expressions, construct a decision tree by parsing the policy set, and propose a policy redundancy detection algorithm. Jebbaoui et al. [13] automatically converted XACML attribute elements into a set-based representation, which reduces the complexity of the policy, and then used semantic verification to analyze the meaning of the rules in the policy set and detect redundancy between rules.

2.2. Optimization of large-scale complex policy sets

The second direction aims at optimizing large-scale complex policy sets. Lin et al. [14] proposed a policy similarity measure as a lightweight ranking approach to help one party quickly locate parties with potentially similar policies. In particular, given a policy P, the similarity measure assigns a ranking (similarity score) to each policy compared with P. They formally define the measure by taking various factors into account and prove several of its important properties. Mourad et al. [15] introduced a UML configuration file that uses an XACML policy specification driven by the system model to solve the complex policy design problem and, based on the semantics of the policy set, optimizes the specified policy before the system runs. Turkmen et al. [16] utilized a novel policy evaluation method based on Satisfiability Modulo Theories (SMT). This method performs real-time verification during authorization query response by formally analyzing the XACML policy and representing it as an SMT formula, which improves the evaluation performance of policy decision points.

2.3. Adoption of distributed authorization model

The third direction aims at adopting an efficient distributed authorization model. In the traditional centralized authorization model, the XACML policy evaluation engine contains only one policy decision point, which is responsible for responding to user access requests. As the number of policies loaded by the policy decision point increases, its evaluation performance decreases greatly. At the same time, if many users initiate requests in parallel, the time users need to access resources increases greatly [17]. Díaz-López et al. [18] proposed an efficient distributed policy management method. This method redefines the default XACML architecture, uses Master/Slave PAP for communication, uses meta-policies to define policy privileges, and uses extended SAML to protect policies. Wang et al. [19] studied the topological characteristics of different types of policies and revealed the inherent complexity of security policies. By decomposing the policy set across multiple policy decision points in a greedy manner, this method can be applied to data centers and network topologies for service providers.

In summary, most existing approaches to improving the PDP have drawbacks. Approaches aimed at conflict resolution and at redundancy detection and elimination for large-scale complex policy sets are inefficient. Optimization algorithms for large-scale complex policy sets have high space–time complexity, and methods that build a distributed service system on existing policy decomposition have considerable limitations. When optimizing a large-scale complex policy set, we reduce the space–time complexity by using rule dictionary content matching, HashMap key–value pair matching and bitmap storage [20].

3. Form matching optimization engine XDPMOE

The engine we propose, which is based on hash search and bitmap storage and termed XDPMOE (XiDian Policy Matching Optimization Engine), is shown in Fig. 1. XDPMOE is able to respond to any request in a short time and in a small space. When a request arrives, XDPMOE quickly obtains the result through the following steps.

• Numeralize the request and obtain its values.
• Look up the content and obtain the subscripts.
• Access the rule dictionary by subscript and return the result, which is output by XDPMOE.


Fig. 1. Policy matching optimization engine XDPMOE.

Fig. 3. Structure of policy set.

Fig. 2. Process overview diagram.

Definition 1. The rule dictionary is divided into a content part and a text part. The content part consists of four subdirectories that store several key–value pairs. The text part stores the index subscripts and their corresponding effects.

We read the existing policy set into XDPMOE and store it in bitmap form in the address space. Each new request can be directly accepted or rejected by the PDP through XDPMOE. If the policy corresponding to the request is not in the PDP, the original request is returned and the request moves on to the next PDP. The rule dictionary is further explained in Section 6.4.

4. Approach overview

Our main work includes policy preprocessing and the matching optimization implemented by the rule dictionary, as shown in Fig. 2. XDPMOE improves PDP performance by optimizing the storage and query processes.

The first step of preprocessing is to numeralize the policy set and store the numerical results as key–value pairs. Since the pairs of numerical value and original attribute value are stored in a hash table, a query obtains its result in constant expected time with a single hash table lookup. In this step, the input is the policy set and the output is the numerical result.

The second step of preprocessing is to store the numeralized results in sequence in the rule dictionary. The dictionary is divided into a content part and a text part. The content part stores the numerical value and relative position of each policy. XDPMOE performs a summation over the sub-attribute parts after numeralization. Once the subscript is obtained, the corresponding result of the policy is accessed directly in the text part of the dictionary. In this step, importing a numeralized policy set yields the position of each corresponding effect in the text of the dictionary.

In the text of the dictionary, we improve internal matching efficiency by means of bit storage and shift operations at the machine level. This method optimizes large-scale complex policy sets. Unlike previous approaches in this direction, however, using bitmap storage and establishing a rule dictionary reduces not only the time cost of matching but also the space occupied.

5. Preliminary notions and an illustrative example

XACML is structurally clear in policy expression [5]. It represents security rules as collections of attribute values of four main attributes: Subject, Resource, Action, and Condition. Subject, Resource and Action all belong to the Target attribute, and Target and Condition are at the same level in a rule. The policy set structure is shown in Fig. 3.

A policy set consists of one or more rules. A request has the same structure as a rule and uniquely matches one of the rules. If the request matches one of the rules, the effect of that rule is returned. Otherwise, the request is returned by the PDP.

We use one example throughout the paper to explain how we optimize the matching of requests against policy sets. Before we accept requests, we preprocess the policy set. After extracting information from each policy in the policy set, we get a set of information. Taking Fig. 3 as an instance, the extracted information has the form shown in Table 1. The content is established in the order in which each attribute value appears, and the numerical processing of the rules proceeds in step with the establishment of the content. After numerical processing, the policies are stored in the form shown in Table 2. After the numerical processing, we reorder the policy set in hierarchical order. To improve search efficiency, the dictionary is described by the Cartesian product of Subject, Resource, Action, and Condition [21]. The effects of the original policy set are filled in. If a rule does not exist in the policy set, its effect is replaced by a hole (the concept of a hole is explained in Section 6.4.2). Searching by content and dictionary can be seen as storing effects in a multidimensional space [22]. We do not need to traverse the whole policy set; we only need to know a position along four dimensions to uniquely determine the result. For a received request, we first normalize it, then calculate its position in the dictionary, and finally return the result. This method greatly reduces the time complexity of matching a request against the policy set, especially when the policies in the policy set are distributed more homogeneously.

6. Policy preprocessing and matching optimization

The process of responding to pending requests is divided into two parts: preprocessing and matching optimization. Preprocessing targets the policy set, and its time is not included in the request processing time.

6.1. Policy normalization

To improve the efficiency of policy set matching and reduce storage costs, a policy normalization process is executed before the rule dictionary is established. Policy normalization consists of two operations: (1) extracting policy set information; and (2) normalizing the attributes of the policy set.

Because different policies in the same policy set share much redundant data, we extract only the Rule Effect and the Attribute Values of the four main attributes. Each policy has a unique combination of values for these four attributes, so each combination uniquely represents a policy. The simplified policy set structure is shown in Fig. 3. The information to be extracted is shown in Table 1.

Table 1
Extracted information.

Permit/Deny   Subject0   Resource0   Action0   Condition0
Permit/Deny   Subject1   Resource1   Action1   Condition1
Permit/Deny   Subject2   Resource2   Action2   Condition2
...           ...        ...         ...       ...

In the operation of normalizing the attributes of the policy set, we use a hash algorithm to process the policy set. By reading the rules in the policy set, all attribute values of the four main attributes are converted to numerical values in order. Each attribute value and its numerical value are stored in a HashMap [23]. HashMap is a hash table based implementation of the Map interface, and it locates a value through the hash code of its key. The specific process of numerical processing and HashMap storage is shown in Algorithm 1. In our example, we can get the numerical value through the attribute value.

The HashMap uses a bucket structure to store values and distributes the elements appropriately among the buckets, which provides stable performance for basic operations and suits modifications of the attribute set. When it receives a new key, the hash code quickly determines which bucket the key belongs to and retrieves its value. If the key is not in the HashMap, a Map.Entry object is created internally and the key and value are stored in the bucket. That is, when a new request arrives, we can obtain the numerical value of each of its attributes at constant time cost.

When processing the rules in the policy set, we split each XACML rule into four short strings. Although the time complexity of the hash function used in rule matching is constant, strings that are too long and redundant can cause a higher collision rate when hashed addresses are computed. Dividing rules into short strings greatly reduces the capacity of the HashMap, speeds up matching, and reduces the occurrence of hashed address conflicts.

Table 2
Numerical values and storage of policy set.

Subject   Resource   Action   Condition   Effect
0         0          0        0           0
1         0          1        0           0
0         0          2        0           0
2         0          3        0           0
2         1          0        0           0
0         1          1        0           1
1         1          2        0           1
1         1          3        0           1
0         1          4        0           1
...       ...        ...      ...         ...

6.2. Policy set reorder

After normalizing the policy set, we reorder it according to hierarchical relationships. By reordering the rules in the policy set, we can access an effect directly through its subscript index. In this paper, we use Timsort [24] to reorder the policy set. Timsort is a sorting algorithm that combines merge sort and insertion sort. It finds the blocks in the data that are already sorted, each of which is called a run, and then merges these runs according to its merge rules. According to information theory, the time complexity of a comparison sort cannot be better than O(n lg n) in the average case. However, if the array to be sorted already contains ordered sequences, the time cost falls below O(n lg n). Timsort is a stable sorting algorithm. In the worst case, it needs temporary space of n/2. In the best case, it needs only a small amount of temporary storage.

6.3. Multidimensional arrays

According to the attribute values of Subject, Resource, Action, and Condition in each rule, the effect (Permit or Deny) can be stored in a four-dimensional space in an array-like manner [25]. Each request can uniquely determine the storage location of the effect from its four attribute values. With array storage, data can be accessed directly by subscript [26].

When the array x is one-dimensional, let the starting position of the array be $S_0$ and its size be n. The position $S_i$ of the i-th element $x_i$ ($0 \le i < n$) in the array is given by Eq. (1).

$$S_i = S_0 + i \tag{1}$$

When the array x is four-dimensional, let the starting position of the array be $S_0$ and the sizes of its four dimensions be $n_1, n_2, n_3, n_4$. The position $S_{ijkl}$ of the element $x_{ijkl}$ ($0 \le i < n_1$, $0 \le j < n_2$, $0 \le k < n_3$, $0 \le l < n_4$) in the array is given by Eq. (2).

$$S_{ijkl} = S_0 + n_2 \cdot n_3 \cdot n_4 \cdot i + n_3 \cdot n_4 \cdot j + n_4 \cdot k + l \tag{2}$$
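As a minimal sketch of Eqs. (1) and (2), the following Python function computes the row-major offset of an element in a contiguously stored four-dimensional array. The function name `flat_offset` and the sample dimension sizes are illustrative assumptions, not part of XDPMOE.

```python
def flat_offset(dims, idx, start=0):
    """Offset of element idx = (i, j, k, l) in a row-major 4-D array.

    dims = (n1, n2, n3, n4) are the sizes of the four dimensions and
    start plays the role of S0 in Eq. (2).
    """
    n1, n2, n3, n4 = dims
    i, j, k, l = idx
    # Reject subscripts outside 0 <= i < n1, ..., 0 <= l < n4.
    for value, size in zip(idx, dims):
        if not 0 <= value < size:
            raise IndexError(f"subscript {value} out of range for size {size}")
    return start + n2 * n3 * n4 * i + n3 * n4 * j + n4 * k + l


# Hypothetical sizes: 3 subjects, 2 resources, 5 actions, 1 condition.
dims = (3, 2, 5, 1)
print(flat_offset(dims, (0, 0, 0, 0)))  # 0, the first element
print(flat_offset(dims, (1, 1, 2, 0)))  # 1*10 + 1*5 + 2*1 + 0 = 17
print(flat_offset(dims, (2, 1, 4, 0)))  # 29, the last of the 30 elements
```

Because every subscript tuple maps to a distinct offset, one multiplication-and-addition pass replaces any traversal of the rule set.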

Replacing the attribute values of the rules with the numerical values obtained in the previous step, and using 0 for Deny and 1 for Permit, gives the partial policy set shown in Table 2. Dividing the policy set by Subject, we extract the rules whose Subject value is 0 and obtain the three-dimensional array structure shown in Fig. 4.
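The numeralization step can be sketched as follows. This is an illustration under stated assumptions rather than the paper's Algorithm 1: a Python `dict` stands in for the Java HashMap, values are numbered in order of first appearance, and the function name `numeralize` and the sample rules are hypothetical.

```python
def numeralize(rules):
    """Map attribute values to integers in order of first appearance.

    rules is a list of (subject, resource, action, condition, effect)
    tuples of strings. Returns the four per-attribute dictionaries
    (standing in for the HashMap) and the numeralized rules.
    """
    maps = [{}, {}, {}, {}]          # one key-value store per attribute
    effect_code = {"Deny": 0, "Permit": 1}
    numeric_rules = []
    for *attrs, effect in rules:
        row = []
        for mapping, value in zip(maps, attrs):
            # setdefault assigns the next unused number to a new value
            # and returns the stored number for a known value.
            row.append(mapping.setdefault(value, len(mapping)))
        row.append(effect_code[effect])
        numeric_rules.append(tuple(row))
    return maps, numeric_rules


# Hypothetical rules, not taken from the paper's policy sets.
rules = [
    ("student", "book", "read", "any", "Deny"),
    ("staff", "book", "borrow", "any", "Deny"),
    ("student", "journal", "read", "any", "Permit"),
]
maps, numeric = numeralize(rules)
print(maps[0])   # {'student': 0, 'staff': 1}
print(numeric)   # [(0, 0, 0, 0, 0), (1, 0, 1, 0, 0), (0, 1, 0, 0, 1)]
```

Keeping one small dictionary per attribute mirrors the idea of splitting each rule into four short strings instead of hashing one long key.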


Fig. 4. Three-dimensional array when Subject = 0.

After merging all the three-dimensional arrays classified by Subject, we obtain the four-dimensional array corresponding to the policy set. The structure of this four-dimensional array is shown in Fig. 5. A four-dimensional array is not the final form of storage, but it is a convenient way to understand the internal structure of a rule dictionary.

6.4. Rule dictionary

Although the elements in the array are contiguous, we want to access the location of a rule in the four-dimensional array, and its effect, without traversal. To this end, a dictionary is used to imitate a four-dimensional array.

6.4.1. Content section of rule dictionary

The content part of the dictionary simplifies the dictionary structure and stores the data obtained from the normalization process. It consists of four subdirectories: Subject, Resource, Action and Condition. The four subdirectories have the same structure. Each entry of a subdirectory holds two values: attribute value and relative position. The specific structure is shown in Fig. 6.

Definition 2. Relative position refers to the position of a specific attribute value in a subcollection (the starting position is 0) after the rules are reordered. Numerically, the relative position is equal to the numerical value.

Because the dictionary structure is divided hierarchically by Subject, Resource, Action and Condition, we can compute the index subscript of a rule, and thus determine its effect, from the relative position of each attribute value stored in the content.

6.4.2. Text section of rule dictionary

The text of the dictionary stores index subscripts and their effects. The structure is shown in Fig. 7. The data structure of the text part is implemented as an array, which is a contiguous memory space (the initial index is 0). The content stored in the array is the final effect. The text part of the dictionary can be abstractly represented as a four-dimensional array. First, all the rules are divided into |S| Subject dimensions according to the new order, and each Subject dimension is then divided into |R| Resource dimensions according to the new order. The Resource and Action dimensions are likewise divided into |A| Action dimensions and |C| Condition dimensions. The order of all possible combinations of the policy set after grouping is the storage order of effects in the dictionary text, as shown in Fig. 8. The required memory size M is the cardinality of the Cartesian product of the four attribute sets, as shown in Eq. (3).

$$M = |S| \times |R| \times |A| \times |C| \tag{3}$$

Definition 3. A hole is space overhead that the dictionary structure requires but that corresponds to no rule actually present in the policy set.


Fig. 5. Four-dimensional array structure diagram.

Fig. 6. Structure of content.

Fig. 7. Structure of text.

However, storing policy sets in arrays incurs excessive space overhead. When the numbers of Subject, Resource, Action, and Condition values are $n_1, n_2, n_3, n_4$, the space complexity is $O(n^4)$. If the total number of rules in the policy set is P, the number of holes is $\prod_{i=1}^{4} n_i - P$.

6.5. Bitmap storage

A bitmap [8] is a data structure that represents a dense set over a finite domain. A bitmap index has a simple storage structure, a small footprint and fast queries, and it is especially suitable for read-only mass data. To speed up and optimize the query process, we introduce a bitmap index [27] to store and query the policy set.

For bitmap storage, a bit space must be allocated in memory to hold all the rules in the policy set. Let |S|, |R|, |A| and |C| denote the numbers of distinct Subject, Resource, Action and Condition values, respectively. The size N of the bit space is given by Eq. (4).

$$N = 2 \times |S| \times |R| \times |A| \times |C| \tag{4}$$
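To make the sizes in Eqs. (3) and (4) and the hole count concrete, the sketch below evaluates them for one set of hypothetical attribute-set sizes. The function name `storage_stats` and the numbers used are assumptions chosen for illustration only.

```python
from math import prod


def storage_stats(sizes, rule_count):
    """Return (M, N, holes) for attribute-set sizes (|S|, |R|, |A|, |C|).

    M is the number of dictionary slots (Eq. (3)), N the number of bits
    in the bitmap (Eq. (4)), and holes the slots with no rule behind them.
    """
    slots = prod(sizes)              # M = |S| x |R| x |A| x |C|
    if rule_count > slots:
        raise ValueError("more rules than attribute combinations")
    bits = 2 * slots                 # N = 2 x M, two bits per slot
    holes = slots - rule_count       # product of the sizes minus P
    return slots, bits, holes


# Hypothetical sizes: |S| = 3, |R| = 2, |A| = 5, |C| = 1, with P = 9 rules.
print(storage_stats((3, 2, 5, 1), 9))   # (30, 60, 21)
```

With these assumed sizes the bit space needs 60 bits, that is, 8 bytes after rounding up to whole bytes, and 21 of the 30 slots are holes.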

Fig. 8. Dictionary storage order in text.

Initialization can be performed once enough bitmap storage space is available. Initializing the bit space requires knowing which positions in the bit space the rules of the policy set occupy. We stipulate that if i is the relative position of rule R after the content index, the position of R in the bit space is $B_{2i}B_{2i+1}$. These two bits represent the effect of rule R: if $B_{2i}B_{2i+1}$ is 01, the effect is Permit, and if it is 10, the effect is Deny. The initialization process is as follows.

• Set all positions in the bit space to 0, as shown in Fig. 9.
• For each rule, calculate its position $B_{2i}B_{2i+1}$ in the bit space using the rule attributes and the content.


Fig. 9. Bitmap initialization.

Fig. 10. After bitmap adjustment.

If the Rule Effect is Permit, $B_{2i+1}$ is set to 1; if the Rule Effect is Deny, $B_{2i}$ is set to 1. The results are shown in Fig. 10.

When a request arrives, we can uniquely determine the position $B_{2i}B_{2i+1}$ of its rule from the attribute values of the request and the content information. If the two bits are 01 or 10, Permit or Deny is returned, respectively. If they are 00, the policy set has no rule that matches the request, and the intermediate state is returned. Take the policy set shown in Fig. 3 as an example. If the relative positions of $R_1$, $R_2$, and $R_3$ obtained after indexing from the content are 0, 1 and 3, the results are as follows.

• $B_0B_1$ is 01. A rule matching request $R_1$ exists, and Permit is returned.
• $B_2B_3$ is 10. A rule matching request $R_2$ exists, and Deny is returned.
• $B_6B_7$ is 00. No rule corresponding to request $R_3$ exists, and the intermediate state is returned.

6.6. Dictionary lookup and ontology matching

The methods and basic principles of each part have now been described. The basic process consists of three parts, as shown in Fig. 11. First, we read in the rules to be matched from the policy set and extract the values of each attribute. Attribute values and numerical values are stored as key–value pairs in the HashMap. The second part, calculating the rule position, is the core of the algorithm. By determining the rule result directly from the subscript, as in a four-dimensional array, constant-time access to the policy set is achieved, and the storage space follows from the cardinality of the Cartesian product. In practice, policy sets are stored in the bit space in the form of bitmaps, and only two bits are required per result, which reduces storage space usage.

When an authorization request arrives, the Subject, Resource, Action, and Condition attributes of the request are extracted and hashed in the dictionary content. This yields the numerical value corresponding to each attribute value, which is also its relative storage location. Suppose the values corresponding to the four attributes are $v_1$, $v_2$, $v_3$ and $v_4$. The subscript index of the rule corresponding to the request in the text of the dictionary is then given by Eq. (5).

$$\mathit{index} = v_1 \times |R| \times |A| \times |C| + v_2 \times |A| \times |C| + v_3 \times |C| + v_4 \tag{5}$$

With this index, the final result for the request is obtained directly.
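The lookup path of Sections 6.5 and 6.6 can be sketched in a few lines. The class below is an illustrative assumption, not the XDPMOE implementation: it packs two bits per slot into a `bytearray`, computes the slot with Eq. (5), and uses the 01/10/00 encoding described above. The names `RuleBitmap`, `store` and `lookup`, the MSB-first bit order, and the sample sizes are all hypothetical.

```python
PERMIT, DENY, NOT_FOUND = "Permit", "Deny", "Intermediate"


class RuleBitmap:
    """Two bits per rule slot: 01 = Permit, 10 = Deny, 00 = no rule."""

    def __init__(self, sizes):
        self.sizes = sizes                      # (|S|, |R|, |A|, |C|)
        slots = sizes[0] * sizes[1] * sizes[2] * sizes[3]
        # Round 2 * slots bits up to whole bytes; all bits start at 0.
        self.bits = bytearray((2 * slots + 7) // 8)

    def index(self, values):
        """Eq. (5): subscript of the rule for numeralized values v1..v4."""
        _, r, a, c = self.sizes
        v1, v2, v3, v4 = values
        return v1 * r * a * c + v2 * a * c + v3 * c + v4

    def _set_bit(self, pos):
        self.bits[pos // 8] |= 0x80 >> (pos % 8)

    def _get_bit(self, pos):
        return (self.bits[pos // 8] >> (7 - pos % 8)) & 1

    def store(self, values, effect):
        i = self.index(values)
        # Permit sets B(2i+1); any other effect is treated as Deny
        # and sets B(2i) (a simplification of this sketch).
        self._set_bit(2 * i + 1 if effect == PERMIT else 2 * i)

    def lookup(self, values):
        i = self.index(values)
        pair = (self._get_bit(2 * i), self._get_bit(2 * i + 1))
        return {(0, 1): PERMIT, (1, 0): DENY}.get(pair, NOT_FOUND)


bitmap = RuleBitmap((3, 2, 5, 1))
bitmap.store((0, 0, 0, 0), PERMIT)   # slot 0 -> B0 B1 = 01
bitmap.store((0, 0, 1, 0), DENY)     # slot 1 -> B2 B3 = 10
print(bitmap.lookup((0, 0, 0, 0)))   # Permit
print(bitmap.lookup((0, 0, 1, 0)))   # Deny
print(bitmap.lookup((0, 0, 3, 0)))   # Intermediate (slot 3 is a hole)
```

The usage lines reproduce the example with relative positions 0, 1 and 3: the first two slots hold rules and the third is a hole, so the intermediate state is returned for it.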

7. Experiments results and analysis In order to evaluate the performance improvement of XDPMOE in the process of matching requests, the test policies and generation of test requests are first introduced. We do several experiments as follows. (1) First, we test the preprocessing section. The efficiency of reordering is measured by comparing the change in the storage time of the bitmap in the text of the rule dictionary of the original policy set and the reordered policy set. (2) Comparisons of the evaluation performances of PDPs in XDPMOE with that of PDPs in the Sun PDP, XEngine and HPEngine are made. The Sun PDP is a widely used policy decision point and it is used to evaluate requests as the decision engine in our experiments [28]. XEngine is a policy evaluation engine that can convert text XACML policies to numerical ones and convert numerical policies with complex structures into numerical ones with a normalized structure [29]. HPEngine is a policy evaluation engine which first uses a statistical analysis that based policy optimization mechanism to dynamically refine the numerical form of the policy and then caches the frequently invoked properties, policies, and request result pairs [30]. 7.1. Test policies We select policies from practical systems to simulate actual application scenes. Three adopted XACML access control policies in practical systems are as follows [31–33]. (1) Library Management System (LMS) [31]: the LMS provides access control policies by which a public library can use web services to manage books; (2) Virtual Meeting System (VMS) [32]: the VMS provides access control policies by which web conference services can be managed; (3) Auction sale management system (ASMS) [33]: the ASMS provides access control policies by which items can be bought or sold online. The policy of the LMS contains 720 rules, the VMS 945 rules, and the ASMS 1760 rules. 
In order to facilitate the comparison of the efficiency of each method, we extend the policy sets of LMS, VMS and ASMS by a random combination method, which enlarges the gap between the results. New rules are built from the Cartesian product of the Subject, Resource, Action and Condition attribute values and added to the original policy. In this way, the numbers of policy rules contained in LMS, VMS and ASMS were expanded to 3000, 6000 and 9000, respectively.
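The expansion step can be sketched as follows. This is only an illustration under stated assumptions: rules are modeled as a mapping from (subject, resource, action, condition) to an effect, new combinations are drawn at random from the Cartesian product, and each new rule receives a random effect. The function name `expand_policy` and the sample attribute values are hypothetical and are not taken from the LMS, VMS or ASMS policies.

```python
import itertools
import random


def expand_policy(rules, subjects, resources, actions, conditions, target, seed=0):
    """Return a copy of `rules` grown to `target` entries.

    `rules` maps (subject, resource, action, condition) -> effect.
    """
    if target < len(rules):
        raise ValueError("target is smaller than the original policy")
    rng = random.Random(seed)
    # All attribute combinations not already covered by an original rule.
    unused = [
        combo
        for combo in itertools.product(subjects, resources, actions, conditions)
        if combo not in rules
    ]
    extra = target - len(rules)
    if extra > len(unused):
        raise ValueError("not enough unused combinations")
    expanded = dict(rules)
    for combo in rng.sample(unused, extra):
        expanded[combo] = rng.choice(("Permit", "Deny"))  # assumed: random effect
    return expanded


original = {("alice", "book", "read", "weekday"): "Permit"}
grown = expand_policy(
    original,
    subjects=["alice", "bob"],
    resources=["book", "dvd"],
    actions=["read", "borrow"],
    conditions=["weekday", "weekend"],
    target=10,
)
print(len(grown))
```

Because the original rules are copied first and only unused combinations are added, the expanded policy keeps every original rule and contains no duplicate attribute combination.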

Please cite this article as: F. Deng, L. Zhang, C. Zhang et al., Establishment of rule dictionary for efficient XACML policy management, Knowledge-Based Systems (2019), https://doi.org/10.1016/j.knosys.2019.03.015.


F. Deng, L. Zhang, C. Zhang et al. / Knowledge-Based Systems xxx (xxxx) xxx

Fig. 11. Optimization algorithm matching process.

Fig. 12. Time cost of storage before and after reordering.

Fig. 13. Comparison in evaluation performance of LMS.

Fig. 14. Comparison in evaluation performance of VMS.

Fig. 15. Comparison in evaluation performance of ASMS.

7.2. Generation of test requests

Ni et al. [34] propose analyzing policies by Change-Impact in order to automatically generate access requests that increase the coverage of the test. The main idea is that conflict detection tools find different policies, or different rules within the same policy, that evaluate the same request to inconsistent results. Related access requests can then be constructed for testing according to the conflicting policies or rules.

Wei et al. [35] proposed that access requests can be automatically generated in order to test the correctness of the PDP and the configured policies. They pointed out that the Context Schema, which is defined by the XML Schema of XACML, describes all structures of access requests that might be accepted by the PDP, that is, all valid input requests. Their tool XCREATE can generate the possible structures of access requests according to the Context Schema of XACML. The policy analyzer obtains the possible input values of every attribute from the policy, and the policy manager randomly distributes the obtained input values into the structures of access requests. Another test program is Simple Combinatorial, which generates access requests according to all possible combinations of the attribute values of Subject, Action, Resource and Condition in the XACML policies. According to the practical requirements of the performance test, a combination of Change-Impact, Context Schema and Simple Combinatorial is adopted to simulate actual access requests in this paper.

7.3. Performance tests and comparisons

At present, the Sun PDP [28] is a generally used policy decision point [36]. Access requests can be selectively evaluated according to its internal rule matching mechanism [37]. Since the Sun PDP is open source and the most widely deployed implementation of XACML evaluation engines, it has become an industry standard, and in the following experiments we choose it as a decision engine to evaluate requests. XEngine [29] can convert a textual XACML policy to a numerical policy and can convert a numerical policy from a complex structure to a normalized structure. In addition, as a policy evaluation engine, XEngine can translate numerical policies into tree data structures and handle requests efficiently. HPEngine [30] is a policy optimization engine. First, it uses a statistical-analysis-based policy optimization mechanism to dynamically refine the policy, and transforms the refined XACML policy from a textual form to a numerical form. Moreover, it uses a statistical analysis mechanism to create caches for frequently invoked attributes, policies, and request results, with the goal of reducing the size of the policy and optimizing the matching method.
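Whichever engine is under test, the measurement has the same shape: generate a fixed number of requests and time one evaluation pass over them. The sketch below illustrates this under stated assumptions. Requests are sampled with replacement from all attribute combinations, `toy_pdp` is a hypothetical dictionary-based stand-in rather than any of the engines above, and the function names are not from the paper.

```python
import itertools
import random
import time


def generate_requests(subjects, resources, actions, conditions, n, seed=0):
    """Sample n requests (with replacement) from all attribute combinations."""
    rng = random.Random(seed)
    combos = list(itertools.product(subjects, resources, actions, conditions))
    return [rng.choice(combos) for _ in range(n)]


def time_evaluation(evaluate, requests):
    """Return (decisions, elapsed milliseconds) for one pass over the requests."""
    start = time.perf_counter()
    decisions = [evaluate(req) for req in requests]
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return decisions, elapsed_ms


# Hypothetical stand-in for a PDP: a dictionary lookup with a default decision.
policy = {("alice", "book", "read", "weekday"): "Permit"}


def toy_pdp(request):
    return policy.get(request, "NotApplicable")


for n in (1000, 2000, 3000):
    reqs = generate_requests(["alice", "bob"], ["book"], ["read", "borrow"], ["weekday"], n)
    decisions, ms = time_evaluation(toy_pdp, reqs)
    print(n, len(decisions), f"{ms:.3f} ms")
```

Fixing the seed keeps the request stream identical across engines, so differences in elapsed time reflect the engine rather than the sample.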


Table 3
Evaluation time (ms) of LMS.

| Requests | 1000 | 2000 | 3000 | 4000 | 5000 | 6000 | 7000 | 8000 | 9000 | 10 000 |
|---|---|---|---|---|---|---|---|---|---|---|
| Sun PDP | 917 | 1811.2 | 2688.2 | 3661.2 | 4470.6 | 5427.6 | 6346.4 | 7181.6 | 8261.2 | 9111.2 |
| HPEngine | 76.2 | 150.5 | 223.7 | 304.9 | 373.5 | 451.8 | 529.4 | 597.8 | 688.9 | 758.9 |
| XEngine | 93.8 | 250.4 | 309.2 | 413 | 439.8 | 525.6 | 569.8 | 610 | 720.4 | 760.2 |
| XDPMOE | 0.52 | 0.57 | 0.74 | 0.94 | 1.1 | 1.3 | 1.5 | 1.8 | 2.0 | 2.2 |

Table 4
Evaluation time (ms) of VMS.

| Requests | 1000 | 2000 | 3000 | 4000 | 5000 | 6000 | 7000 | 8000 | 9000 | 10 000 |
|---|---|---|---|---|---|---|---|---|---|---|
| Sun PDP | 1936.8 | 3894.8 | 5719.6 | 7836.2 | 9529 | 11 809.8 | 13 691.2 | 16 016.6 | 17 604.8 | 18 993.4 |
| HPEngine | 68.9 | 139.3 | 203.9 | 280.0 | 340.4 | 422.8 | 489.1 | 569.6 | 627.9 | 678.5 |
| XEngine | 128.2 | 183 | 280 | 379.8 | 397 | 462.8 | 529.8 | 585.2 | 681.6 | 715 |
| XDPMOE | 0.50 | 0.54 | 0.78 | 1.0 | 1.1 | 1.4 | 1.6 | 1.8 | 2.0 | 2.4 |

Table 5
Evaluation time (ms) of ASMS.

| Requests | 1000 | 2000 | 3000 | 4000 | 5000 | 6000 | 7000 | 8000 | 9000 | 10 000 |
|---|---|---|---|---|---|---|---|---|---|---|
| Sun PDP | 2249.6 | 4570.2 | 6688.8 | 9230.6 | 11 112.4 | 13 717 | 15 242 | 17 648.6 | 20 235.6 | 22 245.2 |
| HPEngine | 2.0 | 4.1 | 5.9 | 8.2 | 9.9 | 12.2 | 13.6 | 15.7 | 18.0 | 19.8 |
| XEngine | 160.2 | 221.2 | 316.6 | 372.6 | 446.8 | 492.4 | 574.4 | 657.2 | 728.4 | 786.2 |
| XDPMOE | 0.49 | 0.52 | 0.74 | 1.1 | 1.1 | 1.3 | 1.6 | 1.8 | 2.0 | 2.4 |
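As a worked check on Tables 3–5, the ratio between the evaluation time of each baseline engine and that of XDPMOE at 10 000 requests can be computed directly. The dictionary below copies only the 10 000-request column of the three tables, and the helper name `speedup` is hypothetical.

```python
# Evaluation time (ms) at 10 000 requests, copied from Tables 3-5.
times_at_10000 = {
    "LMS": {"Sun PDP": 9111.2, "HPEngine": 758.9, "XEngine": 760.2, "XDPMOE": 2.2},
    "VMS": {"Sun PDP": 18993.4, "HPEngine": 678.5, "XEngine": 715.0, "XDPMOE": 2.4},
    "ASMS": {"Sun PDP": 22245.2, "HPEngine": 19.8, "XEngine": 786.2, "XDPMOE": 2.4},
}


def speedup(row, baseline, engine="XDPMOE"):
    """How many times longer `baseline` takes than `engine` on the same requests."""
    return row[baseline] / row[engine]


for policy_set, row in times_at_10000.items():
    ratios = {
        name: round(speedup(row, name))
        for name in ("Sun PDP", "HPEngine", "XEngine")
    }
    print(policy_set, ratios)
```

With the tabulated values, the Sun PDP takes roughly 4141, 7914 and 9269 times as long as XDPMOE on LMS, VMS and ASMS, respectively, while the HPEngine ratio on ASMS is close to 8.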

7.3.1. Storage time

In our experiments, the Sun PDP, XEngine and HPEngine serve as the decision engines that evaluate requests for comparison with XDPMOE. In the preprocessing stage, we introduced a reordering step to reduce the time required to store the policy set. The experimental results are shown in Fig. 12. We observe that when the policy set is reordered, the time needed to store it as bitmaps in the text part of the rule dictionary is greatly reduced. Unlike the unordered policy set, the reordered policy set allows the storage position of each rule's effect to be calculated from the |S|, |R|, |A| and |C| values corresponding to that rule, so rules do not need to be matched one by one against the original policy set. As a consequence, the bitmap is created faster.

7.3.2. Evaluation time

After verifying that reordering drastically reduces the time spent creating bitmaps, we compare XDPMOE with the Sun PDP, XEngine and HPEngine to verify that XDPMOE effectively improves PDP evaluation performance. We randomly generated 1000, 2000, ..., 10 000 access requests to measure the PDP evaluation time. Figs. 13–15 show how each engine's evaluation time varies with the number of access requests under the three policy sets. In Figs. 13–15, we observe that:

• In the case of LMS, VMS and ASMS, the evaluation efficiency of XDPMOE is greatly improved compared to the other three methods, especially for LMS and VMS.
• Compared to the Sun PDP, XEngine and HPEngine, the evaluation time of XDPMOE grows more slowly as the number of requests increases.

It is worth noting from Figs. 13–15 that HPEngine's performance varies markedly across policy sets. For a policy set with more Subjects, such as ASMS, its performance is significantly improved compared to other engines. In order to display the changes more intuitively, we conducted further comparisons for the ASMS policy set. The average evaluation times of the Sun PDP, XEngine, HPEngine and XDPMOE when the number of access requests reaches 1000, 2000, ..., 10 000 are shown in Tables 3–5. From Tables 3–5, we can observe that:

• The average time required for XDPMOE to evaluate access requests is lower than that of the Sun PDP, XEngine and HPEngine.
• Across the different policy sets, XDPMOE's average evaluation time changes little and is relatively stable compared to the Sun PDP, XEngine and HPEngine.
• When the number of access requests reaches 10 000, the average evaluation time of XDPMOE in LMS, VMS and ASMS is reduced to about 1/4141, 1/7914 and 1/9269 of that of the Sun PDP, respectively.

8. Conclusions and future work

This paper has addressed policy evaluation in XACML, which is applied in many settings and suffers from a bottleneck in efficiency and processing speed. To solve this problem, the policy matching optimization engine XDPMOE is presented. This engine optimizes the matching process of a request by preprocessing the policy set. The rule dictionary is the core of the preprocessing stage: its text part stores the policy set, and key–value pairs store the relative position of each attribute in the dictionary content. The bitmap storage in the text part of the dictionary greatly reduces the space occupied. We have implemented a prototype of the proposed techniques and demonstrated its effectiveness and efficiency through extensive experiments in comparison with the Sun PDP, XEngine and HPEngine.

In the future, we will look for more optimized storage methods and policy matching rules. By classifying different types of policy sets, we will seek the best method under different conditions and improve efficiency as much as possible while keeping space occupancy low.

Acknowledgments

This work is supported by the Science Research Plan Project of the Education Department of Shaanxi Province, China (18JK0507), the Natural Science Foundation of Shaanxi Province, China (2017JQ6053), and the National Natural Science Foundation of China (61702408).
This work was also supported by the Innovation Group for Interdisciplinary Computing Technologies, College of Computer Science and Technology, China, Xi’an University of Science and Technology, China.


References

[1] P. Angulo, C.C. Guzmán, G. Jiménez, D. Romero, A service-oriented architecture and its ICT-infrastructure to support eco-efficiency performance monitoring in manufacturing enterprises, Int. J. Comput. Integr. Manuf. 30 (1) (2017) 13.
[2] M.A. Rahman, E. Al-Shaer, Automated synthesis of distributed network access controls: a formal framework with refinement, IEEE Trans. Parallel Distrib. Syst. 28 (2) (2017) 416–430.
[3] A. Mourad, H. Jebbaoui, SBA-XACML: Set-based approach providing efficient policy decision process for accessing Web services, Expert Syst. Appl. 42 (1) (2015) 165–178.
[4] F. Turkmen, J.D. Hartog, S. Ranise, N. Zannone, Formal analysis of XACML policies using SMT, Comput. Secur. 66 (5) (2017) 185–203.
[5] D. Ferraiolo, R. Chandramouli, R. Kuhn, V. Hu, Extensible access control markup language (XACML) and next generation access control (NGAC), in: Proceedings of the 2016 ACM International Workshop on Attribute Based Access Control, 2016, pp. 13–24.
[6] F. Deng, L.Y. Zhang, Elimination of policy conflict to improve the PDP evaluation performance, J. Netw. Comput. Appl. 80 (4) (2017) 45–57.
[7] F. Deng, L.Y. Zhang, B.Y. Zhou, J.W. Zhang, H.Y. Cao, Elimination of the redundancy related to combining algorithms to improve the PDP evaluation performance, Math. Probl. Eng. 7608408 (2016) 1–18.
[8] W. Zhang, H. Song, D. Wang, J. Wang, BitMap-Based sharing image storage management scheme in transparent computing, Comput. Eng. Appl. 53 (13) (2017) 83–89.
[9] M. St-Martin, A.P. Felty, A verified algorithm for detecting conflicts in XACML access control rules, in: Proceedings of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs, 2016, pp. 166–175.
[10] L.C. Sarkis, V.T. da Silva, C. Braga, Detecting indirect conflicts between access control policies, in: Proceedings of the 31st Annual ACM Symposium on Applied Computing, 2016, pp. 1570–1572.
[11] B. Stepien, A. Felty, Using expert systems to statically detect dynamic conflicts in XACML, in: Proceedings of the 11th International Conference on Availability, Reliability and Security, 2016, pp. 127–136.
[12] R.A. Shaikh, K. Adi, L. Logrippo, A data classification method for inconsistency and incompleteness detection in access control policy sets, Int. J. Inf. Secur. 16 (1) (2017) 91–113.
[13] H. Jebbaoui, A. Mourad, H. Otrok, R. Haratya, Semantics-based approach for detecting flaws, conflicts and redundancies in XACML policies, Comput. Electr. Eng. 44 (5) (2015) 91–103.
[14] D. Lin, P. Rao, R. Ferrini, E. Bertino, J. Lobo, A similarity measure for comparing XACML policies, IEEE Trans. Knowl. Data Eng. 25 (9) (2013) 1946–1959.
[15] A. Mourad, H. Tout, C. Talhi, H. Otrok, H. Yahyaoui, From model-driven specification to design-level set-based analysis of XACML policies, Comput. Electr. Eng. 52 (5) (2016) 65–79.
[16] F. Turkmen, Y. Demchenko, On the use of SMT solving for XACML policy evaluation, in: Proceedings of the 2016 IEEE International Conference on Cloud Computing Technology and Science, 2017, pp. 539–544.
[17] L. Sun, J. Park, N. Dang, R. Sandhu, A provenance-aware access control framework with typed provenance, IEEE Trans. Dependable Secure Comput. 13 (4) (2016) 411–423.
[18] D. Díaz-López, G. Dólera-Tormo, F. Gómez-Mármol, G. Martínez-Pérez, Managing XACML systems in distributed environments through Meta-Policies, Comput. Secur. 48 (2) (2015) 92–115.
[19] X. Wang, W. Shi, Y. Xiang, J. Li, Efficient network security policy enforcement with policy space analysis, IEEE/ACM Trans. Netw. 24 (5) (2016) 2926–2938.
[20] D. Gope, M.H. Lipasti, Hash map inlining, in: International Conference on Parallel Architecture and Compilation Techniques, 2016, pp. 235–246.
[21] W. Imrich, I. Peterin, Cartesian products of directed graphs with loops, Discrete Math. 341 (5) (2018) 1336–1343.
[22] A.V. Vasilyev, V.B. Vasilyev, Difference equations in a multidimensional space, Math. Model. Anal. 21 (3) (2016) 336–349.
[23] H. Tian, Y. Chen, C. Chang, H. Jiang, Y. Huang, Y. Chen, Dynamic-hash-table based public auditing for secure cloud storage, IEEE Trans. Serv. Comput. 10 (5) (2017) 701–714.
[24] N. Auger, C. Nicaud, C. Pivoteau, Merge strategies: from merge sort to timsort, 2015, https://www.researchgate.net/publication/282679394_Merge_Strategies_from_Merge_Sort_to_TimSort.
[25] T. Hagerup, F. Kammer, On-the-fly array initialization in less space, 2017, https://www.researchgate.net/publication/320163445_On-theFly_Array_Initialization_in_Less_Space.
[26] S. Rus, C. Alias, C. Alias, L. Rauchwerger, Region array SSA, in: Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques, 2006, pp. 43–52.
[27] F. Deng, S.Y. Wang, L.Y. Zhang, X.Q. Wei, J.P. Yu, Establishment of attribute bitmaps for efficient XACML policy evaluation, Knowledge-Based Systems 143 (C) (2018) 93–101.
[28] Sun's XACML implementation. http://sunxacml.sourceforge.net.
[29] A.X. Liu, F. Chen, J.H. Hwang, T. Xie, Xengine: a fast and scalable XACML policy evaluation engine, in: Proceedings of the 2008 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2008, pp. 265–276.
[30] D.H. Niu, J.F. Ma, Z. Ma, C.N. Li, L. Wang, HPEngine: high performance XACML policy evaluation engine based on statistical analysis, J. Commun. 35 (8) (2014) 205–215.
[31] Y. Le Traon, T. Mouelhi, A. Pretschner, B. Baudry, Test-driven assessment of access control in legacy applications, in: Proceedings of the 2008 International Conference on Software Testing, Verification, and Validation, 2008, pp. 238–247.
[32] T. Mouelhi, F. Fleurey, B. Baudry, Y. Le Traon, A model-based framework for security policy specification, deployment and testing, in: Proceedings of the 11th International Conference on Model Driven Engineering Languages and Systems, 2008, pp. 537–552.
[33] T. Mouelhi, Y. Le Traon, B. Baudry, Transforming and selecting functional test cases for security policy testing, in: Proceedings of the 2009 International Conference on Software Testing Verification and Validation, 2009, pp. 171–180.
[34] N. Dan, S. Hua-Ji, C. Yuan, G. Jia-Hu, Attribute based access control (ABAC)-based cross-domain access control in service-oriented architecture (SOA), in: Proceedings of the 2012 International Conference on Computer Science and Service System, 2012, pp. 1405–1408.
[35] W. She, I. Yen, F. Bastani, B. Thuraisingham, Role-based integrated access control and data provenance for SOA based net-centric systems, in: Proceedings of the 2011 IEEE 6th International Symposium on Service Oriented System Engineering, 2011, pp. 225–234.
[36] K. Martiny, D. Elenius, G. Denker, Protecting privacy with a declarative policy framework, in: IEEE International Conference on Semantic Computing, 2018, pp. 227–234.
[37] C.D.P.K. Ramli, H.R. Nielson, F. Nielson, The logic of XACML, Sci. Comput. Program. 83 (4) (2014) 80–105.
