Information Systems 72 (2017) 205–217
Privacy preserving mechanisms for optimizing cross-organizational collaborative decisions based on the Karmarkar algorithm

Hui Zhu a, Hongwei Liu b, Carol XJ Ou c, Robert M. Davison d,∗, Zherui Yang e

a School of Management, Guangzhou University, Guangzhou, China
b School of Management, Guangdong University of Technology, China
c Department of Management, Tilburg University, The Netherlands
d Department of Information Systems, City University of Hong Kong, Hong Kong
e Department of Technology and Operations Management, Rotterdam School of Management, Erasmus University Rotterdam, The Netherlands
Article info
Article history: Received 2 February 2016 Revised 9 August 2017 Accepted 18 October 2017
Keywords: Collaborative optimization; Privacy preserving mechanisms; The Karmarkar algorithm; Secure Multi-Party Computation (SMC); Secure Two-Party Computation (STC)
Abstract

Cross-organizational collaborative decision-making involves a great deal of private information which companies are often reluctant to disclose, even when they need to analyze data collaboratively. The lack of effective privacy-preserving mechanisms for optimizing cross-organizational collaborative decisions has become a challenge for both researchers and practitioners. It is even more challenging in the era of big data, since data encryption and decryption inevitably increase the complexity of calculation. In order to address this issue, in this study we introduce the Karmarkar algorithm as a way of dealing with the privacy-preserving distributed linear programming (LP) needed for secure multi-party computation (SMC) and secure two-party computation (STC) in scenarios characterised by mutual distrust and semi-honest participants without the aid of a trusted third party. We conduct two simulations to test the effectiveness and efficiency of the proposed protocols by revising the Karmarkar algorithm. The first simulation indicates that the proposed protocol can obtain the same outcome values compared to no-encryption algorithms. Our second simulation shows that the computational time in the proposed protocol can be reduced, especially for a high-dimensional constraint matrix (e.g., from 100 × 100 to 1000 × 1000). As such, we demonstrate the effectiveness and efficiency that can be achieved in the revised Karmarkar algorithm when it is applied in SMC. The proposed protocols can be used for collaborative optimization as well as privacy protection. Our simulations highlight the efficiency of the proposed protocols for large data sets in particular.

© 2017 Elsevier Ltd. All rights reserved.
1. Introduction

The advent of big data analytics has enabled the measurement and analysis of variables that could only be measured with difficulty in the past, such as public sentiment, opinion, actual behavior and personal data. In order to optimize inter-organizational collaborative decision-making, the involved parties need to share and analyze both their own data and data from their business partners and even competitors [1]. However, the involved organizations may be reluctant to share and disclose such information [2–4]. This leads to a significant challenge for researchers and practitioners [5]: How can organizations reconcile the contradiction between optimizing decision making based on the shared
information and simultaneously protecting the individual private information contained in the shared data? For instance, collaborative supply chain management emphasizes multi-party participation (e.g., among material suppliers, manufacturers, logistics providers and distributors) in order both to optimize supply chain arrangements as a whole [6] and to achieve smooth collaboration between adjacent nodes of a supply chain in demand forecasting [7]. Such multi-party participation can eliminate potential demand variations [8]. Similarly, a collaborative center based on distribution networks was proposed as a solution to harmonize third-party logistics providers who coordinate shipments between suppliers and customers [9]. However, for these distributed linear programming (LP) problems, supply chain participants are often reluctant to share their private information directly with third parties, considering the high risk that the private information may be disseminated to other unauthorised parties. As a result, concerns about information security
Fig. 1. The overview of processing private data without a trusted third party.
significantly hinder collaborative decision making and optimal supply chain arrangements. Similar challenges exist in other areas, such as health care centers [10], credit and loan evaluation processes [11], government organizations [12], and database marketing and Web usage analysis companies [13], which use sensitive data from distributed databases held by different parties to make predictions and decisions. Despite the well-known benefits, many organizations are averse to sharing individuals' private information. A commonly adopted strategy to address problems related to private information sharing is to assume the trustworthiness of the participants, or to assume the existence of a trusted third party to facilitate data exchange. In reality, however, there cannot be a completely trusted third party where the pursuit of commercial interests is concerned. Therefore, in this study we propose an encryption protocol that integrates Secure Multi-Party Computation (SMC) with the Karmarkar algorithm for sharing and calculating private information in the context of cross-organizational collaboration. We also address the issue of computation speed for processing big data.

1.1. Problem statement

In this study, we introduce SMC to address the above-mentioned problems. As shown in Fig. 1, the information needed for collaboration often resides in different data holders' individual databases. Moreover, such multi-party collaboration may be characterised by a low level of trust due to direct competition between the parties. Nevertheless, all parties recognize the benefits brought by such a collaboration. Under the precondition of preserving privacy, all parties in the collaboration promise to provide their private data to achieve mutual benefit while planning to minimize costs, in a context without a trusted third party.

1.2. Our contribution

Although traditional SMC protocols may be effective in handling small data sets, they are often inefficient for the large data sets associated with big data. Even though a great deal of effort has been spent on addressing this weakness, efficient and effective solutions for multi-party computation (MPC) based on large private inputs have yet to be developed. Moreover, regardless of computational efficiency, it is impossible to directly apply the theoretical MPC work designed for small datasets to form secure protocols for large datasets in distributed information analysis.
In order to address the above issues, we attempt to identify potential solutions that balance practicality and efficiency. Specifically, we introduce the revised Karmarkar algorithm to deal with privacy-preserving distributed LP for MPC and two-party computation (TPC) in the scenario of mutual distrust and semi-honest participants. Our contributions can be summarized as follows: (1) both collaborative effectiveness and efficiency can be achieved by applying the revised Karmarkar algorithm in the context of SMC, as confirmed by our simulation results; (2) the proposed protocols are suitable for large data sets, as validated by our simulations; (3) the proposed protocols are applicable to both collaborative optimization and the protection of private information.

Following this introduction, the rest of this paper is structured as follows. Section 2 explains the preliminaries for our approach, focusing in particular on the reasons why we chose the Karmarkar algorithm for SMC. Section 3 defines the research problems, followed by the secure protocol with security proof and complexity analysis for multi-party collaboration. We then detail the processes for two-party collaboration in Section 4. Finally, we experimentally validate the protocol and analyze the results in Section 5, and conclude the paper with implications and future work in Section 6.

2. The model

2.1. Original Secure Multi-Party Computation

A number of scholars [e.g., 4,14] have designed different information security mechanisms. Specifically, Yao [15] put forward a constant-round protocol for Secure Two-Party Computation (STC) in the presence of semi-honest adversaries in the context of auctions and bidding. His work has received wide attention in the field of information security. The semi-honest situation is defined as follows: "participants are expected to follow protocol specifications [16], but they record intermediate values observed during the protocol which can be employed to compromise security" [17]. The concept of STC was later generalized to SMC [18], which has now become a subfield of cryptography "with the goal to create methods for parties to jointly compute a function over their inputs, and keeping these inputs private" [19]. In order to specify the security protocols and better apply the concept of SMC in reality, efficient SMC frameworks were proposed [20]. Among them, the invertible-matrix and commodity-based approaches [21] were offered to modify the basic SMC model, subsequently suggesting a better balance of efficiency
and privacy protection in SMC. Similarly, other researchers included a sub-protocol for improving the efficiency of unconditional SMC [22], as well as for reducing geometrical problems to scalar products [23]. When security depends on the intractability of the composite problem, the computationally secure scalar-product protocol is considered an effective solution [24]. Furthermore, the protocol based on fast private association rule mining can also address issues related to the secure sharing of distributed information and improvements in efficiency [25]. Similarly, Evfimievski and his colleagues [26] presented a more efficient method of secure outsourcing for linear program computation by privacy-preserving transformation. Here it is worth mentioning the work by Du et al. [27], considering their classical SMC algorithm. In this study, we also compare our proposed algorithms and simulation results with those of Du et al. [27]. A brief overview of their method is given below. The maximum entropy problem is a linear programming problem with equality constraints; it is a special case of the following general form:
$$\max \; \xi = C^{T} X \quad \text{s.t.} \quad (A_1^{T}, A_2^{T}, \cdots, A_n^{T})^{T} X = (b_1^{T}, b_2^{T}, \cdots, b_n^{T})^{T}, \quad X \ge 0 \tag{1}$$
Assume that C, X ∈ R^n is the information to be shared. Participant P_i has the private information A_i, b_i, where A_i ∈ R^{m_i × n}, b_i ∈ R^{m_i} (m_1 + m_2 + ··· + m_n ≤ n). The secure protocol then proceeds as follows.

Step 1: Compute a dual feasible basis of the problem, confirm the initial basic feasible solution, and establish the initial simplex tableau.
Step 2: Without leaking their private information, participants P_1, P_2, ..., P_n collaboratively check the test parameter σ. If σ ≤ 0, the optimal solution has been obtained and the computation ends. Otherwise, move to the next step.
Step 3: According to max(σ) > 0, choose the substitution variable as
$$\theta = \min_i \left\{ \frac{(B^{-1} b)_i}{(B^{-1} P_j)_i} \;\middle|\; (B^{-1} P_j)_i > 0 \right\}, \qquad B = N_1 + N_2 + \cdots + N_m,$$
where P_j represents participants other than participant P_i. Each participant P_i calculates θ_i alone and obtains min(θ_i), and then compares the values of min(θ_i) to swap the variables.
Step 4: Find the new basic variable and its corresponding basic feasible solution, and obtain the new simplex tableau.
Step 5: Repeat Steps 2–4, and end when σ ≤ 0.

The original secure two-party computation [27] is described as follows. Participant Alice has a matrix M_1 and a vector b_1, and participant Bob has a matrix M_2 and a vector b_2, where M_1 and M_2 are n × n matrices and b_1 and b_2 are n-dimensional vectors. Without disclosing their private inputs to the other party, Alice and Bob want to solve the linear equation problem (M_1 + M_2)x = b_1 + b_2. The solution to this system is equivalent to the solution of P(M_1 + M_2)Q · Q^{-1}x = P(b_1 + b_2) for suitable invertible matrices P and Q. If Alice knows M_0 = P(M_1 + M_2)Q and b_0 = P(b_1 + b_2), she can solve the masked problem M_0 x' = b_0 and thus obtain the final solution x, where x = Q x'.

The approach of a secure simplex tableau solves the problem of private information transfer between two parties. However, it fits the calculation only for a low-dimensional matrix, while the efficiency for a high-dimensional matrix is compromised. Thus, we introduce the Karmarkar algorithm to revise the original SMC algorithm [27] in order to address efficiency and effectiveness in the case of a high-dimensional matrix, as described in Section 2.2. A small numerical sketch of the two-party masking idea just described is given below.
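The following is a minimal single-process sketch of that masking transformation (our own illustration, not the code of [27]); the secure exchange of the masked quantities is abstracted away, and numpy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# Private inputs: Alice holds (M1, b1), Bob holds (M2, b2).
M1, b1 = rng.normal(size=(n, n)), rng.normal(size=n)
M2, b2 = rng.normal(size=(n, n)), rng.normal(size=n)

# Random invertible masks P and Q (how they are agreed without revealing the
# raw inputs is part of the protocol and is omitted here).
P = rng.normal(size=(n, n))
Q = rng.normal(size=(n, n))

# Alice only ever sees the masked system M0 x' = b0.
M0 = P @ (M1 + M2) @ Q
b0 = P @ (b1 + b2)
x_prime = np.linalg.solve(M0, b0)

# Unmasking: x = Q x' solves the original joint system (M1 + M2) x = b1 + b2.
x = Q @ x_prime
assert np.allclose((M1 + M2) @ x, b1 + b2)
```

The point of the masking is that Alice solves a system whose coefficients reveal neither M_2 nor b_2, yet the unmasked solution coincides with that of the joint system.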
2.2. Privacy-preserving computation based on the Karmarkar algorithm

Karmarkar [28] was the first scholar to propose the interior point method based on a scaling algorithm, which was later recognized as an effective way to solve LP problems. Given that its solution is suitable for large matrix calculations, the Karmarkar algorithm has been increasingly used for optimization calculations in different fields and has been further developed by modifying the algorithm to achieve lower computational cost and better computational efficiency [29–32]. Specifically, Todd and Burrell [33] suggested a method to solve problems with unknown optimal values by a modification of the Karmarkar algorithm. Their modified Karmarkar algorithm was used to solve the primal and dual problems in estimating the optimal value [34]. Moreover, Adler et al. [35] presented the data structures and programming techniques for an implementation of Karmarkar's algorithm that makes use of direct factorization at each iteration via an interpretative Gaussian elimination scheme. To improve computational efficiency, Todd [36] suggested the use of subproblems in a Dantzig–Wolfe decomposition to reduce the number of inversion steps and achieve greater efficiency. Based on the same random test problems as in Todd [36], Anstreicher and Watteyne [37] considered a new family of search directions, which may reduce the total number of required iterations by over 30%. Even if the optimal objective value or the projections cannot be computed exactly, the Karmarkar algorithm can handle the inputs properly through suitably developed variants [38]. Moreover, this modification of the Karmarkar algorithm can generate scaled columns in numerical experiments for LP [39]. To summarize the above-mentioned studies, the Karmarkar algorithm has been used to solve LP problems; it is an effective approach to pursue feasibility and optimality simultaneously, and it uses projective transformations to establish a polynomial time complexity bound for LP that was far better than any previously known bound [40]. Meanwhile, the Karmarkar algorithm is considered practically efficient in handling real problems, especially for large and highly constrained LP problems [41,42].

In this study, we introduce an encryption algorithm based on SMC to address privacy issues in collaborative optimization decisions. Assume there are participants P_1, P_2, ..., P_m working together to solve an LP optimization problem, in which P_i has private information M_i and b_i, where M_i ∈ R^{(r−1)×(n−2)}, b_i ∈ R^{r−1}, together with the information to be shared C, X, where C, X ∈ R^{n−2}. The corresponding mathematical model is:
$$\min \; \xi = C^{T} X \quad \text{s.t.} \quad (M_1 + M_2 + \cdots + M_m)\, X = b_1 + b_2 + \cdots + b_m, \quad X \ge 0 \tag{2}$$
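For concreteness, the sketch below (with made-up toy numbers) shows what the combined problem of model (2) looks like when it is solved centrally without any privacy protection; this is exactly the baseline that the protocols developed in this paper avoid. scipy is assumed.

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance of model (2) with two participants (illustrative numbers only).
# Participant 1 holds (M1, b1); participant 2 holds (M2, b2); C is shared.
M1 = np.array([[1.0, 2.0, 0.0]])
b1 = np.array([2.0])
M2 = np.array([[0.0, 1.0, 3.0]])
b2 = np.array([4.0])
C = np.array([1.0, 2.0, 1.0])

# Central (non-private) solution of min C^T X s.t. (M1 + M2) X = b1 + b2, X >= 0.
res = linprog(C, A_eq=M1 + M2, b_eq=b1 + b2, bounds=(0, None), method="highs")
print(res.x, res.fun)
```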
From model (2), we can obtain the following model (3):

$$\min \; \xi = \bar{C}^{T} X \quad \text{s.t.} \quad M X = 0, \; e^{T} X = n, \; X \ge 0 \tag{3}$$

where $\bar{c}_j = c_j - \frac{f}{n}$ $(j = 1, 2, \cdots, n)$, $0 = (0, 0, \cdots, 0)^{T} \in R^{n}$,

$$M = \begin{pmatrix} \sum_{i=1}^{m} M_i & -\sum_{i=1}^{m} b_i & 0 \\ e^{T} & -1 & 1 \end{pmatrix} \in R^{r \times n}, \qquad B = N_1 + N_2 + \cdots + N_m,$$

and participant $P_i$ has the private information $N_i = \begin{pmatrix} M_i D \\ e^{T} \end{pmatrix}$, in which $D$ refers to the Hessian matrix that describes the local curvature of the Karmarkar function.
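For orientation, the following minimal sketch (our own illustration, with toy dimensions and a placeholder for the matrix D) shows the linear algebra that the later protocols compute collaboratively: the sum B = N_1 + ··· + N_m of the participants' private blocks and the projected cost direction c_p = −[I − B^T(BB^T)^{−1}B]DC that is discussed in the next paragraph and drives a Karmarkar step. numpy is assumed.

```python
import numpy as np

rng = np.random.default_rng(1)
m, r, n = 3, 4, 8          # participants, rows of B, columns of B (toy sizes)

# Each participant Pi holds a private block Ni; only the sum B = N1 + ... + Nm
# is needed by the Karmarkar iteration (this is what Protocol 1 later computes
# without revealing any individual Ni).
N = [rng.normal(size=(r, n)) for _ in range(m)]
B = sum(N)

D = np.diag(rng.uniform(1.0, 2.0, size=n))   # placeholder for the scaling matrix D
C = rng.normal(size=n)                        # shared cost vector

# Projected cost direction: c_p = -[I - B^T (B B^T)^{-1} B] D C
P_B = B.T @ np.linalg.inv(B @ B.T) @ B        # projection onto the row space of B
c_p = -(np.eye(n) - P_B) @ D @ C
print(c_p)
```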
Fig. 2. Conceptual information exchange in SMC.
According to the algorithm, only the computation of the projected direction c_p = −[I − B^T(BB^T)^{-1}B]DC requires exchanging private information. Two cases have to be distinguished: the number of participants is three or above (p ≥ 3), or the number of participants is two (p = 2). When p ≥ 3, the sum B can be computed securely, and the optimal result is then obtained with the Karmarkar algorithm. When p = 2, however, B can easily be reconstructed by either participant, so the Karmarkar algorithm cannot be used directly. In that case (BB^T)^{-1} must be calculated first, leading to a dedicated procedure for computing B^T(BB^T)^{-1}B.

3. Secure Multi-Party Computation (SMC)

Information exchange is the most important part of the process of SMC. The information needs to be transmitted through a secure channel in order to preserve the confidentiality of the inputs of the multiple participants. In order to start the calculation, we need to know which part of the information is public and which part is private at each step [43,44]. In this study, we distinguish three roles in the process: input parties (IP), output parties (OP), and computational parties (CP). In different steps, however, a participant's role can change (for instance, the input party in the first step of the calculation can be the party that receives the output in the next step). The conceptual processes differ between SMC and STC; we explain the details below.

3.1. Information exchange in Secure Multi-Party Computation (SMC)

In the process of sensitive information exchange, each participant has private information that can be randomly divided into m pieces in different (rather than equal) proportions, which is considered the safer method. Such a random split needs to satisfy two conditions: (1) the sum of the split pieces equals the original input; (2) participants have a mechanism to check whether there is a conspiracy among the others. As we show in Fig. 2, P_1 has m pieces a_1, a_2, ..., a_m, P_2 has m pieces b_1, b_2, ..., b_m, and P_m has m pieces d_1, d_2, ..., d_m. Each participant is assumed to hold a random piece of information. The splitting and transmission process involves a cycle from the first participant (i.e., the initiator) to the last participant, so the whole piece of information is transmitted in a ring: each participant receives an encrypted input that is automatically formulated by the algorithm and then sent on to the next participant. We randomly select the party P_1 as the protocol initiator who starts the computation by sending out the first data segment. Every participant holds an equal number of pieces of information. The number of participants for this SMC protocol must be three or above [45]. Each participant exchanges information m times in each round of the computation. In the last step, the initiating participant makes the final calculation result public when all the rounds of calculations are completed. The detailed process is presented in Fig. 2.

3.2. Protocol process
Protocol 1. Matrix summation problem

Input: p (p ≥ 3); the private matrix N_i of each participant.
Output: B = N_1 + N_2 + ··· + N_m, such that P_i does not learn N_k (k ≠ i).

1. Each participant P_i (1 ≤ i ≤ m) randomly generates r × n matrices Q_i1, Q_i2, ..., Q_im with N_i = Σ_{j=1}^{m} Q_ij; the matrices obey a hybrid distribution.
2. P_i keeps Q_ii and sends Q_ij to participant P_j (j = 1, 2, ..., m; j ≠ i), where P_j denotes the participants other than P_i.
3. Each participant receives the matrices in a random but sequential order.
4. Set rc = m and S_ij = 0, where rc refers to the number of communication rounds; participant P_i receives the running sum from participant P_{i−1}, adds its own piece to obtain S_ij, and then sends S_ij to P_{i+1}.
5. While rc ≠ 0:
   for j = 1 to m − 1
     for i = 1 to m − 1
       P_i sends S_ij = Q_ij + S_{i−1,j} to P_{(i+1) mod m}
   rc = rc − 1
6. The first participant P_1 sums up the received matrices to obtain the final S_ij and publishes the output B.

Note: in the general condition, each multidimensional matrix has an infinite number of solutions, so it is hard for participants to identify or guess the private information. Nevertheless, it is a special case to make the m-dimensional matrices diagonal (only when m = 1 is the solution set finite), in which the security level of a diagonal matrix is very low.

3.3. Security analysis

In accordance with the principle of the distribution in Protocol 1, the process follows the procedure described in Section 3.2. Firstly, every participant has private information that is randomly divided into m pieces in different proportions. Secondly, every participant randomly selects one piece and keeps it, and the remaining pieces are randomly allocated to the other participants. Every participant holds an equal number of pieces, which means that each participant owns one piece of its own information and one piece of information from every other participant. So if P_{i+1} and P_{i−1} collude, they can only calculate the new pieces of P_i that are redistributed by the protocol; they cannot learn N_i, the private information of participant P_i. As the number of participants increases, it becomes much harder for colluding participants to reconstruct all the information of the parties involved. As a result, if most participants are honest (i.e., do not collude), the probability of leaking private information to another party moves asymptotically towards zero.
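To make the splitting-and-ring summation of Protocol 1 concrete, here is a minimal single-process simulation (our own sketch; the network exchange and the conspiracy checks are abstracted away, and numpy is assumed). Consistent with the analysis above, only random shares and running sums are ever passed around, never an individual N_i.

```python
import numpy as np

rng = np.random.default_rng(2)
m, r, n = 4, 3, 5                      # participants and toy matrix size

# Private inputs N_1..N_m (Protocol 1 requires p >= 3 participants).
N = [rng.normal(size=(r, n)) for _ in range(m)]

# Steps 1-2: each P_i additively splits N_i into m random shares Q_i1..Q_im.
Q = []
for Ni in N:
    shares = [rng.normal(size=(r, n)) for _ in range(m - 1)]
    shares.append(Ni - sum(shares))    # shares sum exactly to N_i
    Q.append(shares)

# Steps 4-6: the shares are accumulated around the ring; only running sums
# are transmitted, one share per participant per round.
running = np.zeros((r, n))
for j in range(m):                     # round j
    for i in range(m):                 # participant P_i adds its j-th share
        running = running + Q[i][j]

B = running
assert np.allclose(B, sum(N))          # B = N_1 + ... + N_m, as required
```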
Fig. 3. The adversary model based on [46,47].
Fig. 4. The adversary model proposed in our paper.
3.3.1. Descriptions of the adversary model

We ground our adversary model on [46,47] and also make a comparison. The adversary model in [46,47] can be summarized as adding a trusted third party T to the original participants and assuming that T can provide the participants with a safe channel for secured communication (as shown in Fig. 3). In these models [46,47], the attacker can obtain the information of any participant s/he has corrupted, but the transmission between T and the honest participants is not attacked. This is an ideal model, which we also mentioned in Section 1.1. The ideal model [46,47] has complete security requirements. However, in reality it is extremely difficult to find a neutral third party trusted by all participants. So, as an alternative to the third-party solution, we propose simulators to compute the certified and secured process. In the adversary model in which no trusted third party is involved, the implementation process and the attacker's position in the process are visualized in Fig. 4 and described in the three steps below.

(1) In the corruption step, all participants belong to the set P; the attacker has to decide whom s/he will corrupt in the set of participants, named the corruptees A, A ⊆ P. The attacker knows the inputs of the corruptees, i.e., {x_i | p_i ∈ A}.
(2) In the calculation step, the honest participants send information to the designated participants in each round of the calculation according to the requirements of Protocols 1, 2 and 3.
(3) In the output step, each participant receives its output according to the requirements of Protocols 1, 2 and 3; the attacker knows every corruptee's output.

3.3.2. The attacker's capabilities and aims

In general, there are two alternative assumptions about the adversary's capabilities. First, the attacker is active and even intervenes in the protocols, with the corrupted participants executing the attacker's commands with respect to the revised protocols. Alternatively, the attacker is passive and the corrupted participants still execute the original protocols properly; that means the attacker can see the corrupted participants' information (i.e., the intermediate information) sent to the designated participants in each round of the calculations, but does not change the protocols. In our model, we assume a passive adversary.

3.4. The information leakage analysis

Based on the information exchange of the oblivious transfer (OT_n^1) approach shown in Fig. 2 in Section 3.1, each participant is
assumed to have a random piece of information. Thus, the chance (i.e., probability) for participants to guess the correct original private data is 1 out of p^m (i.e., 1/p^m), which can be a very small probability if a large enough p^m is chosen, where p is the number of participants and m is the number of random vectors into which the original private data is divided (Figs. 5 and 6). For illustration purposes, we explain the corresponding analyses for the three-party protocol, the four-party protocol, and the n-party protocol, and we assume m to be 10 in quantifying the information leakage. In the three-party protocol, the probability of information leakage is then 1/3^10, equivalent to 1/59049. This is a much smaller probability than the 1/3 of the traditional oblivious transfer (OT_n^1) approach. For the calculations associated with other numbers of participants, Table 1 explains the probability and the amount of information leakage. In addition, we quantify the amount of information leakage as follows. Assuming each participant has a private data matrix N_i, the amount of original private data can be calculated as the matrix Σ_{i=1}^{n} N_i, and the amount of information leakage can then be quantified as (1/p^m) · Σ_{i=1}^{n} N_i. Table 1 shows the probability and the amount of information leakage accordingly. It is clear that the more participants (p) engage in the information exchange and the larger the number of randomly assigned vectors (m), the significantly lower the probability of information leakage and the significantly smaller the amount of leaked information.
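As a quick arithmetic check of the figures quoted above and reported in Table 1 (our own verification snippet):

```python
# Probability of guessing the original private input: 1 / p**m.
for p in (3, 4):
    m = 10
    print(f"p={p}, m={m}: leakage probability = 1/{p**m}")
# p=3, m=10: leakage probability = 1/59049
# p=4, m=10: leakage probability = 1/1048576
```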
3.5. Complexity analysis

3.5.1. Computational complexity
In Protocol 1, each round involves m computations, so the m rounds require m² computations in total. Therefore, the computational complexity is

$$S(m) = m^2$$

3.5.2. Communication complexity
In Protocol 1, each participant has to transfer m − 1 pieces of information to the other participants, resulting in m(m − 1) transfers in the distribution stage. In the calculation stage, the partial results are passed around the ring, so information is transferred m times in each round, i.e., m² times over the m rounds. Thus, the communication complexity of Protocol 1 is

$$C(m) = m(m-1) + m^2 = 2m^2 - m$$

4. Secure Two-Party Computation (STC)

4.1. Information exchange in Secure Two-Party Computation (STC)

In the process of STC, as mentioned above, when p = 2 the sum B can easily be calculated by both participants. As a result, directly using the Karmarkar algorithm in two-party computation is not appropriate. Instead, (BB^T)^{-1} must be calculated first, before the results are shared for calculating B^T(BB^T)^{-1}B. We therefore design a specific protocol for two participants' collaboration based on the 1-out-of-n oblivious transfer protocol (OT_n^1) [48,49], in which "a sender transfers one of potentially many pieces of information to a receiver, but remains oblivious as to what piece (if any) has been transferred" [50]. The whole process is presented in Fig. 7. When only two participants collaborate, we need to analyze how to securely calculate c_p = −[I − B^T(BB^T)^{-1}B]DC, and in particular how to calculate B^T(BB^T)^{-1}B. Accordingly, we define the following computing tasks:
$$H = (BB^{T})^{-1} = \left[(N_1 + N_2)(N_1 + N_2)^{T}\right]^{-1} \tag{4}$$
Fig. 5. Computational complexity in Protocol 1.
Fig. 6. Communication complexity in Protocol 1.
Table 1
The revised oblivious transfer (OT_n^1) as proposed in our paper.

Number of participants (p) | Number of random vectors m (m = 10 as the example) | Probability of information leakage (1/p^m) | Amount of original private data (Σ_{i=1}^n N_i) | Amount of information leakage ((1/p^m) · Σ_{i=1}^n N_i)
3 | 10 | 1/59049 | N_1 + N_2 + N_3 | (1/59049)(N_1 + N_2 + N_3)
4 | 10 | 1/1048576 | N_1 + N_2 + N_3 + N_4 | (1/1048576)(N_1 + N_2 + N_3 + N_4)
n | 10 | 1/n^10 | N_1 + N_2 + ··· + N_n | (1/n^10)(N_1 + N_2 + ··· + N_n)

Note: m is the number of random vectors into which the original private data is divided; m is determined by the participants. Here we take m = 10 as an illustrative example to quantify the probability of information leakage and the amount of information leakage.
Fig. 7. Conceptual information exchange in Secure Two-Party Computation (STC).
Fig. 8. The detailed process of oblivious transfer (OT1 n ) protocol for two-party computation.
$$B^{T}(BB^{T})^{-1}B = (N_1^{T} + N_2^{T})\,H\,(N_1 + N_2) = N_1^{T} H N_1 + N_1^{T} H N_2 + N_2^{T} H N_1 + N_2^{T} H N_2 \tag{5}$$

4.2. Protocol process

For the matrix H, we design a two-party collaboration protocol based on the oblivious transfer protocol (OT_n^1) that computes H by matrix inversion and then shares the result. According to the form of B^T(BB^T)^{-1}B in (5), P_i (i = 1, 2) can calculate N_i^T H N_i by themselves. Specifically, the oblivious transfer protocol allows the sender P_1 (who has his/her own private information matrix N_1) to transmit a part of its inputs to the recipient in a manner that protects the private information of both parties. The detailed process is shown in Fig. 8. As shown in Fig. 8, in the oblivious transfer protocol the inputs of both parties are mixed and combined with random numbers in a random sequence for distraction purposes. As a result, even if the attacker obtains some information about N_2, s/he cannot know the exact sequence of the information, so the information is not identifiable by the counter-partner or a malicious attacker. We characterize this property as unidentifiability (meaning the information is not identifiable). We call this process the Inverse Matrix Protocol in TPC; it is shown in detail in Protocol 2. Meanwhile, we also design a matrix multiplication protocol for the two-party scenario that computes B^T(BB^T)^{-1}B. The details of this process are presented in Protocol 3 below.
Protocol 2. Inverse Matrix Protocol in TPC

Input: p (p = 2); participants P_i (i = 1, 2) have their own private information matrices N_i.
Output: H = (BB^T)^{-1} = [(N_1 + N_2)(N_1 + N_2)^T]^{-1}.

1. Participants P_1 and P_2 jointly appoint a number m, chosen so that p^m is large enough to make it improbable that the private information can be reconstructed.
2. P_1 generates r × n random matrices M_1, M_2, ..., M_m such that N_1 = Σ_{i=1}^{m} M_i.
3. P_2 generates a random r × r invertible matrix R_2.
4. For each random matrix M_j, j = 1, 2, ..., m, P_1 and P_2 execute the following steps:
   (1) P_1 randomly generates a private key k, where 1 ≤ k ≤ p;
   (2) P_1 sends Q_1, Q_2, ..., Q_p to P_2; all of these except Q_k = M_j are random matrices, only P_1 knows the private key k, and P_2 cannot tell which Q_i is M_j;
   (3) for each i = 1, 2, ..., p, P_2 calculates T_ji = R_2 Q_i + r_j, where r_j is also a random matrix;
   (4) by the oblivious transfer protocol OT_n^1, P_1 gets back T_j, where T_j = T_jk = R_2 Q_k + r_j = R_2 M_j + r_j.
5. P_2 calculates R = R_2 N_2 − Σ_{j=1}^{m} r_j and sends the result to P_1.
6. P_1 calculates S = Σ_{j=1}^{m} T_j + R = R_2 N_1 + Σ_{j=1}^{m} r_j + R_2 N_2 − Σ_{j=1}^{m} r_j = R_2 N_1 + R_2 N_2, and
   $$(SS^{T})^{-1} = \left[R_2 (N_1 + N_2)(N_1 + N_2)^{T} R_2^{T}\right]^{-1} = (R_2^{T})^{-1}\left[(N_1 + N_2)(N_1 + N_2)^{T}\right]^{-1} R_2^{-1} = (R_2^{T})^{-1}(BB^{T})^{-1} R_2^{-1},$$
   then sends the result to P_2.
7. Using R_2, P_2 calculates H = (BB^T)^{-1} = [(N_1 + N_2)(N_1 + N_2)^T]^{-1} and publishes the result.

Note: in the general condition, each multidimensional matrix has an infinite number of solutions, so it is hard for participants to identify or guess the private information. Nevertheless, it is a special case to make the m-dimensional matrices diagonal (only when m = 1 is the solution set finite), in which the security level of a diagonal matrix is very low.
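The algebra behind Protocol 2 can be checked with a short single-process simulation (our own sketch; the 1-out-of-p oblivious transfer and the decoy traffic of step 4 are abstracted away, and numpy is assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
r, n, m = 3, 6, 5                 # toy sizes

# Private inputs of the two parties.
N1 = rng.normal(size=(r, n))
N2 = rng.normal(size=(r, n))
B = N1 + N2                       # never materialized by either party in the protocol

# Step 2: P1 additively splits N1 into m random matrices M_1..M_m.
M = [rng.normal(size=(r, n)) for _ in range(m - 1)]
M.append(N1 - sum(M))

# Step 3: P2 picks a random invertible r x r mask R2.
R2 = rng.normal(size=(r, r))

# Step 4: via oblivious transfer, P1 retrieves only T_j = R2 M_j + r_j.
r_masks = [rng.normal(size=(r, n)) for _ in range(m)]
T = [R2 @ Mj + rj for Mj, rj in zip(M, r_masks)]

# Steps 5-6: P2 sends R = R2 N2 - sum(r_j); P1 forms S = R2 (N1 + N2).
R = R2 @ N2 - sum(r_masks)
S = sum(T) + R
SSinv = np.linalg.inv(S @ S.T)    # equals R2^{-T} (B B^T)^{-1} R2^{-1}

# Step 7: P2 unmasks with R2 and publishes H = (B B^T)^{-1}.
H = R2.T @ SSinv @ R2
assert np.allclose(H, np.linalg.inv(B @ B.T))
```

In the actual protocol, P_1 only ever sees the masked quantities R_2 M_j + r_j (never R_2 itself), while P_2 never sees N_1; the final assertion simply confirms that the unmasking in step 7 recovers (BB^T)^{-1}. Protocol 3 follows the same pattern, with R_2 Q_i H N_2 + r_j in place of R_2 Q_i + r_j.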
When the protocol is carried out, the view of participant p_1 can be presented as VIEW_1(x, y) = (x, r_1, m_1^1, ..., m_t^1) and, correspondingly, VIEW_2(x, y) = (y, r_2, m_1^2, ..., m_t^2), where r_i (i = 1, 2) is a random number generated by participant p_i and m_i^1 (m_i^2) represents the i-th message it has received. OUTPUT_i(x, y) (i = 1, 2) represents the output of participant p_i.

Definition. If there exist polynomial-time algorithms, denoted S_1 and S_2, that satisfy relations (6) and (7) given in Section 4.3 below, we say that the protocol privately computes f.
Protocol 3. Matrix multiplication protocol in TPC

Input: p (p = 2); participants P_i (i = 1, 2) have their own private information matrices N_i.
Output: V = N_1 H N_2.

1. Participants P_1 and P_2 jointly appoint a number m, chosen so that p^m is large enough to make it improbable that the private information can be reconstructed.
2. P_1 generates r × n random matrices M_1, M_2, ..., M_m such that N_1 = Σ_{i=1}^{m} M_i.
3. P_2 generates a random r × r invertible matrix R_2.
4. For each random matrix M_j, j = 1, 2, ..., m, P_1 and P_2 execute the following steps:
   (1) P_1 randomly generates a private key k, where 1 ≤ k ≤ p;
   (2) P_1 sends Q_1, Q_2, ..., Q_p to P_2; all of these except Q_k = M_j are random matrices, only P_1 knows the private key k, and P_2 cannot tell which Q_i is M_j;
   (3) for each i = 1, 2, ..., p, P_2 calculates T_ji = R_2 Q_i H N_2 + r_j, where r_j is also a random matrix;
   (4) by the oblivious transfer protocol OT_n^1, P_1 gets back T_j, where T_j = T_jk = R_2 Q_k H N_2 + r_j = R_2 M_j H N_2 + r_j.
5. P_2 calculates R = Σ_{j=1}^{m} r_j and sends the result to P_1.
6. P_1 calculates S = Σ_{j=1}^{m} T_j − R = R_2 N_1 H N_2 + Σ_{j=1}^{m} r_j − Σ_{j=1}^{m} r_j = R_2 N_1 H N_2, then sends the result to P_2.
7. Using R_2, P_2 calculates V = N_1 H N_2 and publishes the result.

Note: in the general condition, each multidimensional matrix has an infinite number of solutions, so it is hard for participants to identify or guess the private information. Nevertheless, it is a special case to make the m-dimensional matrices diagonal (only when m = 1 is the solution set finite), in which the security level of a diagonal matrix is very low.

4.3. Security analysis

In the two-party security analysis, we first introduce the semi-honest participant model. In this model, semi-honest participants are also called curious adversaries or passive attackers [51]. In theory, they are required to fully comply with the data processing protocol and to preserve data security. In reality, however, semi-honest participants may obtain intermediate results during the encryption process and try to use them to infer the other participant's input data. As an alternative to the passive adversary model, there is a possibility that a malicious attacker interrupts the data transmission process; such an attacker may modify the protocol of the encryption process during the calculation and transmission of data. In this paper, we assume that the two participants are semi-honest; following [52], they are characterized as described below.

Assume f : {0, 1}* × {0, 1}* → {0, 1}* × {0, 1}* is a functionality, where f_1(x, y) and f_2(x, y) denote the first and second elements of f(x, y), respectively, and Π is a two-party protocol for the computation process. The input of the protocol Π can be presented as (x, y). With the views and outputs defined above, Π privately computes f if:
$$\{(S_1(x, f_1(x, y)),\, f_2(x, y))\}_{(x,y)\in\{0,1\}^{*}\times\{0,1\}^{*}} \;\overset{c}{\equiv}\; \{(\mathrm{VIEW}_1(x, y),\, \mathrm{OUTPUT}_2(x, y))\}_{(x,y)\in\{0,1\}^{*}\times\{0,1\}^{*}} \tag{6}$$

$$\{(f_1(x, y),\, S_2(y, f_2(x, y)))\}_{(x,y)\in\{0,1\}^{*}\times\{0,1\}^{*}} \;\overset{c}{\equiv}\; \{(\mathrm{OUTPUT}_1(x, y),\, \mathrm{VIEW}_2(x, y))\}_{(x,y)\in\{0,1\}^{*}\times\{0,1\}^{*}} \tag{7}$$
where ≡^c denotes computational unidentifiability (meaning the information is not identifiable via the calculation); VIEW_1(x, y) and VIEW_2(x, y), OUTPUT_1(x, y) and OUTPUT_2(x, y) are related random variables, defined as functions of the same random execution. The intermediate information is unidentifiable due to the encryption. Specifically, the calculation process follows three rules.

(1) Correctness: as long as no one acts as an attacker in the transmission process, participants can obtain the information they need by following the protocols.
(2) The sender's security: based on the protocols, the receiver will only obtain the information s/he wants, but cannot obtain any other information from the sender.
(3) The receiver's security: based on the protocols, the sender cannot know which information the receiver needs or which information is selected by the receiver.

Theorem 1. Protocol 2 ensures that H = (BB^T)^{-1} = [(N_1 + N_2)(N_1 + N_2)^T]^{-1} is computed such that neither participant can learn the private inputs of the other participant.

Proof. We prove the security of Protocol 2 by constructing simulators based on the above definition. The working process of simulator S_1 is as follows:
• S_1 simulates VIEW_1(N_1, N_2), making {S_1(N_1, H), −} and {(VIEW_1(N_1, N_2), OUTPUT_2(N_1, N_2))} indistinguishable. Given the input (N_1, H), S_1 selects an invertible matrix R_2' to simulate R_2.
• S_1 randomly selects an N_2' that satisfies [(N_1 + N_2')(N_1 + N_2')^T]^{-1} = H, so that N_1 and N_2' are used in the simulation. Further, N_2' abides by the rules governing N_2, i.e., S_1 produces a matrix that complies with the rules of N_2; that means S_1 randomly selects N_2' to simulate N_2. At the same time, N_2' preserves the symmetry and invertibility of the corresponding matrices.
• S_1 uses the same random number r as P_1 to generate m random matrices M_1, M_2, ..., M_m, where N_1 = Σ_{i=1}^{m} M_i.
• S_1 generates a set of random matrices r_i (i = 1, 2, ..., m).

For the simulator S_1, according to its own information and the rules of N_2, S_1 can select an N_2' that satisfies [(N_1 + N_2')(N_1 + N_2')^T]^{-1} = H; N_2' is not the only option, and as long as a matrix meets the rules of N_2 it can be a candidate for S_1 to select.

In this protocol, VIEW_1(N_1, N_2) = (N_1, r, R_2 M_1 + r_1, R_2 M_2 + r_2, ..., R_2 M_m + r_m, R_2 N_2 − Σ_{j=1}^{m} r_j). We assume that S_1(N_1, H) = (N_1, r, R_2' M_1 + r_1, R_2' M_2 + r_2, ..., R_2' M_m + r_m, R_2' N_2' − Σ_{j=1}^{m} r_j). Because [(N_1 + N_2')(N_1 + N_2')^T]^{-1} = H, we obtain {S_1(N_1, H), −} ≡^c {VIEW_1(N_1, N_2), −}. Then we set up the simulator S_2 to simulate VIEW_2(N_1, N_2) so that {H, S_2(N_2, −)} and {OUTPUT_1(N_1, N_2), VIEW_2(N_1, N_2)} are indistinguishable. S_2 generates r × n matrices {(Q_11, Q_12, ..., Q_1p), (Q_21, Q_22, ..., Q_2p), ..., (Q_m1, Q_m2, ..., Q_mp)}, where each element is uniformly distributed, so that S_2(N_2, −) = {N_2, r, (Q_11, Q_12, ..., Q_1p), (Q_21, Q_22, ..., Q_2p), ..., (Q_m1, Q_m2, ..., Q_mp), (SS^T)^{-1}} and VIEW_2(N_1, N_2) = {N_2, r, (Q_11, Q_12, ..., Q_1p), (Q_21, Q_22, ..., Q_2p), ..., (Q_m1, Q_m2, ..., Q_mp), (SS^T)^{-1}}. For the random matrices Q_ij, we then get {H, S_2(N_2, −)} ≡^c {(OUTPUT_1(N_1, N_2), VIEW_2(N_1, N_2))}, which completes the proof of Theorem 1.

Theorem 2. Protocol 3 ensures that the calculation of V = N_1 H N_2 is both safe and secure, and that neither participant can learn the private inputs of the other participant.

Proof. We prove the security of Protocol 3 by constructing simulators based on the above definition. The working process of simulator S_1 is described below:

• S_1 simulates VIEW_1(N_1, N_2), making {S_1(N_1, V), −} and {(VIEW_1(N_1, N_2), OUTPUT_2(N_1, N_2))} indistinguishable. Given the input (N_1, V), S_1 selects an invertible matrix R_2' to simulate R_2.
• S_1 randomly selects an N_2' that satisfies N_1 H N_2' = V, and uses N_1 and N_2' for the simulation.
• S_1 uses the same random number r as P_1 to generate m random matrices M_1, M_2, ..., M_m, where N_1 = Σ_{i=1}^{m} M_i.
• S_1 generates a set of random matrices r_i (i = 1, 2, ..., m).

In this protocol, VIEW_1(N_1, N_2) = (N_1, r, R_2 M_1 H N_2 + r_1, R_2 M_2 H N_2 + r_2, ..., R_2 M_m H N_2 + r_m, Σ_{j=1}^{m} r_j). We assume that S_1(N_1, V) = (N_1, r, R_2' M_1 H N_2' + r_1, R_2' M_2 H N_2' + r_2, ..., R_2' M_m H N_2' + r_m, Σ_{j=1}^{m} r_j). Because N_1 H N_2' = V, we obtain {S_1(N_1, V), −} ≡^c {VIEW_1(N_1, N_2), −}. Then we set up the simulator S_2 to simulate VIEW_2(N_1, N_2) so that {V, S_2(N_2, −)} and {OUTPUT_1(N_1, N_2), VIEW_2(N_1, N_2)} are indistinguishable. S_2 generates r × n matrices {(Q_11, Q_12, ..., Q_1p), (Q_21, Q_22, ..., Q_2p), ..., (Q_m1, Q_m2, ..., Q_mp)}, where each element is uniformly distributed, so that S_2(N_2, −) = {N_2, r, (Q_11, Q_12, ..., Q_1p), (Q_21, Q_22, ..., Q_2p), ..., (Q_m1, Q_m2, ..., Q_mp), S} and VIEW_2(N_1, N_2) = {N_2, r, (Q_11, Q_12, ..., Q_1p), (Q_21, Q_22, ..., Q_2p), ..., (Q_m1, Q_m2, ..., Q_mp), S}. For the random matrices Q_ij, we then get {V, S_2(N_2, −)} ≡^c {(OUTPUT_1(N_1, N_2), VIEW_2(N_1, N_2))}, which completes the proof of Theorem 2.

To further illustrate the risk of information conjecture in the case of two participants, we assume that the probability and risk of information leakage are proportional to the number of speculation attempts. In other words, if a participant attempts to conjecture the other participant's private information, s/he must also input more information him/herself, which creates a two-player game mechanism. Thus, the tradeoff between information conjecture and self-security is a crucial issue in secure information sharing applications. Under such circumstances, the participants are generally unwilling to conjecture the other participant's information, considering that this may also result in their own information being leaked.

4.4. The information leakage analysis

For the two-party protocol, participants Alice (P_1) and Bob (P_2) are used as the illustrative example. Regarding the matrix H, we design a two-party collaboration protocol based on OT_n^1 that calculates the matrix H by inverting the matrix and then sharing the result. Because of the way the oblivious transfer protocol works, Alice can decide the scalar product, but Bob cannot learn which scalar product Alice has chosen. If the value of N_1 has certain publicly known properties, Bob might be able to guess N_1 from the other p − 1 vectors, but even if Bob is able to recognize N_1, the chance of guessing it correctly is 1 out of p (i.e., 1/p) [53,54]. So, in order to reduce the chance of correct guesses, we propose the following solution. Alice divides the vector N_1 into m random vectors M_1, M_2, ..., M_m of which N_1 is the sum, i.e., N_1 = Σ_{i=1}^{m} M_i. Bob can use the above method to compute M_i·N_2 + r_i, where r_i is a random number. As a result of the protocol, Alice receives M_i·N_2 + r_i for i = 1, ..., m. Because of the randomness of M_i and its random position in the matrix, Bob cannot easily find out which one is M_i. Certainly, there is a 1/p probability that Bob guesses the correct M_i, but since N_1 is the sum of m random vectors, the chance that Bob guesses the correct N_1 is 1 out of p^m (i.e., 1/p^m), which can be a very small probability if p^m is large enough. Meanwhile, Alice and Bob together appoint the number m of random vectors into which the original private information is divided, and p^m is chosen large enough to reduce the probability that the private data can be reconstructed. Below we compare the probability of information leakage and the amount of information leakage in the traditional and the proposed approaches, as indicated in Table 2. In summary, in the two-party protocol, the probability of information leakage is 1/2^10, which is equivalent to 1/1024, if Alice and Bob choose 10 as the number of pieces into which the information is divided. This is a much smaller probability than 1/2. Accordingly, the amount of information leakage also decreases significantly.

4.5. Complexity analysis
4.5.1. Computational complexity
In Protocol 2, the fourth step conducts matrix multiplication and matrix combination mp times each. The fifth step conducts one matrix multiplication and m combination operations. The sixth step conducts one matrix multiplication, m combination operations and one matrix inversion. The seventh step conducts two matrix multiplications and one matrix inversion. These details are summarized in Table 3. Similarly, in Protocol 3, the fourth step conducts matrix multiplication 3mp times and matrix combination mp times. The fifth step conducts matrix combination m − 1 times. The sixth step conducts matrix combination m times. The seventh step conducts one matrix multiplication and one matrix inversion. The details are summarized in Table 3.

4.5.2. Communication complexity
In Protocol 2, the fourth step conducts the OT_n^1 process, so each matrix needs to swap data p + 1 times.
Table 2
The comparison of two oblivious transfer (OT_n^1) approaches.

Number of participants (p) | Amount of original private data (Σ_{i=1}^n N_i) | Traditional OT_n^1: probability of information leakage (1/p) | Traditional OT_n^1: amount of information leakage ((1/p) · Σ N_i) | Revised OT_n^1: probability of information leakage (1/p^m) | Revised OT_n^1: amount of information leakage ((1/p^m) · Σ N_i)
2 | N_1 + N_2 | 1/2 | (1/2)(N_1 + N_2) | 1/1024 if m = 10 | (1/1024)(N_1 + N_2) if m = 10

Note: m is the number of random vectors into which the original private data is divided; m is determined by the participants together. Here we assume m to be 10 as an illustrative example to explain the probability of information leakage and the amount of information leakage.

Table 3
Computational complexity.

Process | Protocol 2: matrix combinations S(m1) | Protocol 2: matrix multiplications S(m2) | Protocol 2: matrix inversions S(m3) | Protocol 3: matrix combinations S(m4) | Protocol 3: matrix multiplications S(m5) | Protocol 3: matrix inversions S(m6)
Step 4 | mp | mp | – | mp | 3mp | –
Step 5 | m | 1 | – | m − 1 | – | –
Step 6 | m | 1 | 1 | m | – | –
Step 7 | – | 2 | 1 | – | 1 | 1
Total | 2m + mp | mp + 4 | 2 | mp + 2m − 1 | 3mp + 1 | 1
Table 4
Communication complexity.

Process | Protocol 2: times of communication | Protocol 3: times of communication
Step 4 | m(p + 1) | m(p + 1)
Step 5 | 1 | 1
Step 6 | 1 | 1
Total | mp + m + 2 | mp + m + 2
Because there are m matrices, data need to be swapped m(p + 1) times in total. In each of the fifth and sixth steps, one data swap is required. The same process, for steps 4–6, applies to Protocol 3. Details are shown in Table 4.

5. Simulations

5.1. Simulation set 1: effectiveness

In order to assess the availability of the proposed approaches, we conducted the simulations and experiments in Matlab R2012b. Our simulation demonstrates the effectiveness of the revised Karmarkar algorithm for SMC in two simulation examples. The first simulation example includes 149 constraints, 236 variables and four participants. The four participants hold the datasets A_1, A_2, A_3, A_4 and b_1, b_2, b_3, b_4, where A_1 ∈ R^{34×236}, A_2 ∈ R^{46×236}, A_3 ∈ R^{30×236}, A_4 ∈ R^{39×236}, and b_1 ∈ R^{34}, b_2 ∈ R^{46}, b_3 ∈ R^{30}, b_4 ∈ R^{39}. The participants transform the datasets into a distributed form of the LP problem. For privacy protection, we design the secure four-party computation steps by following Protocol 1. The second simulation example includes 270 constraints, 156 variables and two participants who hold the datasets A_1, A_2 and b_1, b_2 respectively, where A_1 ∈ R^{144×156}, A_2 ∈ R^{126×156}, and b_1 ∈ R^{144}, b_2 ∈ R^{126}. The participants transform the datasets into a distributed form of the LP problem. For privacy protection, we design the secure TPC steps following Protocol 2 and Protocol 3. The result is simulated in Matlab R2012b and listed in Table 5.

As shown in Table 5, the revised Karmarkar algorithm for SMC produces results consistent with those of the original Karmarkar algorithm in both the four-participant and the two-participant case. Specifically, the outcome value is 159.9234 in the simulation example with 149 constraints, 236 variables and four participants. In the case of two participants, the same result, 1389, is obtained by the original and the revised Karmarkar algorithm. These consistent results demonstrate that our approach is effective in protecting confidential information while achieving the same value as the original Karmarkar algorithm. Furthermore, the simulations show that the benefit of collaborative computation, which includes the data from the other participants, is far greater than using only an individual participant's own data, in both the four-participant and the two-participant case. Firstly, in the case of four participants, if they compute on their own without collaboration, the benefit of each participant is only 8.5837, 46.4480, 18.7889 and 12.9263 respectively, significantly lower than the value of 159.9234 obtained by collaboratively sharing information. Similarly, in the two-party scenario, the benefit of collaboration is 1389, much more than what the participants can achieve by independent computation (viz., 82.8717 and 32.7701 respectively). Taken together, our experiments validate that the collaborative benefit obtained via the revised Karmarkar algorithm for SMC is much greater than the individual benefit without collaboration.

5.2. Simulation set 2: efficiency

In order to assess the efficiency of the proposed approaches, we performed the simulation on a Windows 7, 64-bit operating system with a 4-core Intel® i3-4130 CPU and 4 GB of installed memory.
Table 5
The results of simulation set 1.

Number of participants | m × n constraint matrix | Value of the original algorithm | Value of the revised Karmarkar algorithm for SMC
4 | 149 × 236 | 159.9234 | 159.9234
2 | 270 × 156 | 1389 | 1389

Note: because the data are generated by a random function, the results vary slightly across simulations; however, the relative relationships in the comparisons are the same, so the conclusions of this study hold across different simulations.
Table 6
The results of simulation set 2.

Constraint matrix | Approach by [27]: self time (s) | Approach by [27]: [Tic,toc] time (s) | This study: self time (s) | This study: [Tic,toc] time (s)
2 × 2 | 0.008 | 0.001323 | 0.007 | 0.0012484
10 × 10 | 0.019 | 0.005561 | 0.012 | 0.005207
100 × 100 | 1.809 | 1.500944 | 0.682 | 0.437152
500 × 500 | 8.129 | 7.88258 | 5.698 | 4.560504
1000 × 1000 | 29.454 | 28.490909 | 21.765 | 18.083403

Note: the execution times depend on the performance of the processor; the unit of time is seconds (s).
Fig. 9. The total computation time (Self Time).
Fig. 10. The total computation time ([Tic,toc] time).
For comparison purposes, we benchmarked the protocols from [27], which propose Privacy-MaxEnt for finding a solution to the variables (the probabilities) that satisfies all of the constraints. After conducting the simulations, we calculate the protocol execution times using linear regression for the constraint matrices 2 × 2, 10 × 10, 100 × 100, 500 × 500 and 1000 × 1000.
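For readers who want to reproduce the shape of such a timing comparison, a hypothetical Python harness in the spirit of MATLAB's tic/toc measurement is sketched below (our own illustration; the workload is a stand-in projection step, not the implementation benchmarked in this paper).

```python
import time
import numpy as np

def karmarkar_like_step(dim: int, rng: np.random.Generator) -> np.ndarray:
    """Stand-in workload: one projection of the kind repeated by the algorithms."""
    B = rng.normal(size=(dim, dim))
    D = np.diag(rng.uniform(1.0, 2.0, size=dim))
    c = rng.normal(size=dim)
    P = B.T @ np.linalg.inv(B @ B.T + 1e-6 * np.eye(dim)) @ B
    return -(np.eye(dim) - P) @ D @ c

rng = np.random.default_rng(0)
for dim in (2, 10, 100, 500, 1000):
    start = time.perf_counter()            # analogue of MATLAB's tic
    karmarkar_like_step(dim, rng)
    elapsed = time.perf_counter() - start  # analogue of MATLAB's toc
    print(f"{dim} x {dim}: {elapsed:.6f} s")
```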
Here, we compare the self time (recorded by the computer itself) and the [Tic,toc] time (measured by the standard program code). The results simulated in Matlab R2012b are summarized in Table 6 and Figs. 9 and 10. Several interesting findings can be identified in Table 6 and Figs. 9 and 10. While the constraint matrix is low-dimensional,
the two approaches obtain almost the same results: 0.008 s versus 0.007 s in self time and 0.001323 s versus 0.0012484 s under the 2 × 2 constraint matrix. The two approaches also show similar trends under the 10 × 10 constraint matrix, where the approach of [27] requires 0.005561 s and this study requires 0.005207 s ([Tic,toc] time); the execution times of the two approaches are approximately the same. However, when the constraint matrix is high-dimensional, a notable difference between the two lines can be observed. Figs. 9 and 10 show that the proposed protocol requires much less computation time than the approach of [27] when the dimensions of the constraint matrix reach 100 × 100 or higher. Moreover, the results indicate that the degree of divergence increases continuously with the number of constraint matrix dimensions. When the dimensions of the constraint matrix reach 500 × 500, our approach saves 2.431 s in self time and 3.322 s in [Tic,toc] time compared to [27]. When the dimensions reach 1000 × 1000, the difference of approximately 10 s is much more obvious. These results indicate that efficiency gains can be achieved with the revised Karmarkar algorithm when it is applied in SMC.

6. Conclusion

6.1. Implications

In this study, we have presented novel two-party and multi-party protocols based on the revised Karmarkar algorithm for optimizing cross-organizational collaborative decisions via privacy-preserving mechanisms. The proposed protocols are applicable to different numbers of variables involved in the collaboration, and they are suitable for large data sets, as validated by our simulation experiments. Furthermore, as pointed out by other scholars [55,56], computational efficiency is a major concern for practical solutions, especially in the case of large inputs and big data sets. In order to address the efficiency issue, we apply the Karmarkar algorithm to the context of SMC. In this study, we divide the original private data N_i into m random vectors to design the protocols, in order to reduce both the probability of information leakage and the amount of leaked information. Moreover, our simulations demonstrate the protocols' effectiveness in keeping confidential information private, as well as in dealing with the problem of large inputs. Meanwhile, our simulations suggest that the number of variables plays a major role in determining the two main factors of privacy-preserving computation protocols, viz. security and complexity.

6.2. Future work

In addition to the above contributions, this study opens up several research opportunities for future work. Firstly, this study only employed simulation data; the results can be further validated with real data from companies. We also encourage researchers to investigate how parallelism can improve the performance and security of our protocols in real contexts. We acknowledge the inherent limitations associated with the information leakage that accompanies any encryption protocol. For this issue, the approaches of k-anonymity [46] and differential privacy [47] reduce the risk of disclosing private information from the data source, whereas the SMC approach in our paper significantly reduces the risk of information leakage during the calculation process. In this study, we use the semi-honest model to explain the honest analysis in STC.
Accordingly, we design the simulator to imitate intermediate data to define and analyze the security of the proposed protocols. However, the semi-honest partici-
pants still have a small chance to speculate about the intermediate data when they collaborate together. Although data leakage is inevitable and can be partially addressed by increasing the encryption key length the time computation for the honest participants will also increase accordingly due to the increased key length and thus result in inefficiency. We now quantify the probability and the amount of potential information leakage in our proposed algorithms. As the next step in future, we encourage researchers to improve our protocol in order to further address the honest problem between two participants based on game theory, particularly via the penalty function to constrain participants’ actual behavior. Acknowledgement We acknowledge funding support from the National Science Foundation of China Project Number 71671048 and from Aspasia research funding from Tilburg University, The Nethelands. References [1] X.-B. Li, S. Sarkar, Privacy protection in data mining: a perturbation approach for categorical data, Inf. Syst. Res. 17 (3) (2006) 254–270. [2] J. Boockholdt, Implementing security and integrity in micro-mainframe networks, MIS Q. 13 (2) (1989) 135–144. [3] D.N. Jutla, P. Bodorik, Y. Zhang, An architecture for users’ privacy-aware electronic commerce contexts on the semantic web, Inf. Syst. 31 (4) (2006) 295–320. [4] A. Rastogi, P. Mardziel, M. Hicks, M.A. Hammer, Knowledge inference for optimizing secure multi-party computation, in: Paper presented at the Proceedings of the Eighth ACM SIGPLAN Workshop on Programming Languages and Analysis for Security, ACM, 2013, pp. 3–14. [5] X.-B. Li, S. Sarkar, Protecting privacy against record linkage disclosure: a bounded swapping approach for numeric data, Inf. Syst. Res. 22 (4) (2011) 774–789. [6] M. Cao, Q. Zhang, Supply chain collaboration: impact on collaborative advantage and firm performance, J. Oper. Manage. 29 (3) (2011) 163–180. [7] M. Holweg, S. Disney, J. Holmström, J. Småros, Supply chain collaboration:: making sense of the strategy continuum, Eur.Manage.J. 23 (2) (2005) 170–181. [8] G. Zhang, J. Shang, W. Li, Collaborative production planning of supply chain under price and demand uncertainty, Eur. J. Oper. Res. 215 (3) (2011) 590–603. [9] H. Song, V.N. Hsu, R.K. Cheung, Distribution coordination between suppliers and customers with a consolidation center, Oper. Res. 56 (5) (2008) 1264–1277. [10] X.-B. Li, S. Sarkar, Against classification attacks: a decision tree pruning approach to privacy protection in data mining, Oper. Res. 57 (6) (2009) 1496–1509. [11] H. Lu, J. Vaidya, V. Atluri, Y. Li, Statistical database auditing without query denial threat, INFORMS J. Comput. 27 (1) (2014) 20–34. [12] X. Bai, R. Gopal, M. Nunez, D. Zhdanov, On the prevention of fraud and privacy exposure in process information flow, INFORMS J. Comput. 24 (3) (2012) 416–432. [13] B.K. Samanthula, Y. Elmehdwi, G. Howser, S. Madria, A secure data sharing and query processing framework via federation of cloud computing, Inf. Syst. 48 (2015) 196–212. [14] I. Wang, C.-H. Shen, K. Chen, T.-s. Hsu, C.-J. Liau, D.-W. Wang, et al., An empirical study on privacy and secure multi-party computation using exponentiation, in: Paper presented at the International Conference on Acoustics,Speech and Signal Processing (ICASSP), 3, IEEE, 2009, pp. 182–188. [15] A.C.-C. Yao, Protocols for secure computations, in: Paper presented at the IEEE 54th Annual Symposium on Foundations of Computer Science, 82, 1982, pp. 160–164. [16] X. Yi, Y. 
Acknowledgement

We acknowledge funding support from the National Science Foundation of China (Project Number 71671048) and from Aspasia research funding from Tilburg University, The Netherlands.

References

[1] X.-B. Li, S. Sarkar, Privacy protection in data mining: a perturbation approach for categorical data, Inf. Syst. Res. 17 (3) (2006) 254–270.
[2] J. Boockholdt, Implementing security and integrity in micro-mainframe networks, MIS Q. 13 (2) (1989) 135–144.
[3] D.N. Jutla, P. Bodorik, Y. Zhang, An architecture for users’ privacy-aware electronic commerce contexts on the semantic web, Inf. Syst. 31 (4) (2006) 295–320.
[4] A. Rastogi, P. Mardziel, M. Hicks, M.A. Hammer, Knowledge inference for optimizing secure multi-party computation, in: Paper presented at the Proceedings of the Eighth ACM SIGPLAN Workshop on Programming Languages and Analysis for Security, ACM, 2013, pp. 3–14.
[5] X.-B. Li, S. Sarkar, Protecting privacy against record linkage disclosure: a bounded swapping approach for numeric data, Inf. Syst. Res. 22 (4) (2011) 774–789.
[6] M. Cao, Q. Zhang, Supply chain collaboration: impact on collaborative advantage and firm performance, J. Oper. Manage. 29 (3) (2011) 163–180.
[7] M. Holweg, S. Disney, J. Holmström, J. Småros, Supply chain collaboration: making sense of the strategy continuum, Eur. Manage. J. 23 (2) (2005) 170–181.
[8] G. Zhang, J. Shang, W. Li, Collaborative production planning of supply chain under price and demand uncertainty, Eur. J. Oper. Res. 215 (3) (2011) 590–603.
[9] H. Song, V.N. Hsu, R.K. Cheung, Distribution coordination between suppliers and customers with a consolidation center, Oper. Res. 56 (5) (2008) 1264–1277.
[10] X.-B. Li, S. Sarkar, Against classification attacks: a decision tree pruning approach to privacy protection in data mining, Oper. Res. 57 (6) (2009) 1496–1509.
[11] H. Lu, J. Vaidya, V. Atluri, Y. Li, Statistical database auditing without query denial threat, INFORMS J. Comput. 27 (1) (2014) 20–34.
[12] X. Bai, R. Gopal, M. Nunez, D. Zhdanov, On the prevention of fraud and privacy exposure in process information flow, INFORMS J. Comput. 24 (3) (2012) 416–432.
[13] B.K. Samanthula, Y. Elmehdwi, G. Howser, S. Madria, A secure data sharing and query processing framework via federation of cloud computing, Inf. Syst. 48 (2015) 196–212.
[14] I. Wang, C.-H. Shen, K. Chen, T.-S. Hsu, C.-J. Liau, D.-W. Wang, et al., An empirical study on privacy and secure multi-party computation using exponentiation, in: Paper presented at the International Conference on Acoustics, Speech and Signal Processing (ICASSP), 3, IEEE, 2009, pp. 182–188.
[15] A.C.-C. Yao, Protocols for secure computations, in: Paper presented at the IEEE 23rd Annual Symposium on Foundations of Computer Science, 82, 1982, pp. 160–164.
[16] X. Yi, Y. Zhang, Privacy-preserving naive Bayes classification on distributed data via semi-trusted mixers, Inf. Syst. 34 (3) (2009) 371–380.
[17] B. Malin, E. Airoldi, S. Edoho-Eket, Y. Li, Configurable security protocols for multi-party data analysis with malicious participants, in: Paper presented at the 21st International Conference on Data Engineering (ICDE), IEEE, 2005, pp. 533–544.
[18] O. Goldreich, S. Micali, A. Wigderson, How to play any mental game, in: Paper presented at the Proceedings of the Nineteenth Annual ACM Symposium on Theory of Computing, ACM, 1987, pp. 218–229.
[19] M.M. Prabhakaran, Secure Multi-Party Computation, 10, IOS Press, 2013.
[20] D. Vatsalan, P. Christen, V.S. Verykios, A taxonomy of privacy-preserving record linkage techniques, Inf. Syst. 38 (6) (2013) 946–969.
[21] H. Polat, W. Du, Privacy-preserving top-n recommendation on horizontally partitioned data, in: Paper presented at the 2005 IEEE/WIC/ACM International Conference on Web Intelligence, IEEE, 2005, pp. 725–731.
[22] M. Hirt, U. Maurer, B. Przydatek, Efficient secure multi-party computation, in: Paper presented at the Sixth International Conference on the Theory and Application of Cryptology and Information Security, Springer, 2000, pp. 143–161.
[23] M.J. Atallah, W. Du, Secure multi-party computational geometry, in: Algorithms and Data Structures, Springer, 2001, pp. 165–179.
[24] C.A. Melchor, B. Ait-Salem, P. Gaborit, A collusion-resistant distributed scalar product protocol with application to privacy-preserving computation of trust, in: Paper presented at the Eighth IEEE International Symposium on Network Computing and Applications, IEEE, 2009, pp. 140–147.
[25] V. Estivill-Castro, A.H. Yasien, Fast private association rule mining by a protocol for securely sharing distributed data, in: Paper presented at the Intelligence and Security Informatics Workshop, IEEE, 2007, pp. 324–330.
[26] A. Evfimievski, R. Srikant, R. Agrawal, J. Gehrke, Privacy preserving mining of association rules, Inf. Syst. 29 (4) (2004) 343–364.
[27] W. Du, Z. Teng, Z. Zhu, Privacy-MaxEnt: integrating background knowledge in privacy quantification, in: Paper presented at the ACM SIGMOD International Conference on Management of Data, ACM, 2008, pp. 459–472.
[28] N. Karmarkar, A new polynomial-time algorithm for linear programming, in: Paper presented at the Proceedings of the 16th Annual ACM Symposium on Theory of Computing, ACM, 1984, pp. 302–311.
[29] E. Tardos, A strongly polynomial algorithm to solve combinatorial linear programs, Oper. Res. 34 (2) (1986) 250–256.
[30] K.M. Anstreicher, A standard form variant, and safeguarded linesearch, for the modified Karmarkar algorithm, Math. Program. 47 (1–3) (1990) 337–351.
[31] J. Dennis Jr., A. Morshedi, K. Turner, A variable-metric variant of the Karmarkar algorithm for linear programming, Math. Program. 39 (1) (1987) 1–20. https://link.springer.com/article/10.1007/BF02592068.
[32] B. Kalantari, Karmarkar’s algorithm with improved steps, Math. Program. 46 (1–3) (1990) 73–78.
[33] M.J. Todd, B.P. Burrell, An extension of Karmarkar’s algorithm for linear programming using dual variables, Algorithmica 1 (1–4) (1986) 409–424.
[34] M.J. Todd, Exploiting special structure in Karmarkar’s linear programming algorithm, Math. Program. 41 (1–3) (1988) 97–113.
[35] I. Adler, N. Karmarkar, M.G. Resende, G. Veiga, Data structures and programming techniques for the implementation of Karmarkar’s algorithm, ORSA J. Comput. 1 (2) (1989) 84–106.
[36] M.J. Todd, A Dantzig-Wolfe-like variant of Karmarkar’s interior-point linear programming algorithm, Oper. Res. 38 (6) (1990) 1006–1018.
[37] K.M. Anstreicher, P. Watteyne, A family of search directions for Karmarkar’s algorithm, Oper. Res. 41 (4) (1993) 759–767.
[38] D. Goldfarb, S. Mehrotra, Relaxed variants of Karmarkar’s algorithm for linear programs with unknown optimal objective value, Math. Program. 40 (1–3) (1988) 183–195.
[39] K.O. Kortanek, M. Shi, Convergence results and numerical experiments on a linear programming hybrid algorithm, Eur. J. Oper. Res. 32 (1) (1987) 47–61.
[40] I.J. Lustig, R.E. Marsten, D.F. Shanno, Interior point methods for linear programming: computational state of the art, INFORMS J. Comput. 6 (1) (1994) 1–14.
[41] J.E. Mitchell, M.J. Todd, Solving combinatorial optimization problems using Karmarkar’s algorithm, Math. Program. 56 (1–3) (1992) 245–284.
[42] I. Maros, G. Mitra, Strategies for creating advanced bases for large-scale linear programming problems, INFORMS J. Comput. 10 (2) (1998) 248–260.
[43] R. Agrawal, R. Srikant, Privacy-preserving data mining, in: ACM SIGMOD Record, 29, ACM, 2000, pp. 439–450.
[44] Z. Wang, Y. Luo, S.-C. Cheung, Efficient multi-party computation with collusion-deterred secret sharing, in: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014), 2014, pp. 7401–7405.
[45] D. Yao, K.B. Frikken, M.J. Atallah, R. Tamassia, Point-based trust: define how much privacy is worth, in: Paper presented at the Information and Communications Security, Eighth International Conference, Springer, 2006, pp. 190–209.
[46] P. Samarati, L. Sweeney, Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression, Technical Report, SRI International, 1998.
[47] C. Dwork, A. Roth, et al., The algorithmic foundations of differential privacy, Found. Trends Theor. Comput. Sci. 9 (3–4) (2014) 211–407.
[48] J.-S. Kang, D. Hong, A practical privacy-preserving cooperative computation protocol without oblivious transfer for linear systems of equations, Int. J. Inf. Process. Syst. 3 (1) (2007) 21–25.
[49] V. Kolesnikov, Truly efficient string oblivious transfer using resettable tamper-proof tokens, in: Theory of Cryptography, Springer, 2010, pp. 327–342.
[50] Y. Ishai, M. Prabhakaran, A. Sahai, Founding cryptography on oblivious transfer–efficiently, in: Paper presented at the 28th Annual International Cryptology Conference, Springer, 2008, pp. 572–591.
[51] T. Wang, Q. Wen, F. Zhu, Quantum communications with an anonymous receiver, Sci. China Phys. Mech. Astron. 53 (12) (2010) 2227–2231.
[52] O. Goldreich, Secure multi-party computation (manuscript version 1.3), 2002.
[53] M. Naor, B. Pinkas, Oblivious transfer and polynomial evaluation, in: Proceedings of the Thirty-First Annual ACM Symposium on Theory of Computing, ACM, 1999, pp. 245–254.
[54] C. Cachin, S. Micali, M. Stadler, Computationally private information retrieval with polylogarithmic communication, in: International Conference on the Theory and Applications of Cryptographic Techniques, Springer, 1999, pp. 402–414.
[55] R. Deitos, F. Kerschbaum, Improving practical performance on secure and private collaborative linear programming, in: Paper presented at the 20th International Workshop of DEXA on Database and Expert Systems Application, IEEE, 2009, pp. 122–126.
[56] Y. Hong, J. Vaidya, Secure transformation for multiparty linear programming, Rutgers Technical Report, 2013.