Information Systems Vol. 18, No. 1, pp. 55-62, 1993. Printed in Great Britain
0306-4379/93 $6.00 + 0.00
Pergamon Press Ltd
ALGEBRAIC OPERATIONS ON ENCRYPTED RELATIONAL DATABASES
TZONG-CHEN WU,¹ YI-SHIUNG YEH² and CHIN-CHEN CHANG³
¹Department of Information Management, National Taiwan Institute of Technology, Taipei, Taiwan 106, R.O.C.
²Institute of Computer Science and Information Engineering, National Chiao Tung University, Hsinchu, Taiwan 300, R.O.C.
³Institute of Computer Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan 621, R.O.C.
(Received 10 October 1991; in revised form 30 June 1992)
Abstract-In this paper, we consider the problem of performing algebraic operations and their extensions with encrypted relational databases. Each tuple of a relation is enciphered by a cryptosystem based on the extended Chinese remainder theorem. We show that one can perform the Projection, the Cartesian Product and their composite operations, such as the Projection followed by the Cartesian Product and the Cartesian Product followed by the Projection, on the encrypted tuples directly without deciphering them. We also show that there does not exist a secure way to protect data when performing Comparison operations, such as Selection, Union, etc., with encrypted relational databases.
Key words: Relational database, database security, cryptosystem, relational algebra, computing with encrypted tuples
1. INTRODUCTION
The need for database systems in today's enterprises, whether governmental, industrial or commercial, is clear [1]. In order to protect confidential or sensitive information stored in a database, we need to provide security mechanisms for achieving data security and privacy [2-5]. That is, confidential or sensitive information is enciphered by a cryptosystem. For conventional encrypted information systems, one severe problem is that the data are deciphered into the plaintext state before processing. In this state, the data are highly exposed to unauthorized intrusion, especially in distributed environments [6].

The above stated problem is known as computing with encrypted data [7, 8]. Obviously, the main advantages of computing with encrypted data are enforcing data security and saving computing time. Besides, the additional cost of the part of the operating system dealing with security considerations can be reduced. Recently, several schemes [2, 7-11] have been proposed for solving the problem of computing with encrypted data. However, most schemes, excluding [9, 10], concentrated on update or arithmetic operations with a single data field. They are not satisfactory for performing algebraic operations on records or tuples of encrypted databases.

In this paper, we consider the problem of performing algebraic operations and their extensions with encrypted relational databases. Each tuple of a relation is enciphered by a record-oriented cryptosystem based on the extended Chinese remainder theorem [12]. Some algebraic operations and their extensions with encrypted tuples for relational databases are examined. From our results, we conclude that one can perform the Projection and the Cartesian Product operations with encrypted tuples. However, there does not exist a secure enciphering function for performing comparison-based operations, such as Selection, Union, etc., with encrypted tuples. Therefore, the database management system (DBMS) should take care of this security leak when performing comparison-based operations with encrypted tuples.
2. DATABASE ENCRYPTION
In 1981, Davida, Wells and Kam [13] proposed a database encryption method based on the Chinese remainder theorem. Later, Lin, Chang and Lee [14] proposed a record-oriented cryptosystem for database sharing. The Lin-Chang-Lee method is based on the extended Chinese remainder theorem. Since our results on examining the algebraic operations for encrypted relational databases are derived from the Lin-Chang-Lee method, we first introduce their method.

Consider a relational database in which each tuple has $n$ attributes, denoted as $m_1, m_2, \ldots, m_n$. Let the $n$ positive integers $d_1, d_2, \ldots, d_n$ be relatively prime and let $k$ be a positive integer such that
$$\max\{m_i\}_{1 \le i \le n} < k < \min\{d_i\}_{1 \le i \le n}.$$
The encryption procedure is stated as follows:

Encryption Procedure.
Step 1: Let $D = \prod_{i=1}^{n} d_i$.
Step 2: For $i = 1, \ldots, n$ do the following:
(1) Compute $D_i = k(D/d_i)$.
(2) Solve $b_i$ by $D_i b_i \equiv k \pmod{k d_i}$.
(3) Compute $N_i = \lceil m_i d_i / k \rceil$, where "$\lceil\ \rceil$" is the ceiling operator.
Step 3: Compute the ciphertext message $C$ of the $m_i$'s as
$$C = \sum_{i=1}^{n} D_i b_i N_i \bmod kD.$$

To recover the $i$th attribute, the decryption is done by computing
$$m_i = \lfloor C/d_i \rfloor \bmod k,$$
where "$\lfloor\ \rfloor$" is the floor operator. The pairs $(k, d_i)$ are used as the decryption subkeys for the corresponding attributes in an encrypted tuple. The following theorem proves that the above encryption procedure indeed works.

Theorem 1. The extended Chinese remainder theorem [14]. Let $n$ positive integers $d_1, d_2, \ldots, d_n$ be relatively prime. Let $m_1, m_2, \ldots, m_n$ and $k$ be positive integers satisfying $\max\{m_i\}_{1 \le i \le n} < k < \min\{d_i\}_{1 \le i \le n}$ [...]
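The encryption and decryption procedures of this section can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the function names `encrypt` and `decrypt_attribute` are hypothetical, and the small keys are chosen purely for readability. It assumes Python 3.8 or later (for `math.prod` and modular inverses via `pow`), and it uses the observation that $D_i b_i \equiv k \pmod{k d_i}$ is equivalent to $(D/d_i) b_i \equiv 1 \pmod{d_i}$.

```python
from itertools import combinations
from math import gcd, prod


def encrypt(m, d, k):
    """Encrypt the attribute values m under the keys d and the integer k."""
    # Preconditions taken from the text: pairwise relatively prime keys
    # and max{m_i} < k < min{d_i}.
    assert len(m) == len(d)
    assert all(gcd(x, y) == 1 for x, y in combinations(d, 2))
    assert max(m) < k < min(d)

    D = prod(d)                        # Step 1
    C = 0
    for m_i, d_i in zip(m, d):         # Step 2
        D_i = k * (D // d_i)
        # D_i * b_i = k (mod k*d_i) reduces to (D/d_i) * b_i = 1 (mod d_i),
        # so b_i is a modular inverse.
        b_i = pow(D // d_i, -1, d_i)
        N_i = -((-m_i * d_i) // k)     # integer ceiling of m_i * d_i / k
        C += D_i * b_i * N_i
    return C % (k * D)                 # Step 3


def decrypt_attribute(C, d_i, k):
    """Recover one attribute from C with its decryption subkey (k, d_i)."""
    return (C // d_i) % k


keys, k = (5, 7), 4
C = encrypt((2, 3), keys, k)
print(C)                                                # 52
print([decrypt_attribute(C, d_i, k) for d_i in keys])   # [2, 3]
```

Working the example by hand: $D = 35$, $D_1 b_1 N_1 = 28 \cdot 3 \cdot 3 = 252$ and $D_2 b_2 N_2 = 20 \cdot 3 \cdot 6 = 360$, so $C = 612 \bmod 140 = 52$, and $\lfloor 52/5 \rfloor \bmod 4 = 2$, $\lfloor 52/7 \rfloor \bmod 4 = 3$.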
CRYPTOSYSTEMS
In general, there are nine operations in relational algebra: Union, Set Difference, Cartesian Product, Projection, Selection, Intersection, Quotient, Join and Natural Join [1]. However, only the five operations Union, Set Difference, Cartesian Product, Projection and Selection will be considered, because the other four can be derived from these five. Since Union, Set Difference and Selection clearly require a large number of comparisons, we refer to these comparison-based operations collectively as Comparison. In the following subsections, we examine the operations Projection, Cartesian Product and Comparison, respectively, with encrypted tuples to complete the whole system. Before proceeding, we state the following assumptions for tuples of relations:
(1) Each tuple of a relation is enciphered by the Lin-Chang-Lee method.
(2) Each attribute name of a relation, denoted as attr-i, is associated with a distinct encryption key $d_{\text{attr-}i}$. For convenience, $d_i$ and $d_{\text{attr-}i}$ are used interchangeably.
(3) For the decryption subkeys $(k, d_i)$, all $d_i$'s are relatively prime and the value of $k$ satisfies
$$\max\{m_i\}_{1 \le i \le n} < k < \min\{d_i\}_{1 \le i \le n},$$
where $m_i$ is the value of the $i$th attribute.
3.1. Projection
Lemma 1. Let $a$ and $b$ be positive integers. If $x$ is any positive integer and $y = x \bmod ab$, then $y \bmod a = x \bmod a$.
Proof: Since $y = zab + x$ for some integer $z$, we have $y \bmod a = (zab + x) \bmod a = x \bmod a$. Q.E.D.
Let $r = (m_1, m_2, \ldots, m_n)$ be a tuple of a relation $R$, and let $C$ be the encrypted tuple of $r$ associated with the decryption subkeys $(k, d_1), (k, d_2), \ldots, (k, d_n)$.
Theorem 2. If a new tuple $r' = (m'_1, \ldots, m'_j)$ is formed by choosing $j$ among the $n$ attributes of $r$ and then permuting them, for $j < n$, and $(k, d'_1), \ldots, (k, d'_j)$ are the corresponding decryption subkeys for the attributes in $r'$, then the enciphered tuple of $r'$ determined by $C$ is
$$C' = C \bmod kD', \quad \text{where } D' = \prod_{i=1}^{j} d'_i.$$
Proof: Let $P$ be the set of projected attributes. From Theorem 1, we have
$$C = \sum_{i=1}^{n} D_i b_i N_i \bmod kD = \Big( \sum_{i \in P} D_i b_i N_i + \sum_{i \notin P} D_i b_i N_i \Big) \bmod kD.$$
Since $kD'$ divides $kD$, by Lemma 1, we have
$$C \bmod kD' = \Big( \sum_{i \in P} D_i b_i N_i + \sum_{i \notin P} D_i b_i N_i \Big) \bmod kD'.$$
However, $D_i = k(D/d_i)$ implies $D_i \bmod kD' = 0$ for all $i \notin P$. Thus,
$$C \bmod kD' = \sum_{i \in P} D_i b_i N_i \bmod kD'.$$
From Theorem 1, the encrypted tuple of $r'$ is
$$C' = \sum_{h=1}^{j} D'_h b'_h N'_h \bmod kD',$$
which is identical to $\sum_{i \in P} D_i b_i N_i \bmod kD'$. Thus,
$$C' = \sum_{h=1}^{j} D'_h b'_h N'_h \bmod kD' = \sum_{i \in P} D_i b_i N_i \bmod kD' = C \bmod kD'.$$
Q.E.D.
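Theorem 2 can be checked numerically with a short Python sketch. The helper names `encrypt` and `project` are hypothetical and the keys are illustrative only. The sketch reduces an encrypted three-attribute tuple modulo $kD'$ and compares the result with a direct encryption of the projected (and permuted) tuple.

```python
from math import prod


def encrypt(m, d, k):
    """Encryption procedure of Section 2 (illustrative sketch)."""
    D = prod(d)
    C = 0
    for m_i, d_i in zip(m, d):
        b_i = pow(D // d_i, -1, d_i)     # solves D_i * b_i = k (mod k*d_i)
        N_i = -((-m_i * d_i) // k)       # integer ceiling of m_i * d_i / k
        C += k * (D // d_i) * b_i * N_i  # D_i * b_i * N_i
    return C % (k * D)


def project(C, kept_keys, k):
    """Theorem 2: C' = C mod k*D', with D' the product of the kept keys."""
    return C % (k * prod(kept_keys))


k = 13
r, keys = (5, 12, 7), (17, 19, 23)
C = encrypt(r, keys, k)

# Keep the third and the first attribute, in that (permuted) order.
kept_keys = (23, 17)
C_proj = project(C, kept_keys, k)

# The projected ciphertext decrypts to the kept attributes ...
recovered = [(C_proj // d_i) % k for d_i in kept_keys]
# ... and coincides with encrypting the projected tuple directly.
direct = encrypt((7, 5), kept_keys, k)
print(recovered, C_proj == direct)       # [7, 5] True
```

Note that the projection needs only $k$ and the kept keys $d'_i$; neither the plaintext nor the dropped keys are used.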
From Theorem 2, we conclude that one can perform the Projection operation on encrypted tuples of some relations without deciphering them.

3.2. Cartesian Product
Let $r = (m_1, \ldots, m_s)$ be a tuple of a relation $R$, and let $C$ be the encrypted tuple of $r$ associated with the decryption subkeys $(k, d_1), \ldots, (k, d_s)$. Let $r' = (m'_1, \ldots, m'_t)$ be a tuple of a relation $R'$, and let $C'$ be the encrypted tuple of $r'$ associated with the decryption subkeys $(k, d'_1), \ldots, (k, d'_t)$.
For any two positive integers $x$ and $y$ with $\gcd(x, y) = 1$, we have the following property [2]:
$$px + qy = 1 \quad \text{for some integers } p \text{ and } q. \tag{1}$$
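Theorem 3. If a new tuple $r'' = (m_1, \ldots, m_s, m'_1, \ldots, m'_t)$ is formed by performing the Cartesian Product on relations $R$ and $R'$, then the encrypted tuple of $r''$ determined by $C$ and $C'$ is
$$C'' = (PDC' + P'D'C) \bmod kDD',$$
for some integers $P$ and $P'$, where
$$D = \prod_{i=1}^{s} d_i \quad \text{and} \quad D' = \prod_{i=1}^{t} d'_i.$$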
Proof: Since $(D, D') = 1$, by equation (1), we have $PD + P'D' = 1$ for some integers $P$ and $P'$. Let $C'' = (PDC' + P'D'C) \bmod kDD'$. Now we want to show that
(i) $\lfloor C''/d_i \rfloor \bmod k = m_i$ for $i = 1, \ldots, s$.
(ii) $\lfloor C''/d'_i \rfloor \bmod k = m'_i$ for $i = 1, \ldots, t$.
Proof of (i): Since $PD + P'D' = 1$, we have
$$C'' = (PDC' + P'D'C) \bmod kDD' = (PDC' + (1 - PD)C) \bmod kDD' = (PD(C' - C) + C) \bmod kDD',$$
which implies $C'' = ykDD' + PD(C' - C) + C$, for some integer $y$. Consequently, we have
$$\begin{aligned}
\lfloor C''/d_i \rfloor \bmod k &= \lfloor (ykDD' + PD(C' - C) + C)/d_i \rfloor \bmod k \\
&= \big( yk(D/d_i)D' + P(D/d_i)(C' - C) + \lfloor C/d_i \rfloor \big) \bmod k \\
&= \big( P(D/d_i)(C' - C) + \lfloor C/d_i \rfloor \big) \bmod k.
\end{aligned} \tag{2}$$
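From Theorem 1, $D_i b_i \equiv k \pmod{k d_i}$ and $D'_i b'_i \equiv k \pmod{k d'_i}$ imply $D_i b_i = a_i k$ and $D'_i b'_i = a'_i k$, for some integers $a_i$ and $a'_i$. Again, $C = \sum_{i=1}^{s} D_i b_i N_i \bmod kD$ and $C' = \sum_{i=1}^{t} D'_i b'_i N'_i \bmod kD'$. Let $y_1 = \sum_{i=1}^{s} a_i N_i \bmod kD$ and $y'_1 = \sum_{i=1}^{t} a'_i N'_i \bmod kD'$. We have
$$\begin{aligned}
C &= \sum_{i=1}^{s} D_i b_i N_i \bmod kD = \sum_{i=1}^{s} a_i k N_i \bmod kD = k \sum_{i=1}^{s} a_i N_i \bmod kD \\
&= k y_1 \bmod kD = xkD + k y_1, \quad \text{for some integer } x,
\end{aligned} \tag{3}$$
and
$$\begin{aligned}
C' &= \sum_{i=1}^{t} D'_i b'_i N'_i \bmod kD' = \sum_{i=1}^{t} a'_i k N'_i \bmod kD' = k \sum_{i=1}^{t} a'_i N'_i \bmod kD' \\
&= k y'_1 \bmod kD' = x'kD' + k y'_1,
\end{aligned} \tag{4}$$
for some integer $x'$. Equations (3) and (4) imply $C = k(xD + y_1) = uk$ and $C' = k(x'D' + y'_1) = vk$, where $u = xD + y_1$ and $v = x'D' + y'_1$. Thus, we can rewrite equation (2) as
$$\lfloor C''/d_i \rfloor \bmod k = \big( P(D/d_i)k(v - u) + \lfloor C/d_i \rfloor \big) \bmod k = \lfloor C/d_i \rfloor \bmod k = m_i$$
for $i = 1, \ldots, s$.
Proof of (ii): From the proof of (i), similarly, we can prove
$$\lfloor C''/d'_i \rfloor \bmod k = \lfloor C'/d'_i \rfloor \bmod k = m'_i$$
for $i = 1, \ldots, t$.
Q.E.D.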
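The construction in Theorem 3 can be illustrated with the following Python sketch. The names `encrypt` and `cartesian_product` are hypothetical, and the way the integers $P$ and $P'$ are obtained (a modular inverse followed by an exact division) is one possible choice rather than the paper's prescription. Both tuples are assumed to share the same $k$, and all keys are assumed to be pairwise relatively prime.

```python
from math import prod


def encrypt(m, d, k):
    """Encryption procedure of Section 2 (illustrative sketch)."""
    D = prod(d)
    C = 0
    for m_i, d_i in zip(m, d):
        b_i = pow(D // d_i, -1, d_i)     # solves D_i * b_i = k (mod k*d_i)
        N_i = -((-m_i * d_i) // k)       # integer ceiling of m_i * d_i / k
        C += k * (D // d_i) * b_i * N_i  # D_i * b_i * N_i
    return C % (k * D)


def cartesian_product(C, keys, C2, keys2, k):
    """Theorem 3: C'' = (P*D*C' + P'*D'*C) mod k*D*D'."""
    D, D2 = prod(keys), prod(keys2)
    # Integers P, P2 with P*D + P2*D2 = 1 exist because gcd(D, D2) = 1.
    P = pow(D, -1, D2)                   # P*D = 1 (mod D2)
    P2 = (1 - P * D) // D2               # exact division by construction
    assert P * D + P2 * D2 == 1
    return (P * D * C2 + P2 * D2 * C) % (k * D * D2)


k = 4
r, keys = (2, 3), (5, 7)                 # tuple of R
r2, keys2 = (1,), (9,)                   # tuple of R'
C, C2 = encrypt(r, keys, k), encrypt(r2, keys2, k)

C_joined = cartesian_product(C, keys, C2, keys2, k)
print(C_joined)                                          # 192
print([(C_joined // d_i) % k for d_i in keys + keys2])   # [2, 3, 1]
```

In this example $C = 52$, $C' = 12$, $P = 8$ and $P' = -31$, so $C'' = (8 \cdot 35 \cdot 12 - 31 \cdot 9 \cdot 52) \bmod 1260 = 192$, which also equals the direct encryption of $(2, 3, 1)$ under the keys $(5, 7, 9)$.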
For any $n$ positive integers $D_1, D_2, \ldots, D_n$ with $\gcd(D_1, D_2, \ldots, D_n) = 1$, let
$$G_i = \prod_{j \ne i} D_j.$$
Then we have the following property [2]:
$$P_1 G_1 + P_2 G_2 + \cdots + P_n G_n = 1 \tag{5}$$
for some integers $P_1, P_2, \ldots, P_n$.
Let $r_i = (m_{i1}, m_{i2}, \ldots, m_{ia_i})$ be a tuple of a relation $R_i$, and let $C_i$ be the encrypted tuple of $r_i$ associated with the decryption subkeys $(k, d_{i1}), (k, d_{i2}), \ldots, (k, d_{ia_i})$. From equation (5) and Theorem 3, we have the following result.
Corollary 1. If a new tuple $r = (m_{11}, m_{12}, \ldots, m_{1a_1}, m_{21}, m_{22}, \ldots, m_{2a_2}, \ldots, m_{n1}, m_{n2}, \ldots, m_{na_n})$ is formed by performing the Cartesian Product on relations $R_1, R_2, \ldots, R_n$, then the encrypted tuple of $r$ determined by $C_1, C_2, \ldots, C_n$ is
$$C = (P_1 G_1 C_1 + P_2 G_2 C_2 + \cdots + P_n G_n C_n) \bmod kG,$$
for some integers $P_1, P_2, \ldots, P_n$, where $P_1 G_1 + P_2 G_2 + \cdots + P_n G_n = 1$,
$$D_i = \prod_{j=1}^{a_i} d_{ij} \quad \text{and} \quad G = \prod_{i=1}^{n} D_i.$$
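The $n$-ary case can be sketched in the same way. The Python code below is an illustration under stated assumptions rather than the authors' algorithm: the names `bezout_coefficients` and `cartesian_product_n` are hypothetical, the $D_i$ are assumed to be pairwise relatively prime (as they are when all attribute keys are), and the coefficients $P_i$ of equation (5) are found by taking modular inverses and then correcting $P_1$ so that the sum is exactly 1.

```python
from math import prod


def encrypt(m, d, k):
    """Encryption procedure of Section 2 (illustrative sketch)."""
    D = prod(d)
    C = 0
    for m_i, d_i in zip(m, d):
        b_i = pow(D // d_i, -1, d_i)     # solves D_i * b_i = k (mod k*d_i)
        N_i = -((-m_i * d_i) // k)       # integer ceiling of m_i * d_i / k
        C += k * (D // d_i) * b_i * N_i  # D_i * b_i * N_i
    return C % (k * D)


def bezout_coefficients(Ds):
    """Return (Ps, Gs) with sum(P_i * G_i) == 1 and G_i = prod_{j != i} D_j."""
    G = prod(Ds)
    Gs = [G // D_i for D_i in Ds]
    # P_i = G_i^{-1} mod D_i makes the sum congruent to 1 modulo every D_i,
    # hence congruent to 1 modulo G for pairwise relatively prime D_i.
    Ps = [pow(G_i, -1, D_i) for G_i, D_i in zip(Gs, Ds)]
    t = (sum(P * G_i for P, G_i in zip(Ps, Gs)) - 1) // G
    Ps[0] -= t * Ds[0]                   # changes the sum by exactly -t*G
    assert sum(P * G_i for P, G_i in zip(Ps, Gs)) == 1
    return Ps, Gs


def cartesian_product_n(ciphertexts, key_lists, k):
    """Corollary 1: C = (sum of P_i * G_i * C_i) mod k*G."""
    Ds = [prod(keys) for keys in key_lists]
    Ps, Gs = bezout_coefficients(Ds)
    total = sum(P * G_i * C for P, G_i, C in zip(Ps, Gs, ciphertexts))
    return total % (k * prod(Ds))


k = 4
tuples = [(2, 3), (1,), (3, 2)]
key_lists = [(5, 7), (9,), (11, 13)]
ciphertexts = [encrypt(m, d, k) for m, d in zip(tuples, key_lists)]

C_all = cartesian_product_n(ciphertexts, key_lists, k)
all_keys = [d_i for keys in key_lists for d_i in keys]
print([(C_all // d_i) % k for d_i in all_keys])
```

For $n = 2$ this reduces to the formula of Theorem 3 with $G_1 = D'$ and $G_2 = D$.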
From Theorem 3 and Corollary 1, we conclude that one can perform the Cartesian Product operation on encrypted tuples of some relations without deciphering them. Extending Theorems 2 and 3, we can easily derive the results of performing the Projection followed by the Cartesian Product and of performing the Cartesian Product followed by the Projection. Thus, we conclude that one can perform the composite of the Projection and Cartesian Product operations on encrypted tuples of some relations without deciphering them.

3.3. Comparison
Since the algebraic operation Comparison, defined in Section 3, is based on the set of six relation operators $Q = \{<, >, =, \ne, \le, \ge\}$, we first examine the operators in $Q$. Given two values $a$ and $b$ and a relation operator $q$ in $Q$, consider the existence of a Boolean secure enciphering function $f$, in which the messages are hidden in $a$ and $b$ with $q$, denoted as $f(a, b, q)$. Being Boolean, the function $f$, once applied, returns true or false.
Lemma 1 [6].
(1) $f(a, b, >)$ is determined if and only if $f(a, b, \le)$ is determined.
(2) $f(a, b, <)$ is determined if and only if $f(a, b, \ge)$ is determined.
(3) $f(a, b, =)$ is determined if and only if (i) $f(a, b, \ne)$ is determined, or (ii) $f(a, b, \le)$ and $f(a, b, \ge)$ are determined.
From Lemma 1, the following lemma can be easily derived.
Lemma 2 [6]. For any secure enciphering function $f$, if $Q'$ is a subset of $Q$ such that $|Q'| = 2$ and $Q' \ne \{=, \ne\}$, then the following statement is true: "$f(a, b, q)$, for all $q \in Q'$, can be determined if and only if $f(a, b, q)$, for all $q \in Q$, can be determined."
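The flavor of these two lemmas can be illustrated with a small Python sketch that derives all six operators of $Q$ from a single Boolean oracle for "<". The function name `all_operators` and the oracle `lt` are hypothetical, and the sketch additionally assumes that the oracle may be queried with its two arguments swapped, which the text does not state explicitly.

```python
def all_operators(lt):
    """Build the six relation operators of Q from an oracle for '<' alone."""
    return {
        "<":  lambda a, b: lt(a, b),
        ">":  lambda a, b: lt(b, a),                       # swap the arguments
        ">=": lambda a, b: not lt(a, b),                   # negation of '<'
        "<=": lambda a, b: not lt(b, a),                   # negation of '>'
        "=":  lambda a, b: not lt(a, b) and not lt(b, a),  # '<=' and '>='
        "!=": lambda a, b: lt(a, b) or lt(b, a),           # negation of '='
    }


# A stand-in oracle on plain integers; in the setting of the text it would
# answer the question for the hidden values a and b.
ops = all_operators(lambda a, b: a < b)
print(ops["="](3, 3), ops["<="](2, 5), ops[">"](2, 5))    # True True False
```

The point of the sketch is only that answers for one ordering operator leak the answers for the others, which is why the operators in $Q$ are treated together.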
Theorem 4 [6]. A secure enciphering function for an algebraic system that includes the ordering predicate "$\le$" does not exist when the encrypted version of the distinguished constants can be determined.
From Lemma 1, Lemma 2 and Theorem 4, we have the following result.
Corollary 2. A secure enciphering function for an algebraic system that includes a relation operator in $\{<, >, =, \ne, \le, \ge\}$ does not exist when the encrypted version of the distinguished constants can be determined.

Until now, most of the known data encryption algorithms do not preserve the comparative relations, such as greater than, less than, equal, etc., between a plaintext and its corresponding ciphertext. We explain in the following why the Lin-Chang-Lee method cannot be applied to Comparison operations. Let $r = (m_1, m_2, \ldots, m_n)$ and $r' = (m'_1, m'_2, \ldots, m'_n)$ be two tuples of relations $R$ and $R'$, where $m_i$ and $m'_i$ have the same attribute name. By assumption (2) stated in this section, the encryption subkey $(k, d_i)$ of $m_i$ is identical to the encryption subkey $(k, d'_i)$ of $m'_i$, i.e. $d_i = d'_i$. Can we compare $m_i$ and $m'_i$ directly by using the encrypted tuples $C$ of $r$ and $C'$ of $r'$? The answer is negative. Let the encrypted projection value of the $i$th attribute of $r$ be $c_i$, i.e. $c_i = C \bmod k d_i$. Similarly, let the encrypted projection value of the $i$th attribute of $r'$ be $c'_i$, i.e. $c'_i = C' \bmod k d'_i$. From Theorem 1, we have
$$m_i = \lfloor C/d_i \rfloor \bmod k = \lfloor c_i/d_i \rfloor \bmod k, \tag{6}$$
and
$$m'_i = \lfloor C'/d'_i \rfloor \bmod k = \lfloor c'_i/d'_i \rfloor \bmod k. \tag{7}$$
From equations (6) and (7), we know that $m_i = m'_i$ does not imply $c_i = c'_i$ or $C = C'$, due to the effect of the ceiling function. Further, we know that $m_i > m'_i$ ($m_i < m'_i$) does not imply $c_i > c'_i$ ($c_i < c'_i$), or vice versa, due to the effects of the ceiling function and the modular operation. Thus, we conclude that there does not exist a secure enciphering function for performing the Comparison operation with encrypted tuples for relational databases.

4. CONCLUSIONS
We have examined the problem of performing algebraic operations and their extensions with encrypted tuples for relational databases, where each tuple of a relation is enciphered by the record-oriented cryptosystem proposed by Lin, Chang and Lee [14]. Figure 1 shows an algebraic cryptosystem for relational databases with Projection, Cartesian Product, the composite of Projection and/or Cartesian Product, and Comparison operations. From Fig. 1, the results for computing with encrypted tuples in relational databases are stated as follows:
(1) The Projection, the Cartesian Product, and the composite of Projection and/or Cartesian Product operations can be completely applied to the ciphertext space only.
(2) The comparison-based operations, such as Union, Selection, etc., cannot be completely applied to the ciphertext space. That is, there does not exist a secure enciphering function for performing these comparison-based operations with encrypted tuples without first deciphering the ciphertext into plaintext.

Fig. 1. Relational algebraic cryptosystem. (Recoverable figure labels: Projection; Composite of Projection and/or Cartesian Product.)
Other results on the problem of performing algebraic operations with encrypted tuples for relational databases can be found in [9]; they are derived from the database encryption method proposed by Davida, Wells and Kam [13].
Acknowledgement-The authors would like to thank the referees for their very useful comments, which greatly improved the presentation of this paper.
REFERENCES
[1] C. Wood, E. B. Fernandez and R. C. Summers. Database security: requirements, policies, and models. IBM Syst. J. 19 (2), 229-252 (1980).
[2] N. Koblitz. A Course in Number Theory and Cryptography. Springer, NY (1987).
[3] C. P. Pfleeger. Security in Computing. Prentice-Hall, Englewood Cliffs, NJ (1989).
[4] R. L. Rivest, L. Adleman and M. L. Dertouzos. On data banks and privacy homomorphisms. Foundation of Secure Computations (Edited by R. A. Demillo, D. P. Dobkin, A. K. Jones and R. J. Lipton), pp. 169-179. Academic Press, NY (1982).
[5] K. W. Yu and T. L. Yu. Superimposing encrypted data. Commun. ACM 34 (2), 48-54 (1991).
[6] S. Ceri and G. Pelagatti. Distributed Databases: Principles and Systems. McGraw-Hill, NY (1984).
[7] M. Abadi, J. Feigenbaum and J. Kilian. On hiding information from an oracle. J. Comput. System Sci. 39, 21-50 (1989).
[8] N. Ahituv, Y. Lapid and S. Neumann. Processing encrypted data. Commun. ACM 30 (9), 777-780 (1987).
[9] G. I. Davida and Y. S. Yeh. Cryptographic relational algebra. In Proc. 1982 IEEE Symp. on Security and Privacy, Oakland, California, pp. 111-116 (1982).
[10] D. E. Denning. Cryptography and Data Security. Addison-Wesley, MA (1982).
[11] J. D. Ullman. Principles of Database Systems. Computer Science Press, Pitman Publishing, London (1982).
[12] N. Minsky. Intentional resolution of privacy protection in database systems. Commun. ACM 19 (3), 148-159 (1976).
[13] G. I. Davida, D. L. Wells and J. B. Kam. A database encryption system with subkeys. ACM Trans. Database Syst. 6 (2), 312-328 (1981).
[14] C. H. Lin, C. C. Chang and R. C. T. Lee. A record-oriented cryptosystem for database sharing. In Proc. Int. Computer Symp. 1990, Hsinchu, Taiwan, pp. 328-332 (1990).