Microprocessing and Microprogramming 39 (1993) 155-158 North-Holland
GLOBAL DEADLOCK DETECTION FOR CONCURRENCY CONTROL IN MULTIDATABASE SYSTEMS

Hongwoo Nam and Songchun Moon

Computer Science and Engineering Division, Department of Information and Communication Engineering, Korea Advanced Institute of Science and Technology, 207-43, Cheongryangni, Dongdaemun, Seoul 130-012, Korea

With regard to global deadlock resolution in multidatabase systems (MDBSs), so far there has been no satisfactory mechanism that preserves both local site autonomy and global serializability at the same time whilst pursuing a high degree of concurrency. However, local site autonomy must be preserved, since it is nearly impossible for end users to modify the existing local database systems (LDBSs) in order to federate such LDBSs into MDBSs [Moon 90]. Global serializability must also be preserved for the correctness of MDBSs [Moon 90]. Deadlock detection potentially allows greater concurrency than deadlock prevention approaches [Davi 92]. Therefore, there is a real need for a novel deadlock detection mechanism that achieves a high degree of concurrency without loss of global serializability and local site autonomy in MDBSs.

I. Introduction

For the past decade, only a few studies have focused on global deadlock resolution for multidatabase systems (MDBSs). To make matters worse, so far there has been no satisfactory mechanism that preserves both global serializability and local site autonomy at the same time whilst pursuing a high degree of concurrency. However, local site autonomy must be preserved, since it is nearly impossible for end users to modify the existing local database systems (LDBSs) in order to federate such LDBSs into MDBSs [Moon 90]. Global serializability must also be preserved for the correctness of MDBSs [Moon 90]. Due to these preservation requirements, there has been no deadlock detection mechanism except one that fails to preserve local site autonomy, although deadlock detection potentially allows greater concurrency than deadlock prevention approaches [Davi 92]. Therefore, there is a real need for a novel deadlock detection mechanism that achieves a high degree of concurrency without loss of global serializability and local site autonomy in MDBSs.

II. Related work and problem definition

Previous mechanisms for deadlock resolution in MDBSs can be summarized on the basis of their strengths and weaknesses as in Table 1.

On the basis of deadlock resolution, those mechanisms can be categorized into the following three approaches. First, the no-wait approach attempts to break the waiting conditions that could cause deadlock occurrences, so that no further deadlock resolution is necessary. Two mechanisms take this approach: (1) the do almost nothing mechanism in [Glig 86] does nothing but force data to be released immediately after the execution of each operation of global transactions, and (2) the off-line updates mechanism in [Glig 86] allows only off-line and sequential updates. Second, the deadlock prevention approach essentially orders the way in which transactions claim locks, so that cyclic waiting never occurs between global transactions. Three mechanisms take this approach: (3) the homogeneous LTMs mechanism in [Glig 86] generates only the equivalent serializable execution schedules without cyclic waiting in every LDBS, (4) the wait-die mechanism in [Kim 92] only allows an older transaction to wait for a younger transaction when a conflict occurs between them, and (5) the deadlock free concurrency control scheme in [Vidy 91] allows data access only in a rooted-tree fashion. Third, the deadlock detection approach checks for cycles in the wait-for graph of transactions, so that deadlocks can be detected explicitly. One mechanism takes this approach: (6) the distributed cycle-detection mechanism in [Sugi 87] maintains a local serialization graph in each LDBS so that distributed cycle detection is possible.
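The wait-die rule of mechanism (4) can be illustrated with a minimal sketch. It assumes integer timestamps where a smaller value means an older transaction; the names `Txn` and `wait_die` are hypothetical and are not taken from [Kim 92].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    name: str
    ts: int  # assumption: smaller timestamp = older transaction


def wait_die(requester: Txn, holder: Txn) -> str:
    """Decide what the requester does when it conflicts with the holder."""
    # An older requester may wait; a younger requester is restarted ("dies").
    return "wait" if requester.ts < holder.ts else "restart"


g1 = Txn("G1", ts=1)  # older
g2 = Txn("G2", ts=2)  # younger

decision_young = wait_die(requester=g2, holder=g1)
decision_old = wait_die(requester=g1, holder=g2)
print(decision_young, decision_old)  # restart wait
```

Note that the decision depends only on the two timestamps. No wait-for cycle is examined, which is why a prevention rule of this kind can restart a transaction even when no deadlock actually exists.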
Table 1 Global deadlock resolution mechanisms in MDBSs

Approach              Mechanism     Global serializability   Local site autonomy   Degree of concurrency
No-wait               [Glig 86]1    ×                        ×                     High
No-wait               [Glig 86]3    O                        O                     Low
Deadlock prevention   [Glig 86]2    O                        ×                     Low
Deadlock prevention   [Kim 92]      O                        O                     Low
Deadlock prevention   [Vidy 91]     O                        O                     Low
Deadlock detection    [Sugi 87]     O                        ×                     High

O: supported   ×: not supported
[Glig 86]1: the do almost nothing mechanism in [Glig 86]
[Glig 86]2: the homogeneous LTMs mechanism in [Glig 86]
[Glig 86]3: the off-line updates mechanism in [Glig 86]

However, all six mechanisms described above have significant drawbacks, as follows. The do almost nothing mechanism fails to preserve global serializability and local site autonomy, since the immediate release of data may cause unserializable execution schedules and some LCCSs must be modified to release data immediately. The homogeneous LTMs mechanism and the distributed cycle-detection mechanism fail to preserve local site autonomy: in the former, LCCSs must be modified so that only equivalent serializable execution schedules without cyclic waiting are generated in every LDBS; in the latter, local deadlock resolution mechanisms must be modified so that they manage the serialization graphs. The off-line updates mechanism, the deadlock free concurrency control scheme and the wait-die mechanism result in reduced concurrency: data can be updated only off-line and serially in the off-line updates mechanism, data can be accessed only in a rooted-tree fashion in the deadlock free concurrency control scheme, and unnecessary restarts may be caused by non-real deadlocks in the wait-die mechanism, as shown in Example 1.

Example 1: Consider an LDBS of an MDBS where the S2PL rule with the wait-die rule is applied to transactions globally. Let d be a data item stored at the LDBS. Let G1 and G2 be transactions submitted to the MDBS. Let E be a possible execution schedule. Assume the timestamp order is TS(G1) < TS(G2). That is, G2 is younger than G1.

G1: r1(d),
G2: r2(d) w2(d),
E: r1(d) r2(d) w2(d).
In the schedule E, the younger transaction G2 causes a read-write conflict with the older transaction G1, so G2 must be restarted by the wait-die rule. However, there is no real deadlock, since no further conflicts occur between G1 and G2. G2 is therefore restarted unnecessarily. Moreover, if G2 is a subtransaction of a global transaction, aborting G2 causes the cascading rollback of all subtransactions of that global transaction. In contrast, deadlock detection does not cause such unnecessary restarts, since it never restarts a transaction unless the transaction is involved in a real deadlock. •

In summary, three of those six mechanisms result in a low degree of concurrency, while the others fail to preserve local site autonomy. Therefore, there is a real need for a deadlock detection mechanism that achieves a high degree of concurrency without loss of global serializability and local site autonomy.

III. Global deadlock detection

Global deadlock detection is difficult in MDBSs for the following two reasons. First, since local site autonomy should be preserved, each LDBS need not send its local wait-for graph (LWFG) to a global deadlock detector (GDD). Hence, the GDD may be unaware of the blocking status of transactions in an LDBS. Second, due to local site autonomy, especially commitment autonomy, the MDBS cannot abort transactions blocked in an LDBS. Hence, it is not surprising that no deadlock detection approach preserving both global serializability and local site autonomy has appeared during the past decade. Nevertheless, such global deadlock detection becomes possible by having all
local and global transactions submitted through the GDD to the LDBSs, so that all conflicts between data access operations can be detected without loss of global serializability and local site autonomy [Kim 92]. The difficulties described above are examined in detail in Example 2.

Example 2: Consider an MDBS where the strict timestamp ordering (STO) scheme and the strict two phase locking (S2PL) scheme are used in LDBS1 and LDBS2, respectively. Assume that all local and global transactions are submitted through the GDD to the LDBSs. To simplify the discussion, assume that data are updated immediately, that a data item is read before it is written, and that the S2PL rule is used for global concurrency control. Let G1 and G2 be global transactions. Let Gij be a subtransaction of Gi executed at LDBSj. Let L1 and L2 be local transactions submitted to LDBS1 and LDBS2, respectively. Let a and b be data items stored at LDBS1, and c and d be data items stored at LDBS2. Let E1 and E2 be possible execution schedules at LDBS1 and LDBS2, respectively. Assume the timestamp order is TS(G11) < TS(L1) < TS(G21) for the transactions submitted to LDBS1.

G1: rg1(b) wg1(b) rg1(d),
G11: rg11(b) wg11(b),
G12: rg12(d),
G2: rg2(a) rg2(c) wg2(c),
G21: rg21(a),
G22: rg22(c) wg22(c),
L1: rl1(a) wl1(a) rl1(b),
L2: rl2(d) wl2(d) rl2(c),
E1: rl1(a) wl1(a) rg11(b) wg11(b) rl1(b) rg21(a),
E2: rl2(d) wl2(d) rg22(c) wg22(c) rl2(c) rg12(d).

Figure 1(i) shows the access graph for the schedule E1, where 'T ← x' implies that transaction T has written data item x but has not committed yet, whereas 'x ← T' implies that T is waiting for the commitment of another active transaction that has already written x. This access graph represents the serialization order among transactions. After the operations in E1 are scheduled in order by the S2PL rule at the global level, G21 must wait for L1 to commit, since L1 has written (locked) data item a but has not committed yet, and L1 must in turn wait for G11 to commit. Fortunately, however, there is no deadlock in LDBS1. On the other hand, such an access graph is not constructed in LDBS1 with the STO scheme.

Figure 1(ii) shows the local wait-for graph (LWFG) for the schedule E2, where 'T ← x' implies that data item x is currently held by transaction T, while 'x ← T' implies that T is waiting for a lock on x, which is currently held by another transaction. This LWFG represents the blocking status of transactions. After the operations in E2 are scheduled in order, G12 must wait for L2 to release its lock on d, and L2 must in turn wait for G22 to release its lock on c. Fortunately, however, there is no deadlock in LDBS2. On the other hand, although the LWFG may be constructed in LDBS2, it is not provided to the GDD, since local site autonomy should be preserved.

At this point, there is no deadlock at either site. Nevertheless, the two global transactions G1 and G2 cannot proceed any further. In LDBS1, L1 and G21 are not allowed to proceed until G11 commits, due to the strictness property of the STO scheme. In LDBS2, L2 and G12 are not allowed to proceed until G22 commits, due to the strictness property of the S2PL scheme. However, unless G21 at LDBS1 and G12 at LDBS2 are allowed to proceed, neither G1 nor G2 can commit, and thus G11 and G22 are not able to commit. Consequently, G1 and G2 may be blocked permanently, which leads to a deadlock situation as in Figure 1(iii). Besides, if G21 and G12 are in the waiting state in the LDBSs, neither of them can be aborted by the MDBS, for the preservation of local site autonomy. •

As indicated in Example 2, the LWFG at each site is unavailable to the GDD, but it must be available for global deadlock detection. So, in our global deadlock detection mechanism, the GDD itself must construct the LWFG at each site, unlike in the traditional transaction wait-for graph mechanism, where the GDD acquires such LWFGs from the LDBSs [Davi 92]. Also, the GDD must combine all LWFGs to construct the global WFG (GWFG). (See Theorem 1)
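The situation of Example 2 can be replayed in a minimal sketch of a detector that records wait-for edges per site and searches the combined graph for a cycle. Several things here are assumptions made for illustration only: the class `GlobalDeadlockDetector` and its methods are hypothetical names, subtransactions G11 and G12 are folded into the single node G1 (likewise G21 and G22 into G2) to express that a global transaction commits only when all its subtransactions finish, and a standard depth-first search stands in for whatever cycle check an implementation would use.

```python
from collections import defaultdict


def has_cycle(edges):
    """Return True if the directed graph of (waiter, holder) pairs has a cycle."""
    graph = defaultdict(set)
    for waiter, holder in edges:
        graph[waiter].add(holder)
    WHITE, GREY, BLACK = 0, 1, 2
    colour = defaultdict(int)  # every node starts WHITE (unvisited)

    def visit(node):
        colour[node] = GREY  # node is on the current DFS path
        for nxt in graph[node]:
            if colour[nxt] == GREY:  # back edge: a wait-for cycle
                return True
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK  # fully explored, no cycle through it
        return False

    return any(colour[n] == WHITE and visit(n) for n in list(graph))


class GlobalDeadlockDetector:
    """Hypothetical GDD that builds each site's wait-for graph itself."""

    def __init__(self):
        self.lwfg = defaultdict(set)  # site -> set of (waiter, holder) edges

    def record_wait(self, site, waiter, holder):
        self.lwfg[site].add((waiter, holder))

    def local_deadlock(self, site):
        return has_cycle(self.lwfg[site])

    def global_deadlock(self, sites=None):
        # Combine the chosen per-site graphs (all of them by default).
        sites = list(self.lwfg) if sites is None else sites
        union = set().union(*(self.lwfg[s] for s in sites))
        return has_cycle(union)


gdd = GlobalDeadlockDetector()
gdd.record_wait("LDBS1", "G2", "L1")  # G21 waits for L1 to commit
gdd.record_wait("LDBS1", "L1", "G1")  # L1 waits for G11 to commit
gdd.record_wait("LDBS2", "G1", "L2")  # G12 waits for L2 to release d
gdd.record_wait("LDBS2", "L2", "G2")  # L2 waits for G22 to release c

print(gdd.local_deadlock("LDBS1"), gdd.local_deadlock("LDBS2"))  # False False
print(gdd.global_deadlock())           # True
print(gdd.global_deadlock(["LDBS1"]))  # False
```

Under these assumptions each site's graph is acyclic on its own, the combined graph contains the cycle G2, L1, G1, L2, G2, and leaving out either site's edges hides that cycle.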
[Figure 1: Relationship graphs between transactions. (i) Access graph in LDBS1; (ii) LWFG in LDBS2; (iii) Composition graph of global view]

Theorem 1: For global deadlock detection without loss of global serializability and local site autonomy in MDBSs, the GDD itself must construct the LWFG at each site and must combine all such LWFGs to construct the GWFG.
Proof: Let T_{1k} and T_{2k} be transactions submitted to site k, for 1 ≤ k ≤ n. Let '→' imply a wait-for relationship and '⇒' imply a transitive wait-for relationship that consists of one or more local wait-for relationships at a site. Here, a transitive relationship graph T_{1k} ⇒ T_{2k} may represent a WFG T_{1k} → T_i → T_{(i+1)} → ... → T_{(i+j)} → T_{2k} at site k, for i, j > 0.

(i) Let us prove that the GDD must combine all LWFGs to construct the GWFG. Consider a global deadlock situation in which there is the LWFG T_{1k} ⇒ T_{2k} → T_{1(k+1)} at each site k, where k+1 implies 1 when k reaches n, for 1 ≤ k ≤ n. Since the LWFG T_{1k} ⇒ T_{2k} → T_{1(k+1)} is acyclic at each site k, no deadlock is detected at any site. Nevertheless, no transaction can proceed any further, for there is the global deadlock cycle represented by the GWFG

T_{11} ⇒ T_{21} → T_{12} ⇒ T_{22} → ... → T_{1k} ⇒ T_{2k} → T_{1(k+1)} ⇒ ... → T_{1n} ⇒ T_{2n} → T_{11}.

Such a GWFG is the union of all LWFGs. However, if even one LWFG T_{1k} ⇒ T_{2k} → T_{1(k+1)} is missing from the GWFG, the global deadlock cycle cannot be detected. Consequently, for global deadlock detection, the GDD must combine all LWFGs to construct the GWFG.

(ii) Let us prove that the GDD itself must construct the LWFG at each site. In MDBSs, the LDBS at each site k need not send its LWFG T_{1k} ⇒ T_{2k} → T_{1(k+1)} to the GDD, since local site autonomy should be essentially preserved. To make matters worse, if such an LWFG is not available to the GDD, the GDD cannot find the transitive relationship between T_{1k} and T_{2k}, represented by T_{1k} ⇒ T_{2k}, at site k when a local transaction L introduces an indirect wait-for relationship between T_{1k} and T_{2k}, represented by T_{1k} → L → T_{2k}. So, global deadlock cycle detection becomes impossible. To solve this problem, the GDD itself must construct the LWFG T_{1k} ⇒ T_{2k} → T_{1(k+1)} at each site k. Fortunately, this becomes possible by submitting all local and global transactions through the GDD to the LDBSs. •

In our mechanism, the preemption of transactions blocked in an LDBS is not required, since the certification on deadlock occurrences can be accomplished before a transaction is submitted to the LDBS. Moreover, the GDD can be implemented on top of each LDBS without modification of the LDBSs. Therefore, local site autonomy is preserved. On the other hand, the GDD does not affect the global serializability of the global concurrency control scheme.

IV. Conclusions

In this paper, we proposed a novel global deadlock detection mechanism for concurrency control in MDBSs, where the GDD itself constructs the LWFG at each site. Our mechanism achieves a high degree of concurrency without loss of global serializability and local site autonomy, but at the expense of maintaining the transaction wait-for graphs. In particular, this mechanism is noteworthy in the sense that there has been no deadlock detection mechanism except one that fails to preserve local site autonomy.

References
[Davi 92] B. David and G. Jane, "Distributed Database Systems," Addison-Wesley Publishing Company, Inc., 1992.
[Glig 86] A. Gligor and R. Popescu-Zeletin, "Transaction Management in Distributed Heterogeneous Database Management Systems," Information Systems, Vol. 11, No. 4, 1986, pp. 287-297.
[Kim 92] Y. S. Kim, "Atomic Transaction Scheduling in Tightly Coupled Heterogeneous Distributed Databases," Ph.D. Thesis, Department of Information and Communications Engineering, Korea Advanced Institute of Science and Technology, Seoul, Korea, 1992.
[Moon 90] S. C. Moon and W. Kim, "Update Synchronization in Heterogeneous Distributed Databases," TR-90-50, Department of Computer Science, Korea Advanced Institute of Science and Technology, Seoul, Korea, 1990.
[Sugi 87] K. Sugihara, "Concurrency Control Based on Distributed Cycle Detection," IEEE Proceeding of 3rd International Conference on Data Engineering, Los Angeles, California, U.S.A., February 1987, pp. 267-274.
[Vidy 91] K. Vidyasanker, "A Non-Two-Phase Locking Protocol for Global Concurrency Control in Distributed Heterogeneous Database Systems," IEEE Transactions on Knowledge and Data Engineering, Vol. 3, No. 2, June 1991, pp. 256-260.