Molecular solution to the 0–1 knapsack problem based on DNA computing


Applied Mathematics and Computation 187 (2007) 1033–1037 www.elsevier.com/locate/amc

Majid Darehmiraki *, Hasan Mishmast Nehi

Department of Mathematics, University of Sistan and Baluchestan, Zahedan, Iran

Abstract

Many combinatorial optimization problems are known to be NP-complete. A common point of view is that finding fast algorithms for such problems using a polynomial number of processors is unlikely. However, results of this kind are usually established for "worst-case" situations, and in practice many instances of NP-complete problems are successfully solved in polynomial time by traditional combinatorial optimization techniques such as dynamic programming and branch-and-bound. New opportunities for the effective solution of combinatorial problems emerged with the advent of parallel machines. In this paper, we describe an algorithm that generates an optimal solution for the 0–1 integer knapsack problem using DNA computing.

© 2006 Elsevier Inc. All rights reserved.

Keywords: DNA computing; 0–1 Knapsack problem; Knapsack problem

* Corresponding author.

0096-3003/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.amc.2006.09.020

1. Introduction

The tools of molecular biology were first used to solve an instance of the directed Hamiltonian path problem by Adleman in 1994. A small graph was encoded in molecules of DNA, and the operations of the computation were performed with standard protocols and enzymes. This experiment demonstrated the feasibility of carrying out computations at the molecular level [1]. The result showed not only that DNA can be used to solve a computationally difficult problem, but also the potential power of parallel, high-density computation by DNA molecules in solution. This parallelism allows DNA computers to solve hard problems, such as NP-complete problems, in linearly increasing time, compared with the exponentially increasing time required by a Turing machine.

In this paper, we solve the 0–1 knapsack problem by DNA computing. The proposed approach involves the incubation of the solution space that contains the "sticker" DNA strands. Adleman [4] introduced an abstract model of molecular computation, called the "sticker model", which has random access memory to exploit information encoding. Its hybridization error rate is lower than that of the Adleman–Lipton model. The sticker model breaks the strand into bit strings, in which each bit comprises several DNA molecules. A sticker is complementary to one and only one bit string. A DNA-based algorithm is developed to demonstrate a general solution based on the operations defined in Adleman's experiments [2].

The rest of this paper is organized as follows. Section 2 introduces the 0–1 knapsack problem, and Section 3 presents the DNA algorithm for it. Conclusions are drawn in Section 4, along with a summary of the results.

2. Binary knapsack problem (BKP)

Knapsack problems have been studied intensively in the last decade, attracting both theorists and practitioners. The theoretical interest arises mainly from their simple structure, which, on the one hand, allows the exploitation of a number of combinatorial properties and, on the other hand, allows more complex optimization problems to be solved through a series of knapsack-type subproblems. From a practical point of view, these problems can model many industrial situations (e.g., capital budgeting, cargo loading, and cutting stock) as well as most classical applications.

Suppose that we want to fill a knapsack by selecting some objects (generally called items) from among various objects. There are n different items available, and each item j has a weight w_j and a profit p_j. The knapsack can hold a weight of at most W. The problem is to find an optimal subset of items that maximizes the total profit subject to the knapsack's weight capacity. The profits, weights, and capacity are positive integers. Let x_j be binary variables defined as follows:

\[
x_j = \begin{cases} 1 & \text{if item } j \text{ is selected,} \\ 0 & \text{otherwise.} \end{cases}
\]

The knapsack problem can be formulated mathematically as follows:

\[
\begin{aligned}
\max\ & \sum_{j=1}^{n} p_j x_j, \\
\text{s.t.}\ & \sum_{j=1}^{n} w_j x_j \le W, \\
& x_j = 1 \text{ or } 0, \quad j = 1, 2, \ldots, n.
\end{aligned}
\]

This is known as the 0–1 knapsack problem. It is a pure integer program with a single constraint and forms a very important class of integer programming problems.

3. DNA computing for the solution of the BKP

DNA is a high-molecular-weight compound. Its basic unit consists of one phosphate group, one deoxyribose sugar, and one nitrogenous base. There are four kinds of nitrogenous bases: adenine (A), guanine (G), cytosine (C), and thymine (T). DNA is a double helix consisting of two single-stranded deoxynucleotide chains running in an antiparallel configuration. Since the precise structure of DNA was determined, many experimental methods have been invented, including annealing, amplifying, melting, separating, cutting, and ligating, which can be used to help discover the mechanisms of information storage and output. The basic assumptions are that the data can be encoded in DNA strands and are error-free, and that molecular biological technologies can perform all computational operations. The models of DNA computing are based on different combinations of the following biological operations on DNA strands:

1. Melting/annealing: break apart/bond together two single DNA strands with complementary sequences.
2. Synthesis of a desired DNA strand of polynomial length.
3. Separation of the strands by length.
4. Merging: pour two (or more) test tubes into one.
5. Extraction: extract the strands that contain a given pattern as a substring.
6. Amplifying: make copies of DNA strands by using the polymerase chain reaction (PCR).
7. Polymerization: transform a single strand that has a portion of double-stranded subsequence into an entire double-stranded molecule.
8. Cutting: cut DNA strands by using restriction enzymes.
9. Ligation: paste DNA strands with complementary sticky ends by using ligases.
10. Substitution: substitute, insert, or delete DNA sequences by using PCR site-specific oligonucleotide mutagenesis.
11. Marking single strands by hybridization.
12. Destroying the marked strands.
13. Detection: given a tube, check whether it contains at least one DNA strand.
14. Number: given a tube, count how many DNA strands it contains.
15. Clear: given a test tube T and a particular bit position p, logically turn bit position p "off" (value 0) on every strand in tube T.

These operations are used to write "molecular programs", whose input is a tube of DNA strands or molecules and whose output is "yes", "no", or a (set of) tube(s) [3].

3.1. Algorithm

The algorithm proceeds as follows:

• Step 1: Define the strand.
• Step 2: Generate the solution space.
• Step 3: Eliminate infeasible solutions that do not comply with the constraint.
• Step 4: Calculate the objective function value for each strand.
• Step 5: Compare the objective function values of all strands.
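The five steps above can be mimicked in ordinary software, which may help when checking the molecular procedures on small instances. The following is a minimal sketch, not the authors' DNA procedure: it enumerates the solution space sequentially, whereas the DNA algorithm is meant to process all strands in parallel. The function name and the sample data are hypothetical and chosen only for illustration.

```python
from itertools import product


def solve_knapsack_by_enumeration(profits, weights, capacity):
    """Brute-force 0-1 knapsack that follows the five-step outline.

    Step 1 (define the strand): a candidate is a tuple of n bits.
    """
    n = len(weights)

    # Step 2: generate the whole solution space (all 2^n bit tuples).
    solution_space = product((0, 1), repeat=n)

    # Step 3: eliminate candidates that violate the weight constraint.
    feasible = [
        x for x in solution_space
        if sum(w * xi for w, xi in zip(weights, x)) <= capacity
    ]

    # Step 4: compute the objective value of every remaining candidate.
    scored = [(sum(p * xi for p, xi in zip(profits, x)), x) for x in feasible]

    # Step 5: compare the objective values and keep the maximum.
    # The all-zero candidate is always feasible, so `scored` is never empty.
    return max(scored)


# Hypothetical instance: four items, capacity 6.
best_value, best_x = solve_knapsack_by_enumeration(
    profits=[10, 7, 12, 4], weights=[3, 2, 4, 1], capacity=6
)
print(best_value, best_x)  # 21 (1, 1, 0, 1)
```

The sequential version takes time exponential in n, which is exactly the cost the DNA approach tries to hide by holding every candidate in the tube at once.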

Each step is explained below. In the procedures, X^v_p denotes the value sequence representing value v at bit position p of a strand.

Step 1: Each strand is subdivided into eight non-overlapping regions, which are described in Table 1.

Table 1
Region no.   Bit position       Logical meaning                Value domain mapping
1            1 to nk            A'i's existing                 0: not existing, 1: existing
2            k(n + 1)           Bm's binary value              0, 1
3            k(n + 1) + n       Xi's existing                  0: not existing, 1: existing
4            k(2n + 1) + n      AiXi value pair                0, 1
5            k(2n + 1) + 2n     Σ AiXi                         0, 1
6            k(2n + 2) + 3n     Flag area for bit carry-over   0, 1
7            k(3n + 2) + 3n     Ci's binary value              0, 1
8            3k(n + 1) + 4n     CX                             0, 1

Step 2: Initially, a strand is poured into tube T0 to represent the problem constraint. The following procedure then produces the BKP solution space. (Note: for each of the n variables Xk (k = 1, ..., n), two distinct "value sequences" of 20 nucleotides are assigned to represent the values "1" and "0".)

Procedure Init(T0, n, k)
• T+ = T− = {∅}
• For i = 1 to n
•   Amplify(T0, T+, T−)
•   Append(T+, X^1_{(n+1)k+i})
•   Append(T−, X^0_{(n+1)k+i})
•   Merge(T0, T+, T−)
• Next i

Step 3: In this stage we first convert Ax to a binary number, then compare it with the right-hand-side value b, and finally delete the infeasible solutions. Here A and b correspond to the weights w_j and the capacity W of Section 2, and c to the profits p_j. To do this, the following procedure first appends a value to every strand for each i: a_i if x_i = 1, and a_0 if x_i = 0, where a_0 is a k-bit binary number whose value is zero (the parameter a is k(n + 1)).

Procedure process_value(T0, n, a)
• For i = 1 to n
•   Extract(T0, X^1_{i+a}, Tyes, Tno)
•   Append(Tyes, a_i)
•   Append(Tno, a_0)
•   Merge(T0, Tyes, Tno)
• Next i

The second stage of converting Ax to a binary number is to initialize the positions k(2n + 1) + n to k(2n + 2) + 3n, which is done by the following procedure (with pos = k(2n + 1) + n and len = 2n + k):

Procedure Init_value(T0, len, pos)
• For i = 1 to len
•   Append(T0, X_{pos+i})
• Next i

The last stage of converting Ax to a binary number is the parallel_add procedure, which is passed the parameters (c, a, b, f) = (k, k(n + 1) + n, n(2 + 2k) + k, n(3 + 2k) + 2k):

Procedure parallel_add(T0)
• For i = 1 to n
•   For j = 0 to (c − 1)
•     Extract(T0, X^1_{(a+k*i)−j}, Tp, Tq)
•     Extract(Tp, X^1_{b−j}, Ta, Tb)
•     Set(Tb, X_{b−j})
•     Clear(Ta, X_{b−j})
•     Set(Ta, X_{f−(j+1)})
•     Merge(T0, Ta, Tb, Tq)
•   Next j
•   For j = 0 to (n + k + 1)
•     Extract(T0, X^1_{f−j}, Tf1, Tf0)
•     Extract(Tf1, X^1_{b−j}, TA1, TA0)
•     Set(TA0, X_{b−j})
•     Clear(TA0, X_{f−j})
•     Clear(TA1, X_{b−j})
•     Set(TA1, X_{f−(j+1)})
•     Clear(TA1, X_{f−j})
•     Merge(T0, TA1, TA0, Tf0)
•   Next j
• Next i

The procedure above is also used to compute the value of the objective function. The last procedure of Step 3 is the following, which validates each strand in tube T0:

Procedure comparator(T0, n, k)
• Amplify(T0, T4)
• For i = 1 to k
•   Extract(T4, X^1_{k(2n+1)+n}, T1, T3)
•   Extract(T1, X^0_{nk+i}, Tdrop, T4)
•   Extract(T3, X^1_{nk+i}, T0k, T4)
• Next i
• Merge(T0, T0k, T4)

Step 4: The parallel_add procedure is executed to sum the objective function cx. Before parallel_add is executed, the strands c_i are appended to tube T0, as governed by X_i, by executing the procedure Append_Ci(T0, n, k).

Procedure Append_Ci(T0, n, k)
• For i = 1 to n
•   Extract(T0, X^0_{(n+1)k+i}, Tpara0, Tpara1)
•   Append(Tpara0, c_0)
•   Append(Tpara1, c_i)
•   Merge(T0, Tpara0, Tpara1)
• Next i

In the procedure above, c_0 is a k-bit binary number whose value is zero. The procedure parallel_add then computes the sum of the objective function cx.

Step 5: The procedure parallel_comparator(T0, n, k) is applied to determine which strands have the maximum value by checking all strands in tube T0.

Procedure parallel_comparator(T0, n, k)
• s = 0
• Do
•   s++
•   Extract(T0, X^1_{a+s}, Tans, T0)
•   If number(Tans) = 1 exit
•   If number(Tans) > 1 then
•     For i = s + 1, ..., n
•       Extract(Tans, X^1_{a+i}, Ti, Tans)
•       If number(Ti) = 1 exit
•       Amplify(Ti, Tans)
•       If number(Ti) > 1 then
•         wash(Tans)
•         Amplify(Ti, Tans)
•     Next i
• Until number(Tans) > 0
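The two inner loops of parallel_add are easier to follow when simulated on a handful of strands. The sketch below is an illustration under stated assumptions, not the paper's encoding: a strand is modeled as a Python dictionary, and bits are addressed by hypothetical keys ("a", i, j) for bit j of the i-th appended value, ("s", j) for bit j of the running sum, and ("f", j) for the carry flag at bit j, with j = 0 as the least significant bit, instead of the absolute bit positions a, b, and f used in the procedure. The helper names extract, set_bit, clear_bit, and merge stand in for the Extract, Set, Clear, and Merge operations.

```python
from itertools import product


def extract(tube, key):
    """Split a tube into (strands whose bit `key` is 1, all other strands)."""
    on = [s for s in tube if s.get(key, 0) == 1]
    off = [s for s in tube if s.get(key, 0) != 1]
    return on, off


def set_bit(tube, key):
    for strand in tube:
        strand[key] = 1


def clear_bit(tube, key):
    for strand in tube:
        strand[key] = 0


def merge(*tubes):
    return [strand for tube in tubes for strand in tube]


def parallel_add(tube, n, addend_bits, sum_bits):
    """Add the n appended values into the sum register of every strand."""
    for i in range(n):
        # First loop: add bit by bit without carrying; record each
        # carry as a flag one position higher.
        for j in range(addend_bits):
            t_p, t_q = extract(tube, ("a", i, j))
            t_a, t_b = extract(t_p, ("s", j))
            set_bit(t_b, ("s", j))        # 0 + 1 gives 1
            clear_bit(t_a, ("s", j))      # 1 + 1 gives 0 ...
            set_bit(t_a, ("f", j + 1))    # ... and a carry
            tube = merge(t_a, t_b, t_q)
        # Second loop: resolve the flags from low to high bits.
        for j in range(sum_bits):
            t_f1, t_f0 = extract(tube, ("f", j))
            t_a1, t_a0 = extract(t_f1, ("s", j))
            set_bit(t_a0, ("s", j))       # carry absorbed here
            clear_bit(t_a0, ("f", j))
            clear_bit(t_a1, ("s", j))     # carry moves one bit up
            set_bit(t_a1, ("f", j + 1))
            clear_bit(t_a1, ("f", j))
            tube = merge(t_a1, t_a0, t_f0)
    return tube


def make_strand(x, values, addend_bits):
    """Append value a_i when x_i = 1 and zero otherwise (as in process_value)."""
    strand = {}
    for i, (xi, ai) in enumerate(zip(x, values)):
        appended = ai if xi else 0
        for j in range(addend_bits):
            strand[("a", i, j)] = (appended >> j) & 1
    return strand


def read_sum(strand, sum_bits):
    return sum(strand.get(("s", j), 0) << j for j in range(sum_bits))


# Hypothetical data: three 3-bit values and a 5-bit sum register.
values = [5, 3, 7]
assignments = list(product((0, 1), repeat=len(values)))
strands = [make_strand(x, values, 3) for x in assignments]
parallel_add(list(strands), len(values), addend_bits=3, sum_bits=5)
sums = [read_sum(s, 5) for s in strands]
```

After the call, each strand's sum register holds the total of the values selected by its bits, so `sums` matches the directly computed totals for all eight assignments. The sketch assumes the sum register is wide enough that no carry leaves its top bit.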

4. Conclusions

Because conventional computers have obvious limits in storage, speed, intelligence, and miniaturization, methods of DNA computation have arisen, valued especially for their efficient parallelism. Before practical problems can be solved, some issues in biological technology still need further study. In this article, we present a DNA computing model for solving the binary knapsack problem; the time complexity of the proposed algorithm is O(n · k).

References

[1] L. Adleman, Molecular computation of solutions to combinatorial problems, Science 266 (1994) 1021–1024.
[2] W.Y. Chung, P.C. Chih, R.W. Kee, Molecular solutions to the binary integer programming problem based on DNA computing, Biosystems 83 (2005) 56–66.
[3] Z. Tiina, Formal models of DNA computing: a survey, Proc. Estonian Acad. Sci. Phys. Math. 49 (2) (2000) 90–99.
[4] L. Adleman, Computing with DNA, Sci. Am. 279 (1998) 34–40.