Reverse–Complement Similarity Codes


Electronic Notes in Discrete Mathematics 21 (2005) 281–282 www.elsevier.com/locate/endm

Arkadii D’yachkov (a), David Torney (b), Pavel Vilenkin (a), Scott White (b)

(a) Moscow State University, Faculty of Mechanics & Mathematics, Department of Probability Theory, Moscow, 119899, Russia.

(b) MS K710, Los Alamos National Laboratory, Los Alamos, NM, 87545, USA.

We discuss a general notion of a similarity function between two sequences, based on their common subsequences. This notion arises in some applications in molecular biology [7]. We introduce the concept of similarity codes and study the logarithmic asymptotics of the size of optimal codes. Our mathematical results, announced in [6], correspond to the longest common subsequence (LCS) similarity function [1], which leads to a special subclass of these codes called reverse-complement (RC) similarity codes. RC codes for additive similarity functions have been studied in previous papers [2], [3], [4], [5].

Key Words: sequences, subsequences, similarity, DNA sequences, codes, code distance, rate of codes, insertion-deletion codes.
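As an illustration of the objects named above, the following Python sketch computes the LCS similarity of two DNA strings and the reverse complement of a string, then checks a candidate code against a similarity threshold. The abstract does not state the formal definition of an RC similarity code, so the condition tested in `is_rc_similarity_code` is an assumption modeled on common usage in the DNA-codes literature (the similarity between distinct codewords, and between any codeword and the reverse complement of any codeword, is bounded). All function names and the threshold parameter `max_similarity` are hypothetical.

```python
from itertools import combinations

# Watson-Crick complement over the DNA alphabet {A, C, G, T}.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def lcs_length(x: str, y: str) -> int:
    """Length of the longest common subsequence of x and y.

    Classic dynamic programme in O(len(x) * len(y)) time, keeping one row.
    """
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def reverse_complement(x: str) -> str:
    """Complement every base, then reverse the string."""
    return x.translate(COMPLEMENT)[::-1]


def is_rc_similarity_code(code: list[str], max_similarity: int) -> bool:
    """Check an ASSUMED RC-code condition (not taken from the paper):

    1. LCS(x, y) <= max_similarity for all distinct codewords x, y;
    2. LCS(x, rc(y)) <= max_similarity for all codewords x, y,
       including the case x == y.
    """
    for x, y in combinations(code, 2):
        if lcs_length(x, y) > max_similarity:
            return False
    for x in code:
        for y in code:
            if lcs_length(x, reverse_complement(y)) > max_similarity:
                return False
    return True
```

Under this assumed condition, a codeword that equals its own reverse complement, such as `ACGT`, can only belong to a code whose threshold is at least the codeword length, because its LCS similarity with its own reverse complement is maximal.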

References

[1] V.I. Levenshtein, Binary codes capable of correcting deletions, insertions, and reversals, J. Soviet Phys. Doklady, 10, 707–710, 1966.

[2] A.G. D’yachkov and D.C. Torney, On similarity codes, IEEE Trans. Inform. Theory, Vol. 46, No. 4, 1558–1564, 2000.

[3] A.G. D’yachkov, D.C. Torney, P.A. Vilenkin, and P.S. White, Reverse-complement similarity codes for DNA sequences, Proc. of ISIT–2000, Sorrento, Italy, July 2000.

[4] P.A. Vilenkin, Some asymptotic problems of combinatorial coding theory and information theory (in Russian), Ph.D. dissertation, Moscow State University, 2000.

[5] V.V. Rykov, A.J. Macula, C.M. Korzelius, D.C. Engelhart, D.C. Torney, and P.S. White, DNA sequences constructed on the basis of quaternary cyclic codes, Proc. of 4-th World Multiconference on Systemics, Cybernetics and Informatics, Orlando, Florida, USA, July 2000.

[6] A.G. D’yachkov, D.C. Torney, P.A. Vilenkin, and P.S. White, On a class of codes for the insertion-deletion metric, Proc. of ISIT–2002, Lausanne, Switzerland, July 2002.

[7] A.G. D’yachkov, P.L. Erdős, A.J. Macula, V.V. Rykov, D.C. Torney, C.S. Tung, P.A. Vilenkin, and P.S. White, Exordium for DNA codes, Journal of Combinatorial Optimization, Vol. 7, No. 4, 2003.

1571-0653/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.endm.2005.07.044