Theoretical Computer Science 552 (2014) 109–111
Note
A note on sparse solutions of sparse linear systems

Chen Yuan, Haibin Kan *

Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China
Article history: Received 29 January 2014; received in revised form 24 July 2014; accepted 31 July 2014; available online 13 August 2014. Communicated by V.Th. Paschos.

Keywords: Sparse vector; Hitting set
Abstract. A vector with at most k nonzeros is called k-sparse. Given a linear system Ax = b in which the rows of A are r-sparse, Damaschke [1] showed that enumerating the nonnegative k-sparse solutions is fixed-parameter tractable, and presented an algorithm with an O*(r^{2k}) time bound. Since this problem is closely related to the hitting set problem, he raised the question of whether this bound can be improved to O*(r^k). This paper investigates the problem and shows that a refined analysis of a modified version of his algorithm leads to the smaller time bound O*((4r)^k).

© 2014 Elsevier B.V. All rights reserved.
1. Introduction

Given a nonnegative m × n matrix A = (a_ij) and a vector b of length m, our aim is to find a nonnegative solution x of Ax = b with at most k nonzero entries. The matrix A is called r-sparse if each of its rows has at most r nonzero entries.

Sparse linear systems have applications in many domains. In [1] the motivation comes from biology: the columns of A correspond to candidate proteins and the rows to peptides, and entry a_ij is the number of occurrences of peptide i in protein j; these entries are mostly zero. The real-valued vector b records the measured amounts of the peptides, and a nonnegative solution x indicates which proteins are present in the mixture, and in what amounts. Finding a sparsest solution of a linear system is NP-hard in general [2,3].

This problem is closely related to a widely investigated problem, the hitting set problem: given a ground set X of elements and a collection T of subsets of X, find a smallest subset H of X that hits every subset in T. The reduction is as follows. Row i of A represents a subset T_i in T, with a_ij = 1 if element j belongs to T_i and a_ij = 0 otherwise, and b is the all-ones vector. Suppose x is a k-sparse solution of this sparse linear system, and put element j into H whenever x_j > 0. Since b_i > 0, row i must contain a nonzero entry a_ij with x_j > 0, so H hits every subset in T. As a result, we obtain a hitting set of size at most k.

Since a simple algorithm attains the time bound O*(r^k) for the hitting set problem, it is natural to ask whether the same bound holds for sparse linear systems. Damaschke [1] presents two algorithms with time bounds O*(r^k k!) and O*(r^{2k}), respectively, and further improves the latter slightly by a tree-structure argument. We show that the bound can be improved to O*((4r)^k).

2. Main result

Before giving our analysis, we briefly review the algorithm of [1]. Damaschke defines a minimal feasible column set of the matrix A.
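Before turning to the definition, the hitting-set reduction from the introduction can be made concrete with a short script (a minimal sketch; the instance and names below are illustrative, not taken from [1]):

```python
# A hypothetical hitting-set instance: ground set {0, 1, 2, 3},
# collection T of subsets to hit.
T = [{0, 1}, {1, 2}, {2, 3}]
n = 4

# Reduction from the text: row i of A is the 0/1 indicator vector of T_i,
# and b is the all-ones vector.
A = [[1 if j in Ti else 0 for j in range(n)] for Ti in T]
b = [1] * len(T)

def is_solution(x):
    """Check A x = b for a candidate vector x."""
    return all(sum(a * v for a, v in zip(row, x)) == bi
               for row, bi in zip(A, b))

# A nonnegative 2-sparse solution of this instance...
x = [0, 1, 0, 1]
assert is_solution(x)

# ...whose support hits every subset in T, exactly as argued above:
# each row i reads sum over T_i of x_j = 1 > 0, so some x_j > 0 in T_i.
H = {j for j, v in enumerate(x) if v > 0}
assert all(H & Ti for Ti in T)
```

Note that only the direction used in the text holds in general: a k-sparse solution yields a hitting set of size at most k, while a hitting set need not yield a solution of Ax = b, since an element hitting several sets contributes to several rows at once.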
* Corresponding author. E-mail address: [email protected] (H. Kan). DOI: 10.1016/j.tcs.2014.07.029.

A feasible column set C is a column set for which there exists a nonnegative solution x to the sparse linear system such that
x_j is positive if and only if column j belongs to C. A minimal feasible column set C is a feasible column set such that every feasible column set D ⊆ C satisfies D = C. It has been shown that the columns of a minimal feasible column set C are linearly independent. As a result, we only need to look for linearly independent column sets C of size at most k.

Theorem 1. (See [1].) Given a linear system Ax = b where all rows of A are r-sparse, we can enumerate all minimal feasible sets of size at most k in O*(r^{2k}) time.

Proof. Damaschke [1] introduces auxiliary equations Qx = s, where the rows of Q are linearly independent; the construction of Q and s is determined by the algorithm. For a column set C and a row set B, A[B, C] denotes the submatrix of A restricted to the rows in B and the columns in C; similarly, A[C] is the submatrix of A restricted to the columns in C, and x[C] is the vector x restricted to the entries corresponding to C. Initially, C is empty. At each branching step, the algorithm first checks whether
A[C] · x[C] = b
Q[C] · x[C] = s    (1)
has a nonnegative solution; this can be decided in polynomial time by linear programming. If so, C is a minimal feasible set. Otherwise, the algorithm continues: we either expand C by adding a new column j, or extend Q[C] · x[C] = s by appending a new equation.

Assume that Eq. (1) has no nonnegative solution. Then there is some row i with a nonzero entry outside C; otherwise C covers all nonzero entries of A, and C is a dead end. Let E be the set of columns j such that a_ij > 0. We either append some column j ∈ E to C, or append the equation A[i, C] x[C] = b_i to Qx = s. The size of E is at most r, the maximum number of nonzero entries in a row. Note that adding a column index j to C means that we allow entry x_j of the solution x to be nonzero, so the number of nonzero entries of the solution increases by 1. Adding a new equation to Qx = s means that we do not allow any new nonzero entry of row i to be appended to C; in this case the number of nonzero entries of the solution does not increase, while the number of rows of Q increases by 1. Once a row has contributed an equation to Qx = s, it is not selected for branching again. The branching number is therefore r + 1.

If we append a new equation to Qx = s, the new equation must be linearly independent of the previous equations there. Because Q[C] has |C| columns, its rank is at most |C|. Consequently, we can add at most k linearly independent rows to Q. Therefore, the size of C is at most k and Q has at most k linearly independent rows, so the search tree has depth at most 2k and the algorithm achieves the time bound O*((r + 1)^{2k}). This bound can be improved to O*(r^{2k}) by showing that every row has at most r − 1 nonzero entries outside C. □

The following example shows how the algorithm proceeds.

Example 1. Consider the sparse linear system:
x1 + x2 = 2
x2 + 2x3 = 1
x1 + x3 = 1    (2)
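Before tracing the branching by hand, the minimal feasible sets of system (2) can be computed by brute force over small column sets (a reference sketch, not the branching algorithm of [1]; it relies on the fact noted above that a minimal feasible set has linearly independent columns, so the restricted system has at most one candidate solution):

```python
from fractions import Fraction
from itertools import combinations

def solve_exact(A, b):
    """Gaussian elimination over the rationals. Returns the unique solution
    of A x = b when the columns of A are linearly independent and the system
    is consistent; otherwise returns None."""
    m, n = len(A), len(A[0])
    M = [[Fraction(v) for v in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        piv = next((r for r in range(col, m) if M[r][col] != 0), None)
        if piv is None:
            return None  # dependent columns: cannot be a minimal feasible set
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(m):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [v - f * w for v, w in zip(M[r], M[col])]
    # remaining rows must reduce to 0 = 0, else the system is inconsistent
    if any(M[r][n] != 0 for r in range(n, m)):
        return None
    return [M[r][n] for r in range(n)]

def minimal_feasible_sets(A, b, k):
    """All minimal feasible column sets of size at most k, by brute force
    over column subsets (0-based indices: column 0 is x1, and so on)."""
    n = len(A[0])
    feasible = []
    for size in range(1, k + 1):
        for C in combinations(range(n), size):
            sub = [[row[j] for j in C] for row in A]
            x = solve_exact(sub, b)
            # feasible: x_j > 0 exactly for the columns in C
            if x is not None and all(v > 0 for v in x):
                feasible.append(set(C))
    # keep only sets with no feasible proper subset
    return [C for C in feasible if not any(D < C for D in feasible)]

# System (2): x1 + x2 = 2, x2 + 2*x3 = 1, x1 + x3 = 1.
A = [[1, 1, 0], [0, 1, 2], [1, 0, 1]]
b = [2, 1, 1]
print(minimal_feasible_sets(A, b, 2))  # → [{0, 1}], i.e. {x1, x2}
```

On system (2) this reports the single minimal feasible set {x1, x2} (as 0-based indices {0, 1}), matching the branching trace that follows.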
Find the minimal feasible sets of size at most 2 in this system; note that k = 2 and r = 2. At the beginning of the algorithm, C is empty. Because there is then no nonnegative solution, we pick a row arbitrarily; assume we pick row 1. Since C is empty, we can only append x1 or x2 to C. We append x2 and solve the linear equations:
x2 = 2
x2 = 1
0 = 1

Note that only the entries indexed by C may be nonzero. Obviously, this has no nonnegative solution. Since row 2 has the column x3 outside C, we either append x3 to C or append the equation x2 = 1 to Qx = s. For the former choice, we solve the linear equations:
x2 = 2
x2 + 2x3 = 1
x3 = 1

Obviously, this has no nonnegative solution, and the algorithm stops branching here since |C| = k = 2. For the latter choice, we solve the linear equations:
x2 = 2
x2 = 1
0 = 1
x2 = 1

Note that the fourth equation comes from the auxiliary system Qx = s. Obviously, this has no nonnegative solution. We then select row 3 and append x1 to C = {x2}, updating the equations:
x1 + x2 = 2
x2 = 1
x1 = 1
x2 = 1
This system has the nonnegative solution x1 = x2 = 1 and x3 = 0.

Building on this algorithm, Damaschke [1] presents an improved branching algorithm that slightly lowers the bound to O*((r − 1 + 2/(r − 1))^{2k}). We investigate the algorithm more carefully and obtain a better time bound.

Theorem 2. For systems Ax = b where all rows of A are r-sparse, we can enumerate all minimal feasible sets of size at most k in O*((4r)^k) time.

Proof. We slightly modify the algorithm to simplify the discussion. The modified algorithm processes the rows from top to bottom. If Eq. (1) has a nonnegative solution, C is a minimal feasible set. Otherwise, we pick the lowest index j such that row j has a nonzero entry outside C and has not already been appended to Q. We then have two options: append a column to C, or append the equation A[j, C] x[C] = b_j to Qx = s. If the row A[j, C] is linearly dependent on the rows already in Q, the algorithm only makes the former choice.

We bound the number of leaves of the search tree of this algorithm. Consider sequences x1, x2, ..., x_{2k} of length 2k consisting of k integers chosen from {1, 2, ..., r} and k zeros; the total number of such sequences is at most (2k choose k) · r^k ≤ 4^k r^k = (4r)^k. Recall that E is the set of columns j with a_ij > 0 in the selected row. Since E has at most r elements, we label each element of E with a unique number from {1, 2, ..., r}, and let the number 0 stand for the option of appending a new equation to Qx = s. At each branching step we either select a column from E, which corresponds to a number in {1, ..., r}, or add a new equation to Qx = s, which corresponds to 0, so two different branches at a node correspond to different numbers. We conclude that different leaves of the search tree are mapped to different sequences, and hence the number of leaves is at most the number of sequences.
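As a quick numerical sanity check (not part of the proof), the counting bound (2k choose k) · r^k ≤ (4r)^k, i.e. (2k choose k) ≤ 4^k, can be verified for small parameters:

```python
from math import comb

# Each leaf of the search tree maps to a length-2k sequence with
# k symbols from {1, ..., r} and k zeros; there are comb(2k, k) * r**k
# such sequences, and comb(2k, k) <= 4**k.
for k in range(1, 12):
    assert comb(2 * k, k) <= 4 ** k
    for r in range(2, 8):
        assert comb(2 * k, k) * r ** k <= (4 * r) ** k
```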
Therefore, we obtain the time bound O*((4r)^k). □

We now compare our bound with the bounds in [1]; it is not always better. When r ≥ 5, our bound is smaller than O*((r − 1 + 2/(r − 1))^{2k}). Damaschke [1] also gives the bound O*(k! r^k), which our bound beats when k ≥ 9. In summary, when r ≥ 5 and k ≥ 9, O*((4r)^k) is better than every bound in [1].

Acknowledgements

We wish to thank the anonymous reviewers for valuable comments which greatly improved the quality of the paper. This work was supported in part by the National Natural Science Foundation of China under Grant 61170208, by the Shanghai Key Program of Basic Research under Grant 12JC1401400, and by the National Defense Basic Research Project under Grant JCYJ-1408.

References

[1] P. Damaschke, Sparse solutions of sparse linear systems: fixed-parameter tractability and an application of complex group testing, Theoret. Comput. Sci. 511 (2013) 137–146.
[2] M.R. Garey, D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, New York, 1979.
[3] B.K. Natarajan, Sparse approximate solutions to linear systems, SIAM J. Comput. 24 (1995) 227–234.