Simplified proof of the Fourier Sampling Theorem


Information Processing Letters 75 (2000) 139–143

Peter Høyer ¹
BRICS, Department of Computer Science, University of Aarhus, DK-8000 Århus C, Denmark

Received 15 December 1999; received in revised form 2 June 2000
Communicated by P.M.B. Vitányi

Abstract

We give a short and simple proof of Hales and Hallgren’s Fourier Sampling Theorem (Proceedings 31st Annual ACM Symp. Theory of Computing, ACM Press, 1999). The transparency of our proof-technique allows us to generalize and tighten their result. © 2000 Elsevier Science B.V. All rights reserved.

Keywords: Quantum computing; Fourier sampling; Analysis of algorithms

E-mail address: [email protected] (P. Høyer).
¹ BRICS is an acronym for Basic Research in Computer Science, Centre of the Danish National Research Foundation.

1. Introduction

In recent years, the theory of quantum computing has been greatly developed and expanded. Two of the most striking results in the area are Grover’s algorithm for searching [3] and Shor’s algorithms for factoring and finding discrete logarithms [6]. For an excellent introduction to quantum computing, see for example [2].

Any quantum algorithm works on a finite-dimensional Hilbert space H. Two types of operations are allowed: the first is unitary operators on H, the second is measurements of the whole or parts of the system. Since we are interested in the computational complexity of the algorithms, we restrict the allowed operations to those that can be implemented efficiently.

One of the primary operators used in the quantum algorithms developed so far is the quantum Fourier transform. Two of its main uses are to set up a quantum system in an initial state and to perform quantum Fourier sampling. The quantum Fourier transform is actually not a single operator, but a family of operators. One can define Fourier transforms for any finite group G. If the group G is Abelian, then there exists exactly one Fourier transform for G, and if G is non-commutative, then there are infinitely many Fourier transforms for G.

For every integer n > 1, the quantum Fourier transform over the cyclic group $\mathbb{Z}_n$ is defined by

$$F_n = \frac{1}{\sqrt{n}} \sum_{i,j=0}^{n-1} \omega_n^{ij}\, |i\rangle\langle j|, \tag{1}$$

where $\omega_n = \exp(2\pi\sqrt{-1}/n)$ denotes the nth principal root of unity. Quantum Fourier sampling over $\mathbb{Z}_n$ is, given a superposition $|u\rangle = \sum_{i=0}^{n-1} u_i |i\rangle$, apply the Fourier transform $F_n$ and measure the resulting superposition [1]. The measurement induces a discrete probability distribution D over the possible outcomes


$\{0, 1, \ldots, n-1\}$, where the probability of outcome $i$ is $|\langle i|F_n|u\rangle|^2$.

We have here adopted the Dirac notation that is commonly used in quantum algorithms. The “ket” notation $|\cdot\rangle$ is used to identify vectors from the Hilbert space, and the “bra” notation $\langle\cdot|$ is similarly used to identify functionals from the dual space. For the purpose of this paper, one may think of these objects in terms of matrices. Then the ket $|i\rangle$ can be identified with the n × 1 column vector of all zeroes but a one at the ith entry. Similarly, the bra $\langle j|$ can be identified with the 1 × n row vector of all zeroes but a one at the jth entry. Please see [2] for further information.

In many quantum algorithms, including Shor’s celebrated algorithms for factoring and discrete logarithms [6], quantum Fourier sampling is an essential ingredient. Unfortunately, quantum Fourier sampling often involves at least one of two difficulties: either we do not know the order n of the group over which we would like to perform the sampling, or the order n is known but has large prime factors, complicating efficient implementations of the Fourier transform [5].

Earlier work has overcome these two difficulties using an intriguing idea of Shor [6]: instead of performing quantum Fourier sampling over $\mathbb{Z}_n$, perform quantum Fourier sampling over $\mathbb{Z}_m$ for some m sufficiently large compared to n. This idea, however, adds complications to the analysis of the modified algorithm; for example, one then has to show that the relevant data is still attainable via sampling from the modified distribution D′.

Recently, Hales and Hallgren [4] proposed a general technique for circumventing such complications. They showed that, for any input state $|u\rangle$, the original distribution D is contained in the modified distribution D′ by restriction. This allows us to sample from D via sampling from D′. We first explain the notation involved and then we state their theorem.

Let 1 < N < M be integers. For any integer $0 \le i < N$, let $i' = \lfloor iM/N + 1/2 \rfloor$ denote a closest integer to $iM/N$, and set $\delta_i = i' - iM/N$. Note that $|\delta_i| \le 1/2$. Given an input state $|u\rangle = \sum_{i=0}^{N-1} u_i |i\rangle$, set $|v\rangle = F_N|u\rangle$ and $|w\rangle = F_M|u\rangle$. Let $D_v : \{0, \ldots, N-1\} \to [0, 1]$ denote the probability distribution induced by measuring $|v\rangle$, that is, $D_v(i) = |\langle i|v\rangle|^2$. Define the probability distribution $D_w : \{0, \ldots, M-1\} \to [0, 1]$ similarly. Let $D_{w'} : \{0, \ldots, N-1\} \to [0, 1]$ denote the probability distribution defined by $D_{w'}(i) = c \cdot D_w(i')$, where $c = \bigl(\sum_{i=0}^{N-1} D_w(i')\bigr)^{-1}$ is the normalization factor. Thus, we obtain the distribution $D_{w'}$ by restricting $D_w$ to outcomes j for which $j = i'$ for some $0 \le i < N$, and then relabeling $i'$ by $i$. Finally, for any two probability distributions D and D′ over $\{0, \ldots, N-1\}$, let $|D - D'| = \sum_{i=0}^{N-1} |D(i) - D'(i)|$ denote their total variation distance.

Theorem 1 (Hales and Hallgren). For any polynomial s(n), there exists a polynomial t(n) such that for all integers $N \le 2^n$ and $M \ge t(n)N$, and all input states $|u\rangle = \sum_{i=0}^{N-1} u_i |i\rangle$, we have

$$|D_v - D_{w'}| \le \frac{1}{s(n)},$$

where $|v\rangle = F_N|u\rangle$ and $|w\rangle = F_M|u\rangle$.

Hales and Hallgren’s theorem says that, for all $0 \le i < N$, we can match the probability that measuring $|v\rangle$ yields $i$ with the probability that measuring $|w\rangle$ yields $i'$, up to a global normalization factor. In the next section, we give a short proof of their theorem, or rather, we give a short proof of a generalization of their theorem. We improve upon their result in two ways. Firstly, we show that we can also match the amplitudes, not only the probabilities, and secondly, we show that in Theorem 1, it suffices to pick t(n) to be on the order of s(n)n.

The applications of Hales and Hallgren’s theorem are many. For instance, it allows a simplified proof of Shor’s theorem for factoring (see [4, Section 3]). When applying their theorem, we would use the Fourier transform $F_M$ instead of $F_N$. We set up the input state $|u\rangle$, apply $F_M$ and then measure the system. We repeat this experiment until the measurement produces an outcome j such that $j = i'$ for some $0 \le i < N$. When that happens, we output $i$ and stop. By Theorem 2 below, the expected number of repetitions is on the order of M/N. Furthermore, by Theorem 2 below, it suffices to pick M to be on the order of $N \log_2(N)$, in which case the expected number of repetitions is on the order of $\log_2(N)$.
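The notation and the repeat-until-accepted procedure described above can be made concrete with a small classical simulation. The following Python is a minimal sketch only: the helper names (`qft`, `rounded_index`, `restricted_distribution`, `sample_via_larger_transform`), the sizes N = 8 and M = 500, and the input state are assumptions chosen for illustration and do not come from the paper. The sketch applies $F_N$ and $F_M$ from Eq. (1) to the same state, restricts $D_w$ to the outcomes $i'$, and compares the result with $D_v$.

```python
import cmath
import math
import random


def qft(amplitudes, n):
    """Apply F_n from Eq. (1); amplitudes beyond len(amplitudes) count as 0.

    Naive O(n * len(amplitudes)) classical simulation, not a quantum circuit.
    """
    return [
        sum(u * cmath.exp(2j * cmath.pi * ((i * j) % n) / n)
            for j, u in enumerate(amplitudes)) / math.sqrt(n)
        for i in range(n)
    ]


def rounded_index(i, n_small, n_large):
    """Return i' = floor(i*M/N + 1/2) using exact integer arithmetic."""
    return (2 * i * n_large + n_small) // (2 * n_small)


def distribution(amplitudes):
    """Measurement probabilities |<i|state>|^2."""
    return [abs(a) ** 2 for a in amplitudes]


def total_variation(p, q):
    """Sum of absolute differences, written |D - D'| in the text."""
    return sum(abs(a - b) for a, b in zip(p, q))


def restricted_distribution(d_w, n_small, n_large):
    """Restrict D_w to the outcomes i' and renormalize.

    Returns the restricted distribution and the total accepted probability.
    """
    picked = [d_w[rounded_index(i, n_small, n_large)] for i in range(n_small)]
    accepted = sum(picked)
    return [p / accepted for p in picked], accepted


def sample_via_larger_transform(d_w, n_small, n_large, rng):
    """Simulate measuring F_M|u> until the outcome is some i'; return (i, tries)."""
    relabel = {rounded_index(i, n_small, n_large): i for i in range(n_small)}
    tries = 0
    while True:
        tries += 1
        j = rng.choices(range(n_large), weights=d_w)[0]
        if j in relabel:
            return relabel[j], tries


N, M = 8, 500                                  # illustrative sizes (assumption)
u = [1 / math.sqrt(3)] * 3 + [0.0] * (N - 3)   # arbitrary unit-norm input state
d_v = distribution(qft(u, N))
d_w = distribution(qft(u, M))
d_w_restricted, acceptance = restricted_distribution(d_w, N, M)

print("total variation distance:", total_variation(d_v, d_w_restricted))
print("acceptance probability:", acceptance, "compared with N/M =", N / M)
print("one sample (i, tries):", sample_via_larger_transform(d_w, N, M, random.Random(0)))
```

M is deliberately not a multiple of N here; when N divides M, every $\delta_i$ is zero and the two distributions coincide exactly, which makes the comparison uninformative.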


2. A simple proof

The key object in our proof is the operator

$$A = \sqrt{\frac{M}{N}}\; R\, F_M F_N^{-1}, \tag{2}$$

where R denotes the combined projection and permutation defined by

$$R = \sum_{i=0}^{N-1} |i\rangle\langle i'|.$$

We use operator R to rephrase Hales and Hallgren’s theorem in terms of operators, and then using operator A, we give a simple proof.

Theorem 2. Let $N \ge 16$ and $s \ge 1$ be given. Then the following holds for all integers $M \ge s \cdot (12 N \log_2(N))$. Let $|u\rangle = \sum_{i=0}^{N-1} u_i |i\rangle$ be any normalized state. Let $|v\rangle = F_N|u\rangle$ and $|w'\rangle = cRF_M|u\rangle$ where c > 0 is the normalization factor such that $|w'\rangle$ has unit norm. Then

$$\bigl\| |v\rangle - |w'\rangle \bigr\| \le \frac{1}{s}, \qquad |D_v - D_{w'}| \le \frac{4}{s},$$

and

$$c = \sqrt{\frac{M}{N}} \left(1 + O\!\left(\frac{1}{s}\right)\right).$$

In the calculations to come, we use many inequalities and bounds. Several of these bounds are not tight as our primary aim is to give a simple and basic proof.

Operator A is not necessarily unitary, but it is linear and can be written as a sum $A = \sum_{i,j=0}^{N-1} a_{ij} |i\rangle\langle j|$ where $a_{ij} \in \mathbb{C}$ is given by

$$a_{ij} = \frac{1}{N} \sum_{k=0}^{N-1} \omega_N^{k(i-j)}\, \omega_M^{k\delta_i} = \frac{1}{N} \sum_{k=0}^{N-1} \omega_N^{k(i-j+\delta_i N/M)}. \tag{3}$$

Note that $|a_{ij}| \le 1$ for all $0 \le i, j < N$, that is, every coefficient has absolute value at most 1. The next lemma expresses that every diagonal element $a_{ii}$ is close to 1, whereas every off-diagonal element $a_{ij}$ ($i \ne j$) has small absolute value. The lemma is a variant of Claim 1 in [4].

Lemma 3. For operator $A = \sum_{i,j=0}^{N-1} a_{ij} |i\rangle\langle j|$ given by Eq. (2),

$$\mathrm{Re}(a_{ii}) \ge 1 - 5\left(\frac{N}{M}\right)^2, \qquad |a_{ij}| \le \frac{2}{|i-j|_N}\,\frac{N}{M} \quad (i \ne j),$$

where

$$|x|_N = \begin{cases} x \bmod N & \text{if } (x \bmod N) \le N/2, \\ (-x) \bmod N & \text{if } (x \bmod N) > N/2. \end{cases}$$

Proof. First consider the diagonal element $a_{ii}$ for some $0 \le i < N$. Since $|\delta_i| \le 1/2$, for all $0 \le k < N$ we have $\mathrm{Re}(\omega_M^{k\delta_i}) \ge \cos(\pi N/M) \ge 1 - 5(N/M)^2$. Thus, by Eq. (3), it follows that $\mathrm{Re}(a_{ii}) \ge 1 - 5(N/M)^2$.

Now consider the off-diagonal element $a_{ij}$ for some $0 \le i, j < N$ with $i \ne j$. If $\delta_i = 0$ then $a_{ij} = 0$, so suppose otherwise. Using that the rightmost sum in Eq. (3) is a geometric series, rewrite

$$a_{ij} = \frac{1}{N} \sum_{k=0}^{N-1} \omega_N^{k(i-j+\delta_i N/M)} = \frac{1}{N}\, \frac{1 - \omega_M^{\delta_i N}}{1 - \omega_N^{\,i-j+\delta_i N/M}}.$$

We upper bound the absolute value of the numerator in the above expression on the right, $|1 - \omega_M^{\delta_i N}| \le \pi \frac{N}{M}$. To lower bound the absolute value of the denominator, write

$$\left|1 - \omega_N^{\,i-j+\delta_i N/M}\right| = 2\left|\sin\!\left(\pi\frac{i-j}{N} + \delta_i\frac{\pi}{M}\right)\right| \ge \left|\sin\!\left(\pi\frac{i-j}{N} + \delta_i\frac{\pi}{M}\right)\right| \ge \left|\sin\!\left(\pi\frac{i-j}{N}\right)\right| - \frac{\pi}{2}\,\frac{1}{M}.$$

For any real number x, we have that $|\sin(\pi x)| \ge 2|x|$ if $0 \le |x| \le 1/2$ and, by symmetry, that $|\sin(\pi x)| \ge 2(1 - |x|)$ if $1/2 \le |x| \le 1$. It follows that the absolute value of the denominator is lower bounded by

$$\frac{2}{N}\,|i-j|_N - \frac{\pi}{2M},$$

allowing us to conclude that

$$|a_{ij}| \le \frac{2}{|i-j|_N}\,\frac{N}{M}$$

provided $M \ge 8N$. □
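The two bounds of Lemma 3 can be checked numerically for a small instance. The following Python is a rough sketch under stated assumptions: the function names and the sizes N = 16, M = 131 are hypothetical choices for illustration (M is chosen to satisfy $M \ge 8N$ and not to be a multiple of N, so that most $\delta_i$ are nonzero). It evaluates $a_{ij}$ term by term from the rightmost sum in Eq. (3).

```python
import cmath


def rounded_index(i, n_small, n_large):
    """Return i' = floor(i*M/N + 1/2) using exact integer arithmetic."""
    return (2 * i * n_large + n_small) // (2 * n_small)


def coefficient(i, j, n_small, n_large):
    """a_ij as the rightmost sum of Eq. (3), evaluated term by term."""
    delta = rounded_index(i, n_small, n_large) - i * n_large / n_small
    exponent = i - j + delta * n_small / n_large
    total = sum(
        cmath.exp(2j * cmath.pi * k * exponent / n_small) for k in range(n_small)
    )
    return total / n_small


def cyclic_abs(x, n):
    """|x|_N as defined in Lemma 3."""
    r = x % n
    return r if r <= n / 2 else (-x) % n


def check_lemma3(n_small, n_large, tolerance=1e-12):
    """Return True if every a_ij satisfies the bounds stated in Lemma 3."""
    ratio = n_small / n_large
    for i in range(n_small):
        for j in range(n_small):
            a = coefficient(i, j, n_small, n_large)
            if i == j:
                if a.real < 1 - 5 * ratio ** 2 - tolerance:
                    return False
            else:
                bound = 2 * ratio / cyclic_abs(i - j, n_small)
                if abs(a) > bound + tolerance:
                    return False
    return True


print("Lemma 3 bounds hold for N=16, M=131:", check_lemma3(16, 131))
```

The small tolerance only absorbs floating-point rounding; it is not part of the lemma.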

Lemma 3 tells us that operator A acts as the identity $I = \sum_{i=0}^{N-1} |i\rangle\langle i|$, modulo some error terms. To analyze the “damage” caused by those error terms, write A = I + E. Let $E = \sum_{i,j=0}^{N-1} e_{ij} |i\rangle\langle j|$. Then

$$|e_{ii}|^2 = |\mathrm{Re}(a_{ii}) - 1|^2 + |\mathrm{Im}(a_{ii})|^2 = 1 + |a_{ii}|^2 - 2\,\mathrm{Re}(a_{ii}) \le 10\left(\frac{N}{M}\right)^2,$$

since $|a_{ii}| \le 1$ by Eq. (3). Hence $|e_{ii}| \le \sqrt{10}\,(N/M)$ and

$$|e_{ij}| \le \frac{2}{|i-j|_N}\,\frac{N}{M} \quad \text{for } i \ne j.$$
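The decomposition A = I + E can also be examined directly from the operator product in Eq. (2). The Python below is a minimal sketch, assuming illustrative sizes N = 16 and M = 1000 and hypothetical helper names; it multiplies out $\sqrt{M/N}\, R F_M F_N^{-1}$ entry by entry, subtracts the identity, and tests the entrywise bounds on E stated above.

```python
import cmath
import math


def root_power(n, k):
    """omega_n ** k for an integer k, reducing k modulo n first for accuracy."""
    return cmath.exp(2j * cmath.pi * (k % n) / n)


def operator_a(n_small, n_large):
    """Matrix of A = sqrt(M/N) R F_M F_N^{-1}, multiplied out entry by entry.

    Row i of R F_M is row i' of F_M, and the factors sqrt(M/N), 1/sqrt(M)
    and 1/sqrt(N) combine into the single factor 1/N.
    """
    rows = []
    for i in range(n_small):
        i_rounded = (2 * i * n_large + n_small) // (2 * n_small)  # i'
        rows.append([
            sum(root_power(n_large, i_rounded * k) * root_power(n_small, -k * j)
                for k in range(n_small)) / n_small
            for j in range(n_small)
        ])
    return rows


def cyclic_abs(x, n):
    """|x|_N: distance from x to the nearest multiple of N."""
    r = x % n
    return min(r, n - r)


N, M = 16, 1000                      # illustrative sizes (assumption)
A = operator_a(N, M)
E = [[A[i][j] - (1 if i == j else 0) for j in range(N)] for i in range(N)]
ratio = N / M

diagonal_ok = all(abs(E[i][i]) <= math.sqrt(10) * ratio for i in range(N))
off_diagonal_ok = all(
    abs(E[i][j]) <= 2 * ratio / cyclic_abs(i - j, N) + 1e-12
    for i in range(N) for j in range(N) if i != j
)
print("diagonal bound holds:", diagonal_ok)
print("off-diagonal bound holds:", off_diagonal_ok)
```

Building A from the product rather than from Eq. (3) gives an independent route to the same coefficients, so the two computations can be compared if desired.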

Lemma 4. For all states $|v\rangle$ of unit norm,

$$\bigl\| E|v\rangle \bigr\| \le 3\,\frac{N}{M}\,\bigl(2 + \log_2(N)\bigr).$$

Proof. We prove this lemma by rephrasing it in terms of matrices and vectors, and then introducing a vector norm and its induced matrix norm, which we can easily bound. Hence, consider E as an N × N matrix $(e_{ij})_{i,j=0}^{N-1}$, and let Norm(·) denote the matrix norm defined by

$$\mathrm{Norm}(B) = \max\bigl\{ \|Bx\|_2 : \|x\|_2 = 1 \bigr\},$$

where $\|x\|_2 = (x^* \cdot x)^{1/2}$ denotes the Euclidean norm of the N × 1 column vector x, and where $x^*$ denotes the Hermitian adjoint of x. Then, clearly, we have that $\|E|v\rangle\| \le \mathrm{Norm}(E)$, and thus it suffices to upper bound the matrix norm of E. For this, note that $\mathrm{Norm}(E) \le \mathrm{Norm}(|E|)$ where |E| denotes the matrix obtained by replacing each entry of E with its absolute value. Observe that, by Lemma 3, we have $\mathrm{Norm}(|E|) \le \mathrm{Norm}(C)$ where $C = (c_{ij})_{i,j=0}^{N-1}$ with

$$c_{ij} = \begin{cases} 6\,\dfrac{N}{M} & \text{if } i = j, \\[2mm] \dfrac{2}{|i-j|_N}\,\dfrac{N}{M} & \text{otherwise.} \end{cases}$$

Matrix C is circulant with positive real-valued entries and hence its norm equals the sum of any row or any column, $\mathrm{Norm}(C) = \sum_{j=0}^{N-1} c_{1j} = \sum_{i=0}^{N-1} c_{i1}$. We upper bound the leftmost sum:

$$\sum_{j=0}^{N-1} c_{1j} = \frac{N}{M}\left(6 + 2\sum_{j=1}^{N-1} \frac{1}{|j|_N}\right) \le \frac{N}{M}\left(6 + 4\sum_{j=1}^{\lfloor N/2 \rfloor} \frac{1}{j}\right) \le \frac{N}{M}\bigl(6 + 4\ln(N)\bigr).$$

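Since $4\ln(N) \le 3\log_2(N)$, the lemma follows. □

Lemma 4 quantifies that operator A is close to the identity. That bound implies a bound on the distance between the two states before and after applying A.

Lemma 5. Let $|v\rangle = \sum_{i=0}^{N-1} v_i |i\rangle$ be any normalized state. Let $|w'\rangle = \frac{1}{\sqrt{b}} A|v\rangle$ where $b = \|A|v\rangle\|$. Assume $\|E|v\rangle\| \le 1/2$. Then $|w'\rangle$ has unit norm and

$$\bigl\| |v\rangle - |w'\rangle \bigr\| \le \frac{5}{2}\,\bigl\| E|v\rangle \bigr\|.$$

Furthermore, the normalization factor is bounded by

$$1 - \frac{1}{2}\bigl\| E|v\rangle \bigr\| \le \frac{1}{\sqrt{b}} \le 1 + \bigl\| E|v\rangle \bigr\|. \tag{4}$$

Proof. By definition, $b = \|A|v\rangle\| = \|(I + E)|v\rangle\|$, so $1 - \|E|v\rangle\| \le b \le 1 + \|E|v\rangle\|$. Eq. (4) follows since we assume $\|E|v\rangle\| \le 1/2$. Let $|y\rangle = |v\rangle - |w'\rangle$. Then

$$|y\rangle = \left(I - \frac{1}{\sqrt{b}}(I + E)\right)|v\rangle = \left(1 - \frac{1}{\sqrt{b}}\right)|v\rangle - \frac{1}{\sqrt{b}}\,E|v\rangle.$$

Hence,

$$\bigl\| |y\rangle \bigr\| \le \left|1 - \frac{1}{\sqrt{b}}\right| + \frac{1}{\sqrt{b}}\,\bigl\| E|v\rangle \bigr\| \le \frac{5}{2}\,\bigl\| E|v\rangle \bigr\|. \qquad \Box$$

Finally, we require a fundamental result of Bernstein and Vazirani [1], saying that if the distance between any two states is small then their induced probability distributions are close.

Lemma 6 [1, Lemma 3.6]. Let $|v\rangle$ and $|w'\rangle$ be two normalized states with $\| |v\rangle - |w'\rangle \| \le \varepsilon$. Then the total variation distance between the probability distributions resulting from measurements of $|v\rangle$ and $|w'\rangle$ is at most 4ε, $|D_v - D_{w'}| \le 4\varepsilon$.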
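The chain of inequalities in Lemmas 4, 5 and 6 can be observed on a concrete instance. The following Python is a sketch under explicit assumptions: the sizes N = 16 and M = 777 (slightly above $12 N \log_2(N) = 768$ and not a multiple of N), the random test state, and all helper names are illustrative choices, and the sketch rescales $A|v\rangle$ by dividing by its Euclidean norm so that the resulting vector has unit length. It prints each measured quantity next to the bound it is compared with.

```python
import cmath
import math
import random


def root_power(n, k):
    """omega_n ** k for an integer k, reducing k modulo n first for accuracy."""
    return cmath.exp(2j * cmath.pi * (k % n) / n)


def operator_a(n_small, n_large):
    """Entries a_ij of A, using the first sum in Eq. (3) with omega_M^(k i')."""
    rows = []
    for i in range(n_small):
        i_rounded = (2 * i * n_large + n_small) // (2 * n_small)  # i'
        rows.append([
            sum(root_power(n_large, i_rounded * k) * root_power(n_small, -k * j)
                for k in range(n_small)) / n_small
            for j in range(n_small)
        ])
    return rows


def apply(matrix, vector):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def norm(vector):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in vector))


N, M = 16, 777                       # illustrative sizes (assumption)
rng = random.Random(1)
raw = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(N)]
v = [x / norm(raw) for x in raw]     # random unit-norm state playing the role of |v>

a_v = apply(operator_a(N, M), v)
d = norm([y - x for x, y in zip(v, a_v)])             # ||E|v>|| since E = A - I
w_prime = [y / norm(a_v) for y in a_v]                # A|v> rescaled to unit norm
distance = norm([x - y for x, y in zip(v, w_prime)])
tv = sum(abs(abs(x) ** 2 - abs(y) ** 2) for x, y in zip(v, w_prime))

lemma4_bound = 3 * (N / M) * (2 + math.log2(N))
print("||E|v>||     =", d, " bound from Lemma 4:", lemma4_bound)
print("||v - w'||   =", distance, " (5/2)||E|v>|| =", 2.5 * d)
print("|D_v - D_w'| =", tv, " 4 * ||v - w'|| =", 4 * distance)
```

The measured values are typically well below the bounds, which is consistent with the remark that several of the bounds in this section are not tight.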

The bound of Lemma 6 holds no matter what basis is used for the measurements. Theorem 2 follows immediately by composing Lemmata 4, 5, and 6.

Proof of Theorem 2. Write

$$|w'\rangle = cRF_M|u\rangle = \frac{1}{\sqrt{b}}\,A|v\rangle,$$

where $c = \frac{1}{\sqrt{b}}\sqrt{M/N}$. By Lemma 4, we have

$$d = \bigl\| E|v\rangle \bigr\| \le 3\,\frac{N}{M}\,\bigl(2 + \log_2(N)\bigr) \le \frac{9N}{2M}\,\log_2(N).$$

Since we assume $M \ge s \cdot 12 N \log_2(N)$, we have $d \le \frac{2}{5}(1/s) \le \frac{1}{2}$, so by Lemma 5, $\| |v\rangle - |w'\rangle \| \le \frac{5}{2} d \le 1/s$ and $|1 - \frac{1}{\sqrt{b}}| \le d \le 1/s$, and, finally, by Lemma 6, $|D_v - D_{w'}| \le 4(1/s)$. □

3. Discussion

Our Theorem 2 generalizes Hales and Hallgren’s theorem in two ways. Firstly, if we want to apply Fourier sampling over $\mathbb{Z}_N$, then, by Theorem 2, it suffices to be able to implement the Fourier transform $F_M$ for some M which is only a log(N) factor larger than N. Thus, for such an M, we only need log log(N) additional qubits to implement $F_M$ instead of implementing $F_N$. Secondly, in Theorem 2, not only are the distributions $D_v$ and $D_{w'}$ close, but so are the states $|v\rangle$ and $|w'\rangle$ just prior to the final measurement.

In the setup studied by Hales and Hallgren, we are given a state $|u\rangle$ on which we want to apply a Fourier transform $F_N$ which is immediately followed by a measurement of the system. Now, suppose we do not want to measure the state $F_N|u\rangle$, but instead first apply some other operations on it and then measure it. In that case, we cannot apply Hales and Hallgren’s theorem, but we can apply Theorem 2. The reason is that Hales and Hallgren’s theorem says that the distributions are close, not that the states themselves are close, as stated in Theorem 2.

Acknowledgement

I am very grateful to Sean Hallgren for helpful discussions. I would like to thank an anonymous referee for a careful reading and for many constructive suggestions, and Gilles Brassard and Joan Boyar for comments.

References

[1] E. Bernstein, U. Vazirani, Quantum complexity theory, SIAM J. Comput. 26 (1997) 1411–1473.
[2] R. Cleve, An introduction to quantum complexity theory, in: C. Macchiavello, G.M. Palma, A. Zeilinger (Eds.), Collected Papers on Quantum Computation and Quantum Information Theory, World Scientific, Singapore, 2000. Also available at Los Alamos National Laboratory e-Print archive as .
[3] L.K. Grover, Quantum mechanics helps in searching for a needle in a haystack, Phys. Rev. Lett. 79 (1997) 325–328.
[4] L. Hales, S. Hallgren, Quantum Fourier sampling simplified, in: Proc. 31st Annual ACM Symposium on Theory of Computing, ACM Press, New York, 1999, pp. 330–338.
[5] A.Yu. Kitaev, Quantum measurements and the Abelian stabilizer problem, 1995. Available at Los Alamos National Laboratory e-Print archive as .
[6] P. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM J. Comput. 26 (1997) 1484–1509.