Theoretical Computer Science 551 (2014) 102–115
Size lower bounds for quantum automata ✩
Maria Paola Bianchi, Carlo Mereghetti ∗, Beatrice Palano
Dipartimento di Informatica, Università degli Studi di Milano, via Comelico 39, 20135 Milano, Italy
Article info
Article history: Received 24 December 2013; received in revised form 14 April 2014; accepted 6 July 2014; available online 15 July 2014. Communicated by M. Hirvensalo.
Keywords: Quantum finite automata; Descriptional complexity

Abstract
We compare the descriptional power of quantum finite automata with control language (qfcs) and deterministic finite automata (dfas). By suitably adapting Rabin’s technique, we show how to convert any given qfc to an equivalent dfa, incurring an at most exponential increase in size. This enables us to state a lower bound on the size of qfcs, which is logarithmic in the size of equivalent minimal dfas. In turn, this result yields analogous size lower bounds for several models of quantum finite automata in the literature. © 2014 Elsevier B.V. All rights reserved.
1. Introduction

While we can hardly expect to see a full-featured quantum computer in the near future, it is reasonable to envision classical computing devices incorporating quantum components. Since the physical realization of quantum systems has proved to be a complex task, it is also reasonable to keep quantum components as “small” as possible. Small size quantum devices are modeled by quantum finite automata (qfas), a theoretical model for quantum machines with finite memory. Thus, it is well worth investigating, from a theoretical point of view, lower limits to the size of qfas when performing certain tasks, also emphasizing trade-offs with the size of equivalent classical devices.

Originally, two models of qfas were proposed: measure-once qfas [10,21], where the probability of accepting words is evaluated by “observing” just once, at the end of input processing, and measure-many qfas [3,17], where such an observation is performed after each move. Several modifications of these two original models, motivated by different possible physical realizations, were then proposed. Thus, e.g., enhanced [22], reversible [12], Latvian [2], and measure-only qfas [8] were introduced. Results in the literature (see, e.g., [2,5,11,18]) show that all these quantum variants are strictly less powerful than deterministic finite automata (dfas), although retaining a higher descriptional power (i.e., their sizes can be significantly smaller than those of their equivalent classical devices).

To enhance the low computational power of these “purely quantum” systems, hybrid models featuring both a quantum and a classical component have been studied. Examples of such hybrid systems, all reaching the same computational power as classical automata, are qfas with open time evolution [13], qfas with quantum and classical states (qcfas) [28], and qfas with control language (qfcs) [5,19]. Here, we are interested in this latter model which, roughly speaking, can be described as follows.
✩ Partially supported by the Italian MIUR under the project PRIN-2010LYA9RH_005 “PRIN: Automi e Linguaggi Formali: Aspetti Matematici e Applicativi.” A preliminary version of this work was presented at the 11th Int. Conf. Unconventional Computation & Natural Computation (UCNC 2013), Milano, Italy, July 1–5, 2013.
∗ Corresponding author. E-mail addresses: [email protected] (M.P. Bianchi), [email protected] (C. Mereghetti), [email protected] (B. Palano).
http://dx.doi.org/10.1016/j.tcs.2014.07.004 0304-3975/© 2014 Elsevier B.V. All rights reserved.

A qfc A can be regarded as a computational device having a quantum processor, namely a qfa, controlled by a dfa. The state of the qfa is observed after each move by an observable with a fixed, but arbitrary, set of possible outcomes. On any given
input word x, a sequence y of outcomes is generated with a certain probability. The computation of A on x is accepting whenever y belongs to the regular language (the control language) recognized by the dfa. In [5,19], it is proved that the class of languages accepted with isolated cut point by qfcs coincides with the class of regular languages, and that qfcs can be exponentially smaller than their equivalent classical automata. It is worth noting that a relevant difference between qfcs and qcfas [28] lies in the communication policy between the two internal components: in qcfas, a two-way information exchange between the classical and quantum parts is established, while in qfcs only the quantum component affects the dynamics of the classical one.

A relevant feature of qfcs, of interest in this paper, is that they can naturally and directly simulate several models of qfas while preserving the size. This property makes qfcs a general unifying framework within which to investigate size results for different quantum paradigms: size lower bounds or size trade-offs proved for qfcs may directly apply to the simulated types of qfas as well. In fact, the need for a general quantum framework is witnessed by several results in the literature (see, e.g., [1,3,4,6,7,9,20,27]) showing that qfas can be exponentially more succinct than equivalent classical automata, by means of techniques which are typically targeted at the particular type of qfa and not easily adaptable to other paradigms.

So, to cope with this specialization problem, here we study size lower bounds and trade-offs for qfcs. After introducing some basic notions in Section 2, we show in Section 3 how to build from a given qfc an equivalent dfa. To this aim, we must suitably modify Rabin’s classical technique [24], since the equivalence relation we choose to define the state set of the dfa is not a congruence.
On the other hand, this relation, based on the Euclidean norm, allows us to directly estimate the cost of the conversion by a geometrical argument on compact spaces. We obtain that the size of the resulting dfa is at most exponentially larger than the size of the qfc. Stated in other terms in Section 4, this latter result directly implies that qfcs are at most exponentially more succinct than classical equivalent devices. Indeed, due to the generality of qfcs, this succinctness result transfers to other models of qfas simulated by qfcs, such as measure-once, measure-many, and reversible qfas. Additionally, we here show how qfcs are also able to simulate Latvian and measure-only qfas, thus providing size lower bounds even for these two models.

2. Preliminaries

2.1. Linear algebra

We quickly recall some notions of linear algebra, which are useful to describe quantum computing. For more details, we refer the reader to, e.g., [15,26]. The fields of real and complex numbers are denoted by $\mathbb{R}$ and $\mathbb{C}$, respectively. Given a complex number $z = a + ib$, we denote its real part, conjugate, and modulus by $z_R = a$, $z^* = a - ib$, and $|z| = \sqrt{zz^*}$, respectively. We denote by $\mathbb{C}^{n\times m}$ the set of $n \times m$ matrices with entries in $\mathbb{C}$. Given a matrix $M \in \mathbb{C}^{n\times m}$, for $1 \le i \le n$ and $1 \le j \le m$, we denote by $M_{ij}$ its $(i,j)$th entry. The transpose of $M$ is the matrix $M^T \in \mathbb{C}^{m\times n}$ satisfying $M^T_{ij} = M_{ji}$, while we let $M^*$ be the matrix satisfying $M^*_{ij} = (M_{ij})^*$. The adjoint of $M$ is the matrix $M^\dagger = (M^T)^*$.

For matrices $A, B \in \mathbb{C}^{n\times m}$, their sum is the $n \times m$ matrix $(A+B)_{ij} = A_{ij} + B_{ij}$. For matrices $C \in \mathbb{C}^{n\times m}$ and $D \in \mathbb{C}^{m\times r}$, their product is the $n \times r$ matrix $(CD)_{ij} = \sum_{k=1}^{m} C_{ik}D_{kj}$. For matrices $A \in \mathbb{C}^{n\times m}$ and $B \in \mathbb{C}^{p\times q}$, their direct sum and Kronecker (or tensor) product are the $(n+p)\times(m+q)$ and $np \times mq$ matrices defined, respectively, as follows:

$$A \oplus B = \begin{pmatrix} A & [0] \\ [0] & B \end{pmatrix}, \qquad A \otimes B = \begin{pmatrix} A_{11}B & \cdots & A_{1m}B \\ \vdots & \ddots & \vdots \\ A_{n1}B & \cdots & A_{nm}B \end{pmatrix},$$
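where $[0]$ denotes zero-matrices of suitable dimensions. When operations are allowed by matrix dimensions, we have $(A \otimes B)\cdot(C \otimes D) = AC \otimes BD$ and $(A \oplus B)\cdot(C \oplus D) = AC \oplus BD$.

A Hilbert space of dimension $n$ is the linear space $\mathbb{C}^{1\times n}$ of $n$-dimensional complex row vectors equipped with sum and product by elements in $\mathbb{C}$, in which the inner product $\langle\varphi,\psi\rangle = \varphi\psi^\dagger$ is defined, for $\varphi,\psi \in \mathbb{C}^{1\times n}$. From now on, for the sake of simplicity, we will write $\mathbb{C}^n$ instead of $\mathbb{C}^{1\times n}$. The norm of a vector $\varphi \in \mathbb{C}^n$ is given by $\|\varphi\| = \sqrt{\langle\varphi,\varphi\rangle}$. For vectors $\varphi \in \mathbb{C}^n$ and $\psi \in \mathbb{C}^m$, their direct sum is the vector $\varphi \oplus \psi = (\varphi_1,\dots,\varphi_n,\psi_1,\dots,\psi_m) \in \mathbb{C}^{n+m}$. We recall the following properties, for $\varphi,\psi,\xi,\zeta \in \mathbb{C}^n$ and $r \in \mathbb{R}$, which will turn out to be useful in our calculations:

$$\begin{aligned}
&\langle\varphi,\psi\rangle = \langle\psi,\varphi\rangle^* = \langle\psi^*,\varphi^*\rangle, \qquad r\langle\varphi,\psi\rangle = \langle r\varphi,\psi\rangle = \langle\varphi,r\psi\rangle,\\
&\|\varphi-\psi\|^2 = \|\varphi\|^2 + \|\psi\|^2 - 2\langle\varphi,\psi\rangle_R, \qquad |\langle\varphi,\psi\rangle| \le \|\varphi\|\,\|\psi\| \ \text{(Cauchy–Schwarz inequality)},\\
&\langle\varphi+\psi,\xi\rangle = \langle\varphi,\xi\rangle + \langle\psi,\xi\rangle, \qquad \langle\varphi\otimes\psi,\xi\otimes\zeta\rangle = \langle\varphi,\xi\rangle\langle\psi,\zeta\rangle, \qquad \langle\varphi\oplus\psi,\xi\oplus\zeta\rangle = \langle\varphi,\xi\rangle + \langle\psi,\zeta\rangle,\\
&\|\varphi\otimes\psi\| = \|\varphi\|\,\|\psi\|.
\end{aligned}$$

The angle between complex vectors $\varphi$ and $\psi$ is defined as (see, e.g., [25]):

$$\mathrm{ang}(\varphi,\psi) = \arccos\frac{\langle\varphi,\psi\rangle_R}{\|\varphi\|\,\|\psi\|}.$$

We say that $\varphi$ is orthogonal to $\psi$ if $\langle\varphi,\psi\rangle = 0$. Two subspaces $X, Y \subseteq \mathbb{C}^n$ are orthogonal if any vector in $X$ is orthogonal to any vector in $Y$. In this case, the linear space generated by $X \cup Y$ is denoted by $X \dotplus Y$.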
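As a quick sanity check of the notions recalled in this subsection, the following sketch evaluates the inner product, the norm, the Kronecker product of row vectors, and the angle on small examples. It is only an illustration under stated assumptions: the function names (`inner`, `norm`, `tensor`, `angle`) and the sample vectors are hypothetical and not part of the paper.

```python
import math

def inner(phi, psi):
    """Inner product <phi, psi> = phi * psi^dagger of two row vectors."""
    return sum(a * b.conjugate() for a, b in zip(phi, psi))

def norm(phi):
    """Euclidean norm ||phi|| = sqrt(<phi, phi>)."""
    return math.sqrt(inner(phi, phi).real)

def tensor(phi, psi):
    """Kronecker product of two row vectors."""
    return [a * b for a in phi for b in psi]

def angle(phi, psi):
    """ang(phi, psi) = arccos(<phi, psi>_R / (||phi|| ||psi||))."""
    c = inner(phi, psi).real / (norm(phi) * norm(psi))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding noise

# Illustrative vectors (arbitrary choices).
phi, psi = [1 + 1j, 2], [0.5j, 1 - 1j]
xi, zeta = [1, 1j], [2, -1]

# <phi (x) psi, xi (x) zeta> = <phi, xi> <psi, zeta>
lhs = inner(tensor(phi, psi), tensor(xi, zeta))
rhs = inner(phi, xi) * inner(psi, zeta)
print(abs(lhs - rhs) < 1e-9)

# ||phi (x) psi|| = ||phi|| ||psi||
print(math.isclose(norm(tensor(phi, psi)), norm(phi) * norm(psi)))
```

Orthogonal vectors such as `[1, 0]` and `[0, 1]` have inner product 0, so `angle` returns π/2 for them.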
A matrix $M \in \mathbb{C}^{n\times n}$ is said to be unitary if $MM^\dagger = I = M^\dagger M$, where $I \in \mathbb{C}^{n\times n}$ is the identity matrix. Equivalently, $M$ is unitary if it preserves the norm, i.e., $\|\varphi M\| = \|\varphi\|$ for any $\varphi \in \mathbb{C}^n$. $M$ is said to be Hermitian (or self-adjoint) if $M = M^\dagger$. For a Hermitian matrix $O \in \mathbb{C}^{n\times n}$, let $c_1,\dots,c_s$ be its eigenvalues and $E_1,\dots,E_s$ the corresponding eigenspaces. It is well known that each eigenvalue $c_k$ is real, $E_i$ is orthogonal to $E_j$ for every $1 \le i \ne j \le s$, and $E_1 \dotplus \cdots \dotplus E_s = \mathbb{C}^n$. Thus, every vector $\varphi \in \mathbb{C}^n$ can be uniquely decomposed as $\varphi = \varphi_1 + \cdots + \varphi_s$, for unique $\varphi_j \in E_j$. The linear transformation $\varphi \mapsto \varphi_j$ is the projector¹ $P_j$ onto the subspace $E_j$. Actually, the Hermitian matrix $O$ is uniquely determined by its eigenvalues and projectors as $O = \sum_{i=1}^{s} c_iP_i$. Evolutions in quantum systems can be described by unitary matrices, while measurements can be described by Hermitian matrices.

We recall that $S \subseteq \mathbb{C}^n$ is a compact set if and only if every infinite sequence of elements in $S$ contains a convergent subsequence, whose limit lies in $S$. For a given vector $\varphi \in \mathbb{C}^n$ and a real positive value $r$, we define the set $B_r(\varphi) = \{\psi \in \mathbb{C}^n \mid \|\psi - \varphi\| \le r\}$ as the ball of radius $r$ centered in $\varphi$. The ball $B_r(\varphi)$ is an example of a compact set in $\mathbb{C}^n$.

2.2. Finite automata

We assume familiarity with basic notions of formal language theory (see, e.g., [14]). The set of all words (including the empty word $\varepsilon$) over a finite alphabet $\Sigma$ is denoted by $\Sigma^*$, and the set of words of length $n$ is denoted by $\Sigma^n$. A deterministic finite automaton (dfa) is a 5-tuple $D = \langle Q,\Sigma,\tau,q_1,F\rangle$, where $Q$ is the finite set of states, $\Sigma$ is the finite input alphabet, $q_1 \in Q$ is the initial state, $F \subseteq Q$ is the set of final (accepting) states, and $\tau : Q \times \Sigma \to Q$ is the transition function. An input word is accepted if the induced computation starting from the state $q_1$ ends in some final state $q \in F$ after consuming the whole input.
The set of all words accepted by $D$ is denoted by $L_D$ and called the accepted language. An alternative equivalent representation for the dfa $D$ is by the 3-tuple $D = \langle\alpha,\{M(\sigma)\}_{\sigma\in\Sigma},\beta\rangle$, where $\alpha \in \{0,1\}^{|Q|}$ is the characteristic row vector of the initial state, $M(\sigma) \in \{0,1\}^{|Q|\times|Q|}$ is the boolean matrix satisfying $(M(\sigma))_{ij} = 1$ if and only if $\tau(q_i,\sigma) = q_j$, and $\beta \in \{0,1\}^{|Q|\times 1}$ is the characteristic column vector of the final states. The accepted language can now be defined as

$$L_D = \{\, y \in \Sigma^* \mid \alpha M(y)\beta = 1 \,\},$$

where $y = y_1\cdots y_n \in \Sigma^*$ and $M(y) = \prod_{i=1}^{n} M(y_i)$.

Let us now introduce the model of quantum finite automata with control language [5,19].

Definition 1. Given an input alphabet $\Sigma$ and an endmarker symbol $\sharp \notin \Sigma$, a quantum finite automaton with control language (qfc) with $q$ quantum states is a system $A = \langle\phi,\{U(\sigma)\}_{\sigma\in\Gamma},O,L\rangle$, for $\Gamma = \Sigma \cup \{\sharp\}$, where

• $\phi \in \mathbb{C}^q$ is the initial amplitude vector satisfying $\|\phi\| = 1$,
• $U(\sigma) \in \mathbb{C}^{q\times q}$ is a unitary matrix, for any $\sigma \in \Gamma$,
• $O = \sum_{c\in C} cP(c)$ is a Hermitian matrix representing an observable, where the set $C$ of eigenvalues of $O$ is the set of all possible outcomes of measuring $O$, and $P(c)$ denotes the projector onto the eigenspace corresponding to $c \in C$,
• $L \subseteq C^*$ is a regular language, called the control language.

An input for the qfc $A$ is any word from $\Sigma^*$ ended by the symbol $\sharp$. The behavior of $A$ on $x_1\cdots x_n\sharp \in \Sigma^*\sharp$ is as follows. At any time, the state of $A$ is a vector $\xi \in \mathbb{C}^q$ with $\|\xi\| = 1$. The computation starts in the state $\phi$, then transformations associated with the symbols in $x_1\cdots x_n\sharp$ are applied in succession. Precisely, the transformation corresponding to a symbol $\sigma \in \Gamma$ consists of two steps:

(i) Evolution: the unitary operator $U(\sigma)$ is applied to the current state vector $\xi$ of $A$, leading to the new state $\xi'$.
(ii) Measuring: the observable $O$ is measured on $\xi'$. According to quantum mechanics principles, the outcome of measurement is $c$ with probability $\|\xi'P(c)\|^2$ and the state vector of $A$ “collapses” to $\xi'P(c)/\|\xi'P(c)\|$.

So, the computation on $x_1\cdots x_n\sharp$ yields a given sequence $y_1\cdots y_ny_\sharp \in C^*$ of outcomes of measurements of $O$ with probability $p_A(y_1\cdots y_ny_\sharp; x_1\cdots x_n\sharp)$ defined as

$$p_A(y_1\cdots y_ny_\sharp; x_1\cdots x_n\sharp) = \Bigl\|\phi\Bigl(\prod_{i=1}^{n} U(x_i)P(y_i)\Bigr)U(\sharp)P(y_\sharp)\Bigr\|^2.$$
¹ A matrix $P \in \mathbb{C}^{n\times n}$ is a projector if and only if $P$ is Hermitian and idempotent (i.e., $P^2 = P$).
A computation yielding the word $y_1\cdots y_ny_\sharp$ of measure outcomes is said to be accepting whenever $y_1\cdots y_ny_\sharp \in L$, otherwise it is rejecting. Hence, the probability that the qfc $A$ exhibits an accepting computation on input $x_1\cdots x_n$ is

$$E_A(x_1\cdots x_n) = \sum_{y_1\cdots y_ny_\sharp \in L} p_A(y_1\cdots y_ny_\sharp; x_1\cdots x_n\sharp).$$

The function $E_A : \Sigma^* \to [0,1]$ is the stochastic event induced by $A$. The language accepted by $A$ with cut point $\lambda \in [0,1]$ is the set of words

$$L_{A,\lambda} = \{\, x \in \Sigma^* \mid E_A(x) > \lambda \,\}.$$

The cut point $\lambda$ is said to be isolated whenever there exists $\delta \in (0,\tfrac{1}{2}]$ such that $|E_A(x) - \lambda| \ge \delta$, for any $x \in \Sigma^*$. From [5,19], we know that the class of languages accepted with isolated cut point by qfcs coincides with the class of regular languages. The following property on the evolution of qfcs is shown in [5], and it will turn out to be useful in the sequel:

Lemma 2. Let $A = \langle\phi,\{U(\sigma)\}_{\sigma\in\Gamma},O = \sum_{c\in C} cP(c),L\rangle$ be a qfc with $q$ quantum states. Then, for any complex vector $\varphi \in \mathbb{C}^q$ and any word $\sigma_1\cdots\sigma_n \in \Gamma^*$, we have

$$\sum_{y_1\cdots y_n \in C^n} \Bigl\|\varphi\prod_{j=1}^{n} U(\sigma_j)P(y_j)\Bigr\|^2 = \|\varphi\|^2.$$
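The definitions above can be exercised on a toy example. The sketch below is a minimal illustration, not the authors' construction: the 2-state qfc, the rotation angle, the choice of "#" as a stand-in for the endmarker, and the control language (outcome words ending in 1) are all assumptions made for the example. It computes the outcome probabilities, the stochastic event, and the sum appearing in Lemma 2.

```python
import math
from itertools import product

def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[i] * M[i][j] for i in range(len(v))) for j in range(len(M[0]))]

def sq_norm(v):
    """Squared Euclidean norm."""
    return sum(abs(x) ** 2 for x in v)

# Toy qfc with q = 2 quantum states (all concrete values are assumptions).
t = math.pi / 5
U = {"a": [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]],
     "#": [[1, 0], [0, 1]]}                      # "#" stands in for the endmarker
P = {0: [[1, 0], [0, 0]], 1: [[0, 0], [0, 1]]}   # projectors of the observable
phi = [1, 0]                                     # initial amplitude vector

def in_control_language(y):
    """Hypothetical control language: outcome words whose last symbol is 1."""
    return len(y) > 0 and y[-1] == 1

def outcome_prob(y, word):
    """Probability of the outcome word y: || phi * prod_i U(word_i) P(y_i) ||^2."""
    v = phi
    for sym, out in zip(word, y):
        v = vec_mat(vec_mat(v, U[sym]), P[out])
    return sq_norm(v)

def accept_prob(x):
    """Stochastic event: total probability of outcome words in L for input x + endmarker."""
    word = x + "#"
    return sum(outcome_prob(y, word)
               for y in product(P, repeat=len(word)) if in_control_language(y))

def total_prob(word):
    """Sum over all outcome words, as in Lemma 2 (here with the unit vector phi)."""
    return sum(outcome_prob(y, word) for y in product(P, repeat=len(word)))

print(accept_prob("a"), total_prob("aa#"))
```

For this toy automaton, the accepting probability of the input `a` reduces to sin²(π/5), and `total_prob` stays at 1 up to rounding for every word, as Lemma 2 predicts for a unit vector.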
From an architectural point of view, a qfc may be seen as a hybrid system incorporating both a quantum and a classical component. The former, namely a quantum finite automaton, provides the system quantum evolution together with a sequence of measurement outcomes obtained by observing the system after each input symbol processing. The latter, namely a deterministic finite automaton, processes the observation outcome sequence, and checks whether it leads to acceptance, i.e., whether the observation outcome sequence is a member of the control language $L$. If this is the case, the qfc accepts the input word, otherwise it rejects. In this sense, the language $L$ (and hence the corresponding deterministic finite automaton) “controls” the accept/reject behavior of the whole system.

From this picture, we get that when referring to the size of a qfc $A$, we must account for both the quantum and the classical component. Hence, in what follows, we say that $A$ has $q$ quantum states and $k$ classical states whenever $A$ is a qfc with $q$ quantum states and the control language $L$ is recognized by a $k$-state deterministic finite automaton. Throughout the paper, we say that two automata are equivalent whenever they accept the same language.

3. Converting QFCs to DFAs

We start by defining a matrix representation for qfcs. From this representation, for any given qfc, we construct an equivalent dfa by suitably generalizing Rabin’s technique. Finally, we analyze the state complexity of the resulting dfa with respect to the size of the original qfc.

3.1. Linear representation of qfcs

A convenient way to work with qfcs is by using their linear representation [5]. Let $A = \langle\phi,\{U(\sigma)\}_{\sigma\in\Gamma},O = \sum_{c\in C} cP(c),L\rangle$ be a qfc with $\delta$-isolated cut point $\lambda$, and let $D = \langle\alpha,\{M(c)\}_{c\in C},\beta\rangle$ be the minimal dfa recognizing $L$. Denote by $q$ and $k$ the number of quantum and classical states of $A$, respectively. We define the linear representation of $A$ to be the 3-tuple $\mathrm{Li}(A) = \langle\varphi_0,\{V(\sigma)\}_{\sigma\in\Gamma},\eta\rangle$ where:
• $\varphi_0 = (\phi \otimes \phi^* \otimes \alpha)$ is a row vector in $\mathbb{C}^{q^2k}$,
• $V(\sigma) = (U(\sigma) \otimes U^*(\sigma) \otimes I)\cdot\sum_{c\in C} P(c) \otimes P^*(c) \otimes M(c)$ is a matrix in $\mathbb{C}^{q^2k\times q^2k}$, for any $\sigma \in \Gamma$, with $I$ being the $k \times k$ identity matrix,
• $\eta = \sum_{j=1}^{q} e_j \otimes e_j \otimes \beta$ is a column vector in $\{0,1\}^{q^2k\times 1}$, with $e_j \in \{0,1\}^{q\times 1}$ being the column vector with 1 in its $j$th entry and 0 elsewhere.

A relevant property of $\mathrm{Li}(A)$ is that it allows us to represent $E_A : \Sigma^* \to [0,1]$, the stochastic event induced by $A$ (see Section 2.2), as

$$E_A(x) = \varphi_0V(\sigma_1)\cdots V(\sigma_n)\eta, \qquad \text{for } x = \sigma_1\cdots\sigma_n \in \Sigma^*.$$
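In fact

$$\begin{aligned}
\varphi_0V(\sigma_1)\cdots V(\sigma_n)\eta
&= \sum_{j=1}^{q}\ \sum_{y=y_1\cdots y_n\in C^n} \Bigl(\phi\prod_{i=1}^{n}U(\sigma_i)P(y_i)\Bigr)_j \cdot \Bigl(\phi^*\prod_{i=1}^{n}U^*(\sigma_i)P^*(y_i)\Bigr)_j \cdot \alpha M(y)\beta\\
&= \sum_{y=y_1\cdots y_n\in C^n} \alpha M(y)\beta \sum_{j=1}^{q} \Bigl|\Bigl(\phi\prod_{i=1}^{n}U(\sigma_i)P(y_i)\Bigr)_j\Bigr|^2\\
&= \sum_{y_1\cdots y_n\in L} \Bigl\|\phi\prod_{i=1}^{n}U(\sigma_i)P(y_i)\Bigr\|^2 = E_A(x).
\end{aligned}$$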
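The identity just derived can be checked numerically. The following sketch builds the linear representation of a toy qfc with plain nested lists and compares $\varphi_0V(\sigma_1)\cdots V(\sigma_n)\eta$ with the sum of squared norms over outcome words in the control language, for the full processed word (input followed by the endmarker). The automaton, the 2-state dfa for "outcome words ending in 1", the symbol "#", and all helper names are illustrative assumptions.

```python
import math
from itertools import product

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def conj(A):
    return [[complex(x).conjugate() for x in row] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# Toy qfc with q = 2 quantum states (illustrative values).
t = math.pi / 5
U = {"a": [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]],
     "#": [[1, 0], [0, 1]]}                      # "#" stands in for the endmarker
P = {0: [[1, 0], [0, 0]], 1: [[0, 0], [0, 1]]}
phi = [[1, 0]]                                   # 1 x q row vector

# Dfa for the assumed control language "outcome words ending in 1" (k = 2).
alpha = [[1, 0]]                                 # initial state: state 0
M = {0: [[1, 0], [1, 0]], 1: [[0, 1], [0, 1]]}   # M(c)[i][j] = 1 iff tau(i, c) = j
beta = [[0], [1]]                                # state 1 is accepting
I_k = [[1, 0], [0, 1]]

def V(sym):
    """V(sym) = (U (x) U* (x) I) * sum_c P(c) (x) P*(c) (x) M(c)."""
    left = kron(kron(U[sym], conj(U[sym])), I_k)
    right = None
    for c in P:
        term = kron(kron(P[c], conj(P[c])), M[c])
        right = term if right is None else add(right, term)
    return matmul(left, right)

phi0 = kron(kron(phi, conj(phi)), alpha)
eta = None
for ej in ([[1], [0]], [[0], [1]]):              # column basis vectors e_1, e_2
    term = kron(kron(ej, ej), beta)
    eta = term if eta is None else add(eta, term)

def event_linear(word):
    """phi0 V(w_1) ... V(w_n) eta."""
    v = phi0
    for sym in word:
        v = matmul(v, V(sym))
    return matmul(v, eta)[0][0].real

def event_direct(word):
    """Sum of || phi prod_i U(w_i) P(y_i) ||^2 over outcome words y in L."""
    total = 0.0
    for y in product(P, repeat=len(word)):
        if not y or y[-1] != 1:
            continue                              # y is not in the control language
        v = phi
        for sym, out in zip(word, y):
            v = matmul(matmul(v, U[sym]), P[out])
        total += sum(abs(x) ** 2 for x in v[0])
    return total

print(event_linear("aa#"), event_direct("aa#"))
```

Both functions should agree up to rounding on every word; the linear form replaces the exponential enumeration of outcome words by a product of $q^2k \times q^2k$ matrices.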
Moreover, we will use the fact that

$$\|\varphi_0V(\sigma_1)\cdots V(\sigma_n)\| \le 1, \qquad \text{for } \sigma_1\cdots\sigma_n \in \Gamma^*.$$

This can be proved as follows (recall that $\alpha M(y)$ represents the state vector of the dfa $D$ after processing $y$, and hence $\|\alpha M(y)\| = 1$):

$$\begin{aligned}
\|\varphi_0V(\sigma_1)\cdots V(\sigma_n)\|
&= \Bigl\|\sum_{y=y_1\cdots y_n\in C^n} \phi\prod_{j=1}^{n}U(\sigma_j)P(y_j) \otimes \phi^*\prod_{j=1}^{n}U^*(\sigma_j)P^*(y_j) \otimes \alpha M(y)\Bigr\|\\
&\le \sum_{y=y_1\cdots y_n\in C^n} \Bigl\|\phi\prod_{j=1}^{n}U(\sigma_j)P(y_j) \otimes \phi^*\prod_{j=1}^{n}U^*(\sigma_j)P^*(y_j) \otimes \alpha M(y)\Bigr\|\\
&= \sum_{y=y_1\cdots y_n\in C^n} \Bigl\|\phi\prod_{j=1}^{n}U(\sigma_j)P(y_j)\Bigr\|^2\,\|\alpha M(y)\| = \|\phi\|^2 = 1 \quad \text{(by Lemma 2)}.
\end{aligned}$$

Therefore, all the state vectors of $\mathrm{Li}(A)$ lie within the unitary ball $B_1(0) \subset \mathbb{C}^{q^2k}$ centered in the zero-vector $0 \in \mathbb{C}^{q^2k}$.

We are now going to show a crucial result saying, roughly speaking, that any input word $\omega$ induces an evolution in $\mathrm{Li}(A)$ which increases the distance between two different starting vectors only by a constant factor which does not depend on the length of $\omega$. To this aim, we first need two technical lemmas stating properties of vectors lying within unitary balls. For the sake of conciseness and when no confusion arises, we will simply write $B_1$ to denote a unitary ball centered in the zero-vector 0, regardless of the dimension of the space within which such a ball is embedded. In the following proofs, we will make heavy use of the properties of vectors and matrices listed in Section 2.1.
Lemma 3. For any $v, v' \in B_1$ satisfying $\|v'\| \ge \|v\|$ and $\cos(\mathrm{ang}(v',v)) \ge 0$, let $r = \|v'\|/\|v\|$. Then,

$$\|v' - v\|\,r \ge \|v' - rv\|.$$

Proof. By squaring both sides of the inequality, we have

$$r^2\|v'\|^2 + r^2\|v\|^2 - 2r^2\langle v',v\rangle_R \ge \|v'\|^2 + r^2\|v\|^2 - 2r\langle v',v\rangle_R,$$

holding true if and only if $(r^2-1)\|v'\|^2 - 2(r^2-r)\langle v',v\rangle_R \ge 0$. By hypothesis, we have $r \ge 1$, $\|v\| = \|v'\|/r$, and $0 \le \cos(\mathrm{ang}(v',v)) \le 1$. Therefore, we get

$$\begin{aligned}
(r^2-1)\|v'\|^2 - 2(r^2-r)\langle v',v\rangle_R
&= (r^2-1)\|v'\|^2 - 2(r^2-r)\|v'\|\,\|v\|\cos(\mathrm{ang}(v',v))\\
&\ge (r^2-1)\|v'\|^2 - 2(r-1)\|v'\|^2 = (r-1)^2\|v'\|^2.
\end{aligned}$$

Clearly, $(r-1)^2\|v'\|^2 \ge 0$, and this completes the proof. □

Lemma 4. For any $v, v' \in B_1$ satisfying $\|v'\| \ge \|v\|$ and $\cos(\mathrm{ang}(v',v)) \ge 0$, we have

$$\cos(\mathrm{ang}(v'-v,v)) \ge -\frac{1}{\sqrt{2}}.$$
Fig. 1. The 2-dimensional section of $B_1$ containing the origin, and the vectors $v$ and $v'$. The vector $v'$ can only lie within the white part: the gray part is forbidden by the constraint $\|v'\| \ge \|v\|$, while the part with the vertical lines is forbidden by the constraint $\cos(\mathrm{ang}(v',v)) \ge 0$. Clearly, the widest angle $\theta$ between $v'-v$ and $v$ is attained when $\|v'\| = \|v\|$, and the angle between $v'$ and $v$ is $\frac{\pi}{2}$. In this case, we have $\theta = \frac{3}{4}\pi$, thus yielding $\cos(\theta) = -\frac{1}{\sqrt{2}}$.
Note. To help intuition, the reader may visualize the property stated in this lemma by focusing on real vectors, and considering the 2-dimensional section of $B_1$ containing the origin, and the vectors $v$ and $v'$. This situation is shown in Fig. 1.

Proof. Clearly, the statement holds for $\|v\| = 0$. So, let us assume $\|v\| > 0$. We first show that the lowest possible value of $\cos(\mathrm{ang}(v'-v,v))$ is attained whenever $\|v'\| = \|v\|$. In other terms, by letting $r = \|v'\|/\|v\| \ge 1$, $\theta = \mathrm{ang}(v'-v,v)$, and $\theta' = \mathrm{ang}(v'-rv,rv)$, we show that $\cos(\theta) \ge \cos(\theta')$. By definition of complex angle (see Section 2.1), this inequality can be rewritten as

$$\cos(\theta) = \frac{\langle v'-v,v\rangle_R}{\|v'-v\|\,\|v\|} \ge \frac{\langle v'-rv,rv\rangle_R}{\|v'-rv\|\,\|rv\|} = \cos(\theta'),$$

holding true if and only if the following inequality holds:

$$\frac{\|v\|\,\|v'\| - r\langle v',v\rangle_R}{\|v'-v\|\,r} \le \frac{\|v\|\,\|v'\| - \langle v',v\rangle_R}{\|v'-rv\|}. \qquad (1)$$

Now, notice that: (i) $\|v\|\,\|v'\| - \langle v',v\rangle_R \ge \|v\|\,\|v'\| - r\langle v',v\rangle_R$ since $r \ge 1$, and (ii) $\|v'-v\|\,r \ge \|v'-rv\|$ from Lemma 3. These two properties, together with the fact that $\|v\|\,\|v'\| - \langle v',v\rangle_R \ge 0$, suffice to prove inequality (1), and thus the claim $\cos(\theta) \ge \cos(\theta')$.

So, we assume $\|v'\| = \|v\|$ and, for the sake of readability, we let $\hat\theta = \mathrm{ang}(v',v)$. We have

$$\cos(\theta) = \frac{\langle v'-v,v\rangle_R}{\|v'-v\|\,\|v\|} = \frac{\langle v',v\rangle_R - \|v\|^2}{\|v'-v\|\,\|v\|} = \frac{\|v\|(\cos(\hat\theta)-1)}{\|v'-v\|}. \qquad (2)$$

Moreover, we get

$$\|v'-v\| = \sqrt{\|v'-v\|^2} = \sqrt{\|v'\|^2 + \|v\|^2 - 2\|v'\|\,\|v\|\cos(\hat\theta)} = \sqrt{2\|v\|^2(1-\cos(\hat\theta))} = \|v\|\sqrt{2(1-\cos(\hat\theta))}. \qquad (3)$$

Now, by recalling that $0 \le \cos(\hat\theta) \le 1$ and considering Eqs. (2) and (3), we obtain

$$\cos(\theta) = \frac{\|v\|(\cos(\hat\theta)-1)}{\|v\|\sqrt{2(1-\cos(\hat\theta))}} \ge -\frac{1}{\sqrt{2}}.$$

□
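Following the Note to Lemma 4, the bound can be probed numerically on real vectors. The sketch below is an empirical illustration only (not a proof, and not taken from the paper): it samples pairs satisfying the hypotheses of the lemma by rejection and records the smallest cosine of the angle between $v'-v$ and $v$. The function names, the dimension, and the number of trials are arbitrary choices.

```python
import math
import random

def inner_real(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(inner_real(u, u))

def cos_angle(u, v):
    return inner_real(u, v) / (norm(u) * norm(v))

def sample_pair(rng, dim=3):
    """Draw real v, w in the unit ball with 0 < ||v|| <= ||w|| and <w, v> >= 0."""
    while True:
        v = [rng.uniform(-1, 1) for _ in range(dim)]
        w = [rng.uniform(-1, 1) for _ in range(dim)]
        if not (0 < norm(v) <= norm(w) <= 1):
            continue
        if inner_real(w, v) < 0:
            continue
        if norm([a - b for a, b in zip(w, v)]) < 1e-9:
            continue                      # angle undefined when w == v
        return v, w

def worst_cosine(trials=2000, seed=0):
    """Smallest observed cos(ang(w - v, v)) over random admissible pairs."""
    rng = random.Random(seed)
    worst = 1.0
    for _ in range(trials):
        v, w = sample_pair(rng)
        diff = [a - b for a, b in zip(w, v)]
        worst = min(worst, cos_angle(diff, v))
    return worst

# Extremal configuration of Fig. 1: equal norms, orthogonal vectors.
print(cos_angle([-1.0, 1.0], [1.0, 0.0]), -1 / math.sqrt(2))
```

The extremal pair `v = (1, 0)`, `v' = (0, 1)` attains the bound exactly, while random admissible pairs stay above it.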
We are now ready to prove the claimed crucial result on the distance between trajectories in Li(A):
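Lemma 5. For any state vectors $\varphi = v \otimes v^* \otimes a$ and $\varphi' = v' \otimes v'^* \otimes a'$ of $\mathrm{Li}(A) = \langle\varphi_0,\{V(\sigma)\}_{\sigma\in\Gamma},\eta\rangle$, and any $\omega = \sigma_1\cdots\sigma_n \in \Gamma^*$, we have

$$\|\varphi'V(\omega) - \varphi V(\omega)\| \le 4\|\varphi' - \varphi\|, \qquad (4)$$

where $V(\omega) = \prod_{i=1}^{n} V(\sigma_i)$.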
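Proof. We consider the case in which $a = a'$, and quickly address the opposite case at the end of the proof. Recall that $a$ and $a'$ are row vectors in $\{0,1\}^k$ representing the state of the dfa $D$ accepting the control language of $A$, and hence exhibiting 1 at a certain entry and 0 elsewhere. Without loss of generality, we assume $\|v'\| \ge \|v\|$. Moreover, we assume $\cos(\mathrm{ang}(v',v)) \ge 0$. Otherwise, we can consider the vector $-v'$ instead of $v'$, for which $\cos(\mathrm{ang}(-v',v)) \ge 0$ holds true, and the proof works unchanged since we have $(-v') \otimes (-v')^* \otimes a = v' \otimes v'^* \otimes a = \varphi'$. By letting $\Delta = v' - v$, we have $\varphi' - \varphi = v \otimes \Delta^* \otimes a + \Delta \otimes v^* \otimes a + \Delta \otimes \Delta^* \otimes a$. So, we can rewrite the left side of inequality (4) as

$$\begin{aligned}
\|(\varphi'-\varphi)V(\omega)\|
&= \|(v \otimes \Delta^* \otimes a)V(\omega) + (\Delta \otimes v^* \otimes a)V(\omega) + (\Delta \otimes \Delta^* \otimes a)V(\omega)\|\\
&\le \|(v \otimes \Delta^* \otimes a)V(\omega)\| + \|(\Delta \otimes v^* \otimes a)V(\omega)\| + \|(\Delta \otimes \Delta^* \otimes a)V(\omega)\|. \qquad (5)
\end{aligned}$$

To simplify inequality (5), we analyze the generic form $\|(v_1 \otimes v_2^* \otimes a)V(\omega)\|$, which can be written as

$$\Bigl\|\sum_{y=y_1\cdots y_n\in C^n} v_1\prod_{j=1}^{n}U(\sigma_j)P(y_j) \otimes v_2^*\prod_{j=1}^{n}U^*(\sigma_j)P^*(y_j) \otimes aM(y)\Bigr\|.$$

Notice that $aM(y)$ is a state vector of the dfa $D$ for the control language of $A$. Hence, as previously recalled, we have $\|aM(y)\| = 1$, and we can write

$$\|(v_1 \otimes v_2^* \otimes a)V(\omega)\| \le \sum_{y_1\cdots y_n\in C^n} \Bigl\|v_1\prod_{j=1}^{n}U(\sigma_j)P(y_j)\Bigr\| \cdot \Bigl\|v_2^*\prod_{j=1}^{n}U^*(\sigma_j)P^*(y_j)\Bigr\|.$$

The right side of this inequality can be regarded as the inner product between two vectors $\hat v_1, \hat v_2$ of dimension $|C|^n$, with the $y$th entry of $\hat v_1$ (resp., $\hat v_2$) being $\|v_1\prod_{j=1}^{n}U(\sigma_j)P(y_j)\|$ (resp., $\|v_2^*\prod_{j=1}^{n}U^*(\sigma_j)P^*(y_j)\|$). By the Cauchy–Schwarz inequality, we have $|\langle\hat v_1,\hat v_2\rangle| \le \|\hat v_1\|\,\|\hat v_2\|$. So, we can write

$$\begin{aligned}
\|(v_1 \otimes v_2^* \otimes a)V(\omega)\|
&\le \sqrt{\sum_{y_1\cdots y_n\in C^n} \Bigl\|v_1\prod_{j=1}^{n}U(\sigma_j)P(y_j)\Bigr\|^2} \cdot \sqrt{\sum_{y_1\cdots y_n\in C^n} \Bigl\|v_2^*\prod_{j=1}^{n}U^*(\sigma_j)P^*(y_j)\Bigr\|^2}\\
&= \sqrt{\|v_1\|^2\|v_2\|^2} = \|v_1\|\,\|v_2\| \quad \text{(by Lemma 2)}. \qquad (6)
\end{aligned}$$

By replacing $v_1$ and $v_2$ with the vectors involved in inequality (5), we obtain

$$\|\varphi'V(\omega) - \varphi V(\omega)\| \le 2\|v\|\,\|\Delta\| + \|\Delta\|^2, \qquad (7)$$

thus bounding the left side of inequality (4). Let us now focus on the right side of inequality (4), and observe that

$$\begin{aligned}
\|\varphi'-\varphi\|^2
&= \|v \otimes \Delta^* + \Delta \otimes v^* + \Delta \otimes \Delta^*\|^2 \quad (\text{since } \|a\| = 1)\\
&= \|v\|^2\|\Delta\|^2 + \|\Delta\|^2\|v\|^2 + \|\Delta\|^2\|\Delta\|^2 + 2\bigl(\langle v,\Delta\rangle\langle\Delta^*,v^*\rangle\bigr)_R + 2\bigl(\langle v,\Delta\rangle\langle\Delta^*,\Delta^*\rangle\bigr)_R + 2\bigl(\langle\Delta,\Delta\rangle\langle v^*,\Delta^*\rangle\bigr)_R\\
&= \|v\|^2\|\Delta\|^2 + \|\Delta\|^2\|v\|^2 + \|\Delta\|^2\|\Delta\|^2 + 2|\langle v,\Delta\rangle|^2 + 2\bigl(\langle v,\Delta\rangle\|\Delta\|^2\bigr)_R + 2\bigl(\|\Delta\|^2\langle v^*,\Delta^*\rangle\bigr)_R\\
&\ge 2\|v\|^2\|\Delta\|^2 + \|\Delta\|^4 + 2\bigl(\langle v,\Delta\rangle_R\bigr)^2 + 4\|\Delta\|^2\langle v,\Delta\rangle_R.
\end{aligned}$$

Let $\theta = \mathrm{ang}(v,\Delta)$. We have

$$\|\varphi'-\varphi\|^2 \ge 2\|v\|^2\|\Delta\|^2 + \|\Delta\|^4 + 2\|v\|^2\|\Delta\|^2\cos(\theta)^2 + 4\|v\|\,\|\Delta\|^3\cos(\theta). \qquad (8)$$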
Fig. 2. The transition $\tau$ of the dfa $D_A$ on a symbol $\sigma \in \Sigma$. The big dots represent state vectors of $\mathrm{Li}(A)$, while the ellipses indicate equivalence classes of $\sim$. The smaller dots between $\varphi_{\hat\omega_j\sigma}$ and $\mathrm{rep}[\varphi_{\hat\omega_j\sigma}]_\sim$ represent the state vectors at distance smaller than $\frac{\delta}{2\sqrt{qk}}$, witnessing the relation $\sim$ between them. The dashed arrow indicates the original evolution on $\mathrm{Li}(A)$, while the full arrow represents the behavior of $D_A$.
Summing up, by considering inequalities (7) and (8), in order to prove the desired inequality (4) it then suffices to show that

$$\bigl(2\|v\|\,\|\Delta\| + \|\Delta\|^2\bigr)^2 \le 16\bigl(\|\Delta\|^4 + 4\|v\|\,\|\Delta\|^3\cos(\theta) + 2\|v\|^2\|\Delta\|^2(1+\cos(\theta)^2)\bigr).$$

We can divide both sides by $\|\Delta\|^2$, since for $\|\Delta\| = 0$ this inequality is trivially verified. Then, by solving with respect to $\|\Delta\|$, we get that the inequality is always true if

$$4\|v\|^2\bigl(16\cos(\theta)-1\bigr)^2 - 60\|v\|^2\bigl(8\cos(\theta)^2+7\bigr) \le 0$$

holds. If $\|v\| = 0$, this is clearly verified. Otherwise, dividing by $\|v\|^2$ and routine manipulation lead us to study the equivalent inequality

$$17\cos(\theta)^2 - 4\cos(\theta) - 13 \le 0. \qquad (9)$$

Recall that, at the beginning of the proof, we assumed that $\|v'\| \ge \|v\|$ and that $\cos(\mathrm{ang}(v',v)) \ge 0$. So, by Lemma 4, we get $-\frac{1}{\sqrt{2}} \le \cos(\theta) \le 1$. Within this interval, the left side of inequality (9) is never positive, whence the result follows.

We are now left to address the case $a \ne a'$. As recalled, $a$ and $a'$ are state vectors of the dfa $D$ for the control language of $A$. This obviously gives that $a \ne a'$ implies $\langle a,a'\rangle = 0$. As a consequence, one may easily obtain $\|\varphi'-\varphi\|^2 = \|v'\|^4 + \|v\|^4$. Moreover, by using the same reasoning that led to inequality (6), we get $\|(\varphi'-\varphi)V(\omega)\| \le \|v'\|^2 + \|v\|^2$. Therefore, we have

$$\|(\varphi'-\varphi)V(\omega)\|^2 \le \|v'\|^4 + \|v\|^4 + 2\|v'\|^2\|v\|^2 \le 3\bigl(\|v'\|^4 + \|v\|^4\bigr) = 3\|\varphi'-\varphi\|^2,$$

and the claimed result again follows. □
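The "routine manipulation" leading to inequality (9) in the proof of Lemma 5 can be spelled out as follows. This is a worked derivation added for the reader's convenience; the shorthands $x = \|\Delta\| > 0$, $m = \|v\| > 0$, and $c = \cos(\theta)$ are introduced here and do not appear in the original text.

```latex
\begin{align*}
&(2m + x)^2 \le 16\bigl(x^2 + 4mxc + 2m^2(1 + c^2)\bigr)
  && \text{(after dividing by } x^2\text{)}\\
\iff\ & 15x^2 + 4(16c - 1)\,m\,x + 4(8c^2 + 7)\,m^2 \ge 0 .
\end{align*}
% A quadratic in x with positive leading coefficient is nonnegative for
% every x as soon as its discriminant is nonpositive:
\begin{align*}
& 16(16c - 1)^2 m^2 - 240(8c^2 + 7)\,m^2 \le 0\\
\iff\ & 4m^2(16c - 1)^2 - 60m^2(8c^2 + 7) \le 0\\
\iff\ & (16c - 1)^2 - 15(8c^2 + 7) = 136c^2 - 32c - 104 = 8\,(17c^2 - 4c - 13) \le 0 .
\end{align*}
% Since 17c^2 - 4c - 13 = (c - 1)(17c + 13), the inequality holds exactly
% for -13/17 \le c \le 1, and -13/17 \approx -0.765 < -1/\sqrt{2} \approx -0.707.
```

The interval $[-\frac{1}{\sqrt{2}}, 1]$ guaranteed by Lemma 4 is therefore contained in the interval on which the quadratic in $\cos(\theta)$ is nonpositive.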
3.2. Conversion to dfas

We are now ready to construct a dfa $D_A$ equivalent to the qfc $A$, by using the linear representation $\mathrm{Li}(A) = \langle\varphi_0,\{V(\sigma)\}_{\sigma\in\Gamma},\eta\rangle$, with $\Gamma = \Sigma \cup \{\sharp\}$. For any word $\omega = \sigma_1\cdots\sigma_n \in \Sigma^*$, let $\varphi_\omega = \varphi_0\prod_{i=1}^{n}V(\sigma_i) = \varphi_0V(\omega)$ be the state vector reached by $\mathrm{Li}(A)$ after reading $\omega$. As proved in Section 3.1, we have that $\varphi_\omega$ lies within the unitary ball $B_1(0) \subset \mathbb{C}^{q^2k}$. We define the relation $\sim$ on the set $\{\varphi_\omega \mid \omega \in \Sigma^*\} \subseteq B_1(0) \subset \mathbb{C}^{q^2k}$ as:

$$\varphi_\omega \sim \varphi_{\omega'} \iff \begin{array}{l}\text{there exists a sequence of words } \omega_1,\omega_2,\dots,\omega_n \in \Sigma^*\\ \text{satisfying } \omega_1 = \omega,\ \omega_n = \omega', \text{ and } \|\varphi_{\omega_i} - \varphi_{\omega_{i+1}}\| < \frac{\delta}{2\sqrt{qk}}.\end{array}$$

It is easy to verify that $\sim$ is an equivalence relation and that the distance between two vectors belonging to different equivalence classes is at least $\frac{\delta}{2\sqrt{qk}}$. This latter fact shows that $\sim$ is of finite index since otherwise, by taking one vector from each class, one could construct an infinite sequence of elements in $B_1(0) \subset \mathbb{C}^{q^2k}$ which cannot have any convergent subsequence, against the compactness of $B_1(0) \subset \mathbb{C}^{q^2k}$ (see Section 2.1).

Therefore, by letting $s$ be the index of $\sim$, we choose a representative for each equivalence class, and call these representatives $\varphi_{\hat\omega_1},\varphi_{\hat\omega_2},\dots,\varphi_{\hat\omega_s}$. In addition, for any word $\omega \in \Sigma^*$, we let $\mathrm{rep}[\varphi_\omega]_\sim$ denote the representative of the equivalence class the state vector $\varphi_\omega$ belongs to. We construct our dfa $D_A$ as follows:
• the set of states is the set of representatives $\{\varphi_{\hat\omega_1},\varphi_{\hat\omega_2},\dots,\varphi_{\hat\omega_s}\}$,
• the input alphabet is $\Sigma$,
• the initial state is the vector $\mathrm{rep}[\varphi_\varepsilon]_\sim$, which we assume to be $\varphi_{\hat\omega_1}$,
• the transition function is defined, for any $\sigma \in \Sigma$, as $\tau(\varphi_{\hat\omega_j},\sigma) = \mathrm{rep}[\varphi_{\hat\omega_j\sigma}]_\sim$ (a step of $\tau$ is intuitively shown in Fig. 2),
• the final states are the representatives $\{\varphi_{\hat\omega_j} \mid \varphi_{\hat\omega_j}V(\sharp)\eta \ge \lambda + \delta\}$ associated with words accepted in the original qfc $A$. Equivalently, one could say that $\varphi_{\hat\omega_j}$ is final if and only if its equivalence class contains $\varphi_\omega$ for some word $\omega$ accepted by $A$.

Fig. 3. The evolution scheme of the computation over the word $z\sharp$. The full arrows describe the transitions of the dfa $D_A$, while the snake arrows denote the evolution in $\mathrm{Li}(A)$ from each vector $\gamma_{j,t}$ in the equivalence class reached after $j$ symbols, through the dynamic $V$ over the remaining suffix $z_{\{-j\}}\sharp$, leading to the vector $\gamma_{j,t}V(z_{\{-j\}}\sharp)$ in the bottom chain. In this bottom chain, the leftmost point denotes the vector reached by $\mathrm{Li}(A)$ after reading $z\sharp$, while the rightmost point is the state reached by $D_A$ after reading $z$, plus a final transition of $\mathrm{Li}(A)$ on $\sharp$. Intuitively, the correctness of $D_A$ comes from the fact that all the vectors in the bottom chain are sufficiently close to their neighbors to represent either all accepting or all rejecting quantum states in the original qfc $A$.

Before showing the correctness of our construction, we stress the fact that the equivalence relation $\sim$ is not a congruence (in fact, $\varphi_\omega \sim \varphi_{\omega'}$ does not necessarily imply $\varphi_{\omega\sigma} \sim \varphi_{\omega'\sigma}$ for $\sigma \in \Sigma$, as the reader may easily verify). So, the fact that $D_A$ is equivalent to $A$ does not come straightforwardly as in Rabin’s setting, but we need an explicit proof:

Theorem 6. $D_A$ is equivalent to $A$.

Proof. We begin by introducing some notation:
• For a word $z = z_1z_2\cdots z_n \in \Sigma^*$, we let $z_{\{j\}} = z_1z_2\cdots z_j$ be the prefix of $z$ of length $j$, and $z_{\{-j\}} = z_{j+1}z_{j+2}\cdots z_n$ be the remaining suffix.
• We let $\rho_j = \tau(\varphi_{\hat\omega_1},z_{\{j\}})$ be the state reached by $D_A$ after reading the first $j$ symbols of $z$. So, $\rho_0 = \varphi_{\hat\omega_1}$ is the initial state of $D_A$.
• We let $\psi_j = \rho_{j-1}V(z_j)$ be the state vector reached by $j-1$ steps of $D_A$ followed by one step of $\mathrm{Li}(A)$. So, $\psi_0 = \varphi_0$ is the initial state of $\mathrm{Li}(A)$.

Notice that, for each $0 \le j \le n$, we have $\psi_j \sim \rho_j$ since $\rho_j = \mathrm{rep}[\psi_j]_\sim$. Moreover, by definition, the vectors witnessing $\psi_j \sim \rho_j$ are reachable in $\mathrm{Li}(A)$. Formally: there exists a sequence $\psi_j = \gamma_{j,1},\gamma_{j,2},\dots,\gamma_{j,\ell_j} = \rho_j$ satisfying $\|\gamma_{j,i} - \gamma_{j,i+1}\| < \frac{\delta}{2\sqrt{qk}}$, and there exist $x_{j,t} \in \Sigma^*$ such that $\varphi_0V(x_{j,t}) = \gamma_{j,t}$ for $1 \le t \le \ell_j$. As a consequence of Lemma 5, for every $0 \le j \le n$ and $1 \le t \le \ell_j$, we have

$$\|\gamma_{j,t}V(z_{\{-j\}}\sharp) - \gamma_{j,(t+1)}V(z_{\{-j\}}\sharp)\| < 4\cdot\frac{\delta}{2\sqrt{qk}} = \frac{2\delta}{\sqrt{qk}}. \qquad (10)$$

In addition, since $\rho_jV(z_{\{-j\}}\sharp) = \psi_{j+1}V(z_{\{-(j+1)\}}\sharp)$, for all $j$'s, inequality (10) implies that the vectors $\gamma_{j,t}V(z_{\{-j\}}\sharp)$ form a chain of vectors from the final state vector $\varphi_0V(z\sharp)$ of $\mathrm{Li}(A)$ to the vector $\rho_nV(\sharp)$, where the distance between each pair of consecutive vectors is strictly smaller than $\frac{2\delta}{\sqrt{qk}}$. This is intuitively shown in Fig. 3.

We first show that $z \in L_{A,\lambda} \Rightarrow \tau(\varphi_{\hat\omega_1},z) \in F$, which is equivalent to showing

$$\varphi_0V(z\sharp)\eta \ge \lambda + \delta \ \Rightarrow\ \rho_nV(\sharp)\eta \ge \lambda + \delta. \qquad (11)$$

Note that $\varphi_0 = \gamma_{0,1}$, $\rho_n = \gamma_{n,\ell_n}$, and that, for $0 \le j \le n$ and $1 \le t \le \ell_j$, all $\gamma_{j,t}$'s witnessing the relation $\sim$ are reachable in $\mathrm{Li}(A)$ through some word $x_{j,t} \in \Sigma^*$, i.e., $\gamma_{j,t}V(z_{\{-j\}}\sharp) = \varphi_0V(x_{j,t}\cdot z_{\{-j\}}\sharp)$. Since $\lambda$ is a $\delta$-isolated cut point, we have

$$\gamma_{j,t}V(z_{\{-j\}}\sharp)\eta \ \begin{cases} \ge \lambda + \delta & \text{if } x_{j,t}z_{\{-j\}} \in L_{A,\lambda},\\ \le \lambda - \delta & \text{if } x_{j,t}z_{\{-j\}} \notin L_{A,\lambda}. \end{cases}$$
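Assume, by contradiction, that implication (11) does not hold. Then, there exists a position in the bottom chain of Fig. 3 where the acceptance probability associated with a state vector in the chain is above the cut point, while the acceptance probability associated with its right neighbor is below the cut point. More formally, there must exist $\iota, \kappa$ such that

\[
\gamma_{\iota,\kappa} V(z_{\{-\iota\}})\eta \ge \lambda + \delta
\quad\text{and}\quad
\gamma_{\iota,(\kappa+1)} V(z_{\{-\iota\}})\eta \le \lambda - \delta.
\]

From these two inequalities and by observing that $\|\eta\| \le \sqrt{qk}$, we get

\[
\begin{aligned}
2\delta &\le \left| \left( \gamma_{\iota,\kappa} V(z_{\{-\iota\}}) - \gamma_{\iota,(\kappa+1)} V(z_{\{-\iota\}}) \right) \eta \right|
\le \left\| \gamma_{\iota,\kappa} V(z_{\{-\iota\}}) - \gamma_{\iota,(\kappa+1)} V(z_{\{-\iota\}}) \right\| \cdot \|\eta\| \\
&\le \left\| \gamma_{\iota,\kappa} V(z_{\{-\iota\}}) - \gamma_{\iota,(\kappa+1)} V(z_{\{-\iota\}}) \right\| \cdot \sqrt{qk}
< \frac{2\delta}{\sqrt{qk}} \cdot \sqrt{qk} = 2\delta \quad \text{(by inequality (10))},
\end{aligned}
\]

which is a contradiction.

Symmetrically, one can show that $z \notin L_{A,\lambda} \Rightarrow \tau(\varphi_{\hat{\omega}_1}, z) \notin F$, and this completes the proof. □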
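The contradiction above combines the estimate $|(v - w)\eta| \le \|v - w\| \cdot \|\eta\|$ with $\|\eta\| \le \sqrt{qk}$. The following Python sketch checks both facts numerically on toy data. The helper names and the sample vectors are assumptions made for illustration only and are not taken from the construction.

```python
import math

def norm(v):
    """Euclidean norm of a real or complex vector given as a list."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def acceptance_gap(v, w, eta):
    """Return |(v - w) . eta|, the gap between the two acceptance values."""
    return abs(sum((a - b) * e for a, b, e in zip(v, w, eta)))

def distance_allows_straddling(v, w, q, k, delta):
    """Necessary condition for one acceptance value to be >= lambda + delta
    and the other <= lambda - delta: ||v - w|| * sqrt(q*k) >= 2 * delta."""
    diff = [a - b for a, b in zip(v, w)]
    return norm(diff) * math.sqrt(q * k) >= 2 * delta

# Toy data (assumed): q = 2, k = 1, so the dimension is d = q*q*k = 4.
q, k, delta = 2, 1, 0.25
eta = [1, 0, 0, 1]                 # 0/1 vector with at most q*k ones
v = [0.50, 0.10, 0.00, 0.30]
w = [0.45, 0.10, 0.05, 0.25]

diff = [a - b for a, b in zip(v, w)]
gap = acceptance_gap(v, w, eta)    # left-hand side of the estimate
bound = norm(diff) * norm(eta)     # right-hand side of the estimate
```

Here the two vectors are closer than $2\delta/\sqrt{qk}$, so `distance_allows_straddling` returns `False`: their acceptance values cannot lie on opposite sides of the isolated cut point.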
3.3. Size cost of the conversion

We now analyze the cost, in terms of number of states, of the above conversion from qfcs to dfas. This will enable us to show that the gap between the succinctness of the quantum and the classical paradigm is at most exponential.

Theorem 7. For any given qfc A with q quantum states, k classical states, and δ-isolated cut point, there exists an equivalent dfa $D_A$ with s states satisfying

\[
s \le \left( 1 + \frac{4\sqrt{qk}}{\delta} \right)^{q^2 k}.
\]
Proof. Let Li(A) $= \langle \varphi_0, \{V(\sigma)\}_{\sigma \in \Gamma}, \eta \rangle$ be the linear representation of A. As observed in Section 3.1, its state vectors lie within the ball $B_1(0) \subset \mathbb{C}^d$, for $d = q^2 k$. When constructing the equivalent dfa $D_A$ as described in Section 3.2, the number s of states of $D_A$ coincides with the number of the equivalence classes of the relation $\sim$.

To estimate s, for $1 \le i \le s$, consider the ball $B_{\frac{\delta}{4\sqrt{qk}}}(\varphi_{\hat{\omega}_i}) \subset \mathbb{C}^d$ centered in the representative $\varphi_{\hat{\omega}_i}$ of the ith equivalence class of $\sim$. Clearly, such a ball is disjoint from the analogous ball centered in $\varphi_{\hat{\omega}_j}$, for every $1 \le i \ne j \le s$. Moreover, all such balls are contained in the ball $B_{1+\frac{\delta}{4\sqrt{qk}}}(0) \subset \mathbb{C}^d$, and their number is exactly the number s of the equivalence classes of $\sim$. Since the volume of a d-dimensional ball of radius r is $K r^d$, for a suitable constant K depending on d, there exist at most

\[
\frac{K \left( 1 + \delta / 4\sqrt{qk} \right)^d}{K \left( \delta / 4\sqrt{qk} \right)^d} = \left( 1 + \frac{4\sqrt{qk}}{\delta} \right)^{q^2 k}
\]

disjoint balls of radius $\frac{\delta}{4\sqrt{qk}}$ in $B_{1+\frac{\delta}{4\sqrt{qk}}}(0)$. So, this number is an upper bound for s, and the claim follows. □
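To get a feeling for the magnitude of the bound in Theorem 7, the following Python sketch evaluates it for given q, k, and δ. It assumes the bound has the form $(1 + 4\sqrt{qk}/\delta)^{q^2 k}$; the function names are hypothetical. Since the value grows very quickly, the sketch also returns its base-2 logarithm, which is the safer quantity to work with.

```python
import math

def log2_dfa_state_bound(q, k, delta):
    """Base-2 logarithm of (1 + 4*sqrt(q*k)/delta) ** (q*q*k)."""
    if q < 1 or k < 1 or delta <= 0:
        raise ValueError("need q >= 1, k >= 1 and delta > 0")
    return q * q * k * math.log2(1 + 4 * math.sqrt(q * k) / delta)

def dfa_state_bound(q, k, delta):
    """The bound itself; may raise OverflowError for large q or k."""
    if q < 1 or k < 1 or delta <= 0:
        raise ValueError("need q >= 1, k >= 1 and delta > 0")
    return (1 + 4 * math.sqrt(q * k) / delta) ** (q * q * k)

# Smallest case: one quantum state, one classical state, delta = 1/2.
smallest = dfa_state_bound(1, 1, 0.5)      # (1 + 8) ** 1
# The exponent q*q*k drives the growth, so compare logarithms.
growth = [log2_dfa_state_bound(q, 1, 0.5) for q in (1, 2, 4, 8)]
```

The list `growth` increases with q, reflecting the $q^2 k$ exponent in the bound.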
4. Size lower bound for quantum paradigms

By using the inequality of Theorem 7 "the other way around", we are able to state lower limits to the descriptional power of qfcs.

Theorem 8. Any qfc with q quantum states, k classical states, and δ-isolated cut point accepting a regular language whose minimal dfa has μ states, must satisfy

\[
qk \ge \left( \frac{\log(\mu)}{\log\left(\frac{5}{\delta}\right)} \right)^{\frac{4}{9}}.
\]

Proof. From any qfc, we can obtain an equivalent dfa with a number of states bounded as in Theorem 7. Thus, for $\delta \in (0, \frac{1}{2}]$ and $q, k \ge 1$, we have

\[
\mu \le \left( 1 + \frac{4\sqrt{qk}}{\delta} \right)^{q^2 k}
\le \left( \frac{5\sqrt{qk}}{\delta} \right)^{q^2 k}
\le \left( \frac{5}{\delta} \right)^{\sqrt[4]{qk} \cdot q^2 k}
\le \left( \frac{5}{\delta} \right)^{(qk)^{\frac{9}{4}}},
\]

whence the result follows. □
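As a sanity check of Theorem 8, the following Python sketch computes the lower bound on qk and numerically verifies, on a small grid, a chain of inequalities of the form $(1 + 4\sqrt{qk}/\delta)^{q^2k} \le (5\sqrt{qk}/\delta)^{q^2k} \le (5/\delta)^{\sqrt[4]{qk}\, q^2 k} \le (5/\delta)^{(qk)^{9/4}}$. The function names are hypothetical, the exact form of the chain is an assumption of this sketch, and logarithms are compared instead of the (huge) values themselves.

```python
import math

def qfc_size_lower_bound(mu, delta):
    """Lower bound on q*k: (log(mu) / log(5/delta)) ** (4/9)."""
    if mu < 1 or not (0 < delta <= 0.5):
        raise ValueError("need mu >= 1 and 0 < delta <= 1/2")
    return (math.log(mu) / math.log(5 / delta)) ** (4 / 9)

def chain_logs(q, k, delta):
    """Natural logs of the four successive upper bounds on mu, in order."""
    x = math.sqrt(q * k)
    n = q * q * k
    c = math.log(5 / delta)
    return [
        n * math.log(1 + 4 * x / delta),
        n * math.log(5 * x / delta),
        (q * k) ** 0.25 * n * c,
        (q * k) ** 2.25 * c,
    ]

def chain_holds(q, k, delta, rel=1e-9):
    """True if the four logarithms are non-decreasing (up to rounding)."""
    logs = chain_logs(q, k, delta)
    return all(a <= b * (1 + rel) for a, b in zip(logs, logs[1:]))

# Example: a minimal dfa with 10**6 states and delta = 1/2 gives the
# ratio log(10**6) / log(10) = 6, hence a bound of 6 ** (4/9) on q*k.
example = qfc_size_lower_bound(10 ** 6, 0.5)
```

For instance, `example` is roughly 2.22, so under these assumptions any such qfc needs at least three states overall in the product qk.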
The lower bound in Theorem 8 is not only interesting in the world of qfcs, but it turns out to have several applications in the world of quantum automata. In fact, qfcs represent a general unifying framework within which several types of quantum automata may directly and naturally be replicated. This implies, e.g., that the size lower bound in Theorem 8 may transfer to the simulated quantum automata models. So, we are now going to quickly recall some known simulation results, and then focus on new quantum paradigm simulations via qfcs.

We begin by displaying some simulation results in [5], where it is proved that:

Proposition 9.
• Any q-state measure-once quantum finite automaton (mo-qfa) can be simulated by a qfc with 2q quantum states and 1 classical state.
• Any q-state measure-many quantum finite automaton (mm-qfa) can be simulated by a qfc with q quantum states and 3 classical states.
• Any q-state quantum reversible automaton (qra) can be simulated by a qfc with q quantum states and 2 classical states.

Here, we show how qfcs can simulate another interesting model of quantum automaton, called Latvian quantum finite automaton [2], where the evolution and measurement steps both depend on the input symbol.
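Definition 10. Given an input alphabet Σ and an endmarker symbol ♯ ∉ Σ, a q-state Latvian quantum finite automaton (lqfa) is a system $\Omega = \langle \varphi_0, \{U(\sigma)\}_{\sigma \in \Gamma}, \{O_\sigma\}_{\sigma \in \Gamma}, F \rangle$, for Γ = Σ ∪ {♯}, where
• $\varphi_0 \in \mathbb{C}^q$ is the initial amplitude vector satisfying $\|\varphi_0\| = 1$,
• $U(\sigma) \in \mathbb{C}^{q \times q}$ is the unitary evolution, for any σ ∈ Γ,
• $O_\sigma = \sum_{i=0}^{k_\sigma - 1} c_i(\sigma) P_i(\sigma)$ is an observable on $\mathbb{C}^q$, for any σ ∈ Γ; the set of all possible outcomes of measuring $O_\sigma$ is denoted by $C(O_\sigma) = \{c_0(\sigma), \ldots, c_{k_\sigma - 1}(\sigma)\}$,
• $F \subseteq C(O_\sharp)$ is the set of final (accepting) outcomes.

The stochastic event induced by Ω is defined to be the function $p_\Omega : \Sigma^* \to [0, 1]$ where, for any word $\omega = \sigma_1 \cdots \sigma_n \in \Sigma^*$, we have that $p_\Omega(\omega)$ expresses the probability of Ω accepting ω as

\[
p_\Omega(\omega) = \sum_{i_1=0}^{k_{\sigma_1}-1} \cdots \sum_{i_n=0}^{k_{\sigma_n}-1} \; \sum_{\{0 \le f \le k_\sharp - 1 \,\mid\, c_f(\sharp) \in F\}} \left\| \varphi_0\, U(\sigma_1) P_{i_1}(\sigma_1) \cdots U(\sigma_n) P_{i_n}(\sigma_n)\, U(\sharp) P_f(\sharp) \right\|^2 .
\]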
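The acceptance probability of Definition 10 can be evaluated by brute force on small examples, summing over all outcome sequences. The following Python sketch does so with plain lists; the function names, the use of `"#"` as a stand-in for the endmarker, and the toy two-state automaton are assumptions made for illustration only.

```python
import math
from itertools import product

def vec_mat(v, M):
    """Row vector times matrix, both stored as plain Python lists."""
    return [sum(v[t] * M[t][j] for t in range(len(v)))
            for j in range(len(M[0]))]

def lqfa_accept_prob(phi0, U, P, accepting, word, end="#"):
    """Sum of squared norms over all outcome sequences i_1..i_n and all
    accepting endmarker outcomes f (given as indices into P[end])."""
    total = 0.0
    for outcomes in product(*(range(len(P[s])) for s in word)):
        v = phi0
        for s, i in zip(word, outcomes):
            v = vec_mat(vec_mat(v, U[s]), P[s][i])   # evolve, then project
        for f in accepting:
            w = vec_mat(vec_mat(v, U[end]), P[end][f])
            total += sum(abs(x) ** 2 for x in w)
    return total

# Toy lqfa (assumed): q = 2, Hadamard evolution on 'a', identity on '#',
# and the same two-outcome projective measurement for both symbols.
h = 1 / math.sqrt(2)
H = [[h, h], [h, -h]]
I2 = [[1, 0], [0, 1]]
P0, P1 = [[1, 0], [0, 0]], [[0, 0], [0, 1]]
U = {"a": H, "#": I2}
P = {"a": [P0, P1], "#": [P0, P1]}
phi0 = [1, 0]

p_empty = lqfa_accept_prob(phi0, U, P, [0], "")
p_a = lqfa_accept_prob(phi0, U, P, [0], "a")
p_aa = lqfa_accept_prob(phi0, U, P, [0], "aa")
```

The enumeration takes time exponential in the word length, so it is meant only as an executable reading of the formula, not as an efficient procedure.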
It is worth remarking that a particular case of a lqfa is given by a measure-only quantum finite automaton (mon-qfa) [8], where only measurement operations are allowed as computational steps on input symbols. In other terms, a mon-qfa is a lqfa where U(σ) = I, for any σ ∈ Γ. In [2], it is proved that lqfas with bounded error (i.e., with isolated cut point 1/2) recognize exactly those languages whose syntactic monoid is a block group [23]. On the other hand, as shown in [11], mon-qfas with isolated cut point recognize exactly the class of literally idempotent piecewise testable languages [16,23].

We are now ready to show how to simulate any given lqfa (and hence also a mon-qfa) by a qfc. In what follows, for any $n \in \mathbb{N}$, we denote by $[0]_n$ the zero-matrix in $\mathbb{C}^{n \times n}$, and by $0_n$ the zero-vector in $\mathbb{C}^n$.

Theorem 11. For any q-state lqfa (or mon-qfa) Ω there exists a qfc A with at most $2q^2$ quantum states and q classical states such that $E_A = p_\Omega$.

Proof. We show the conversion for lqfas, and quickly address mon-qfas at the end of the proof. Let the lqfa be

\[
\Omega = \left\langle \varphi, \{U(\sigma)\}_{\sigma \in \Gamma}, \left\{ O_\sigma = \sum_{i=0}^{k_\sigma - 1} c_i(\sigma) P_i(\sigma) \right\}_{\sigma \in \Gamma}, F \right\rangle .
\]

Without loss of generality and for the sake of simplicity, we can assume $F = \{c_0(\sharp)\}$. So, the event induced by Ω on input $\omega = \sigma_1 \cdots \sigma_n$ can be written as

\[
p_\Omega(\omega) = \sum_{i_1=0}^{k_{\sigma_1}-1} \cdots \sum_{i_n=0}^{k_{\sigma_n}-1} \left\| \varphi \left( \prod_{j=1}^{n} U(\sigma_j) P_{i_j}(\sigma_j) \right) U(\sharp) P_0(\sharp) \right\|^2 . \tag{12}
\]

We let $K = \max\{k_\sigma \mid \sigma \in \Gamma\}$, and for each σ ∈ Γ, we define the $Kq \times Kq$ matrix
\[
S(\sigma) =
\begin{pmatrix}
U(\sigma) P_0(\sigma) & U(\sigma) P_1(\sigma) & \cdots & U(\sigma) P_{K-2}(\sigma) & U(\sigma) P_{K-1}(\sigma) \\
U(\sigma) P_1(\sigma) & U(\sigma) P_2(\sigma) & \cdots & U(\sigma) P_{K-1}(\sigma) & U(\sigma) P_0(\sigma) \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
U(\sigma) P_{K-1}(\sigma) & U(\sigma) P_0(\sigma) & \cdots & U(\sigma) P_{K-3}(\sigma) & U(\sigma) P_{K-2}(\sigma)
\end{pmatrix},
\]

where $P_j(\sigma) = [0]_q$ for $k_\sigma \le j \le K - 1$. It is not difficult to see that $S(\sigma)$ is unitary. In fact, when computing $S(\sigma) S^\dagger(\sigma)$ by block multiplication, we obtain blocks of the form

\[
\sum_{i=0}^{K-1} U(\sigma) P_i(\sigma) P_i^\dagger(\sigma) U^\dagger(\sigma) = U(\sigma) \left( \sum_{i=0}^{K-1} P_i(\sigma) \right) U^\dagger(\sigma) = I
\]

on the main diagonal, while in the other positions we have a sum of products of the form $U(\sigma) P_i(\sigma) P_j(\sigma) U^\dagger(\sigma)$ with $i \ne j$, yielding the zero-matrix $[0]_q$ since different projectors are mutually orthogonal. Symmetrically, we get $S^\dagger(\sigma) S(\sigma) = I$ as well. We are now ready to exhibit the equivalent qfc as
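\[
A = \left\langle \phi, \{M(\sigma)\}_{\sigma \in \Gamma}, \hat{O} = \sum_{i=0}^{K-1} a_i \hat{P}_i + \sum_{i=0}^{K-1} b_i \hat{P}_{K+i}, L \right\rangle,
\]

where
• $\phi = \varphi \oplus 0_{q(2K-1)} \in \mathbb{C}^{2Kq}$ is a row vector consisting of 2K blocks, each of dimension q,
• for σ ∈ Σ, $M(\sigma) = \begin{pmatrix} S(\sigma) & [0]_{Kq} \\ [0]_{Kq} & I_{Kq} \end{pmatrix}$, and $M(\sharp) = \begin{pmatrix} [0]_{Kq} & S(\sharp) \\ I_{Kq} & [0]_{Kq} \end{pmatrix}$,
• for $0 \le j \le 2K - 1$, $\hat{P}_j = [0]_{jq} \oplus I \oplus [0]_{(2K-j-1)q}$ is the projection that erases all blocks in a state vector of A except the (j + 1)th,
• L is the control language recognized by the dfa

\[
D = \left\langle \{s_0, s_1, \ldots, s_{K-1}\}, \bigcup_{i=0}^{K-1} \{a_i, b_i\}, \tau, s_0, \{s_0\} \right\rangle
\]

such that, for $0 \le i, j \le K - 1$, we set

\[
\tau(s_i, \sigma) =
\begin{cases}
s_j & \text{if } \sigma = a_j, \\
s_{(i+j) \bmod K} & \text{if } \sigma = b_j.
\end{cases}
\]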
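The transition function of the dfa D is simple enough to be run directly. The following Python sketch encodes the outcomes $a_j$ and $b_j$ as pairs `("a", j)` and `("b", j)`; this encoding and the function name are assumptions made for illustration.

```python
def control_dfa_accepts(word, K):
    """Run the dfa D on a word over {('a', j), ('b', j) : 0 <= j < K}.

    State s_i is represented by the integer i; s_0 is both the initial
    state and the only accepting state.
    """
    state = 0
    for kind, j in word:
        if not 0 <= j < K:
            raise ValueError("outcome index out of range")
        if kind == "a":
            state = j                    # tau(s_i, a_j) = s_j
        elif kind == "b":
            state = (state + j) % K      # tau(s_i, b_j) = s_{(i+j) mod K}
        else:
            raise ValueError("unknown outcome kind")
    return state == 0

# A sequence a_{i_1} ... a_{i_n} b_f is accepted exactly when
# (i_n + f) mod K == 0; earlier outcomes are overwritten by later a's.
K = 4
sample = control_dfa_accepts([("a", 2), ("a", 3), ("b", 1)], K)
```

Since each `a` outcome overwrites the state, only the last index $i_n$ and the final index f matter for acceptance.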
Intuitively, the state vector of A is composed of two halves, each consisting of K blocks of dimension q. At each step, M(σ) simultaneously performs all $k_\sigma$ projections in Ω associated with the symbol σ ∈ Σ, and stores each result in the blocks of the first half of the state vector. The endmarker evolution M(♯) does the same, but stores the results in the blocks of the second half. Since the outcomes of projections on the first half are of the form $a_i$, while the outcomes of projections on the second half are of the form $b_i$, the dfa D can detect when the quantum part of the qfc has read the symbol ♯.

Let us now show that $E_A = p_\Omega$. The probability of A accepting the word $\omega = \sigma_1 \sigma_2 \cdots \sigma_n$ is

\[
E_A(\omega) = \sum_{a_{i_1} \cdots a_{i_n} b_f \in L} \left\| \phi \left( \prod_{j=1}^{n} M(\sigma_j) \hat{P}_{i_j}(\sigma) \right) M(\sharp) \hat{P}_f(\sigma) \right\|^2 . \tag{13}
\]
The quantum computation step performed by the matrix $M(\sigma) \hat{P}_s(\sigma)$ acts on the (r + 1)th block of the current state vector of A by transforming it with the matrix $U(\sigma) P_{(r+s) \bmod K}(\sigma)$, and then moving it into the (s + 1)th block. Formally, for $\xi \in \mathbb{C}^q$, we have

\[
\left( 0_{rq} \oplus \xi \oplus 0_{(2K-r-1)q} \right) M(\sigma) \hat{P}_s(\sigma) = 0_{sq} \oplus \xi\, U(\sigma) P_{(r+s) \bmod K}(\sigma) \oplus 0_{(2K-s-1)q} . \tag{14}
\]
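The block structure behind Eq. (14) can be checked on a toy instance. The following Python sketch builds a matrix whose block in block-row r and block-column s is $U P_{(r+s) \bmod K}$, verifies that it is unitary, and checks that a vector supported on block r is routed to block s after multiplication by $U P_{(r+s) \bmod K}$. It works on the first half of the state vector only; the helper names and the two-state example are assumptions made for illustration.

```python
import math

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[t] * M[t][j] for t in range(len(v)))
            for j in range(len(M[0]))]

def build_S(U, projectors, K):
    """Kq x Kq matrix with block (r, s) equal to U * P_{(r+s) mod K};
    missing projectors are padded with zero matrices."""
    q = len(U)
    zero = [[0.0] * q for _ in range(q)]
    P = list(projectors) + [zero] * (K - len(projectors))
    blocks = [[mat_mul(U, P[(r + s) % K]) for s in range(K)] for r in range(K)]
    return [[blocks[r][s][i][j] for s in range(K) for j in range(q)]
            for r in range(K) for i in range(q)]

def is_identity(M, tol=1e-12):
    return all(abs(M[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(len(M)) for j in range(len(M)))

# Toy instance (assumed): q = 2, K = 2, Hadamard evolution.
h = 1 / math.sqrt(2)
U = [[h, h], [h, -h]]
P0, P1 = [[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]
q, K = 2, 2
S = build_S(U, [P0, P1], K)
S_padded = build_S(U, [[[1.0, 0.0], [0.0, 1.0]]], K)   # symbol with one outcome

xi, r = [0.6, 0.8], 1
v = [0.0] * (r * q) + xi + [0.0] * ((K - r - 1) * q)   # xi sits in block r
out = vec_mat(v, S)
```

Reading off the sth block of `out` corresponds to applying the projection $\hat{P}_s$ in Eq. (14).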
After the computation of A on ω, the accepting condition is that the sequence of outcomes $a_{i_1} \cdots a_{i_n} b_f$ belongs to L, i.e., it satisfies $(i_n + f) \bmod K = 0$. From Eq. (14) it is easy to see that, at any step along the computation of A, the nonzero entries of the state vector are always in a single block. So, when evaluating $E_A(\omega)$, we have only to focus on the single block with the nonzero entries. Therefore, we can write (13) as:

\[
\begin{aligned}
E_A(\omega) &= \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} \cdots \sum_{i_n=0}^{K-1} \; \sum_{\{0 \le f \le K-1 \,\mid\, (i_n + f) \bmod K = 0\}} \left\| \varphi\, U(\sigma_1) P_{i_1}(\sigma_1) \left( \prod_{j=2}^{n} U(\sigma_j) P_{(i_{j-1}+i_j) \bmod K}(\sigma_j) \right) U(\sharp) P_{(i_n+f) \bmod K}(\sharp) \right\|^2 \\
&= \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} \cdots \sum_{i_n=0}^{K-1} \left\| \varphi\, U(\sigma_1) P_{i_1}(\sigma_1) \left( \prod_{j=2}^{n} U(\sigma_j) P_{(i_{j-1}+i_j) \bmod K}(\sigma_j) \right) U(\sharp) P_0(\sharp) \right\|^2 .
\end{aligned}
\]
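Notice that, for any given $i_{j-1}$, we have $\{(i_{j-1} + i_j) \bmod K \mid 0 \le i_j \le K - 1\} = \{0, 1, \ldots, K - 1\}$. By rearranging terms in the last multiple sum above, we get:

\[
E_A(\omega) = \sum_{i_1=0}^{K-1} \sum_{i_2=0}^{K-1} \cdots \sum_{i_n=0}^{K-1} \left\| \varphi\, U(\sigma_1) P_{i_1}(\sigma_1) \left( \prod_{j=2}^{n} U(\sigma_j) P_{i_j}(\sigma_j) \right) U(\sharp) P_0(\sharp) \right\|^2 ,
\]

and since $P_j(\sigma) = [0]_q$ for $k_\sigma \le j \le K - 1$, we obtain

\[
E_A(\omega) = \sum_{i_1=0}^{k_{\sigma_1}-1} \cdots \sum_{i_n=0}^{k_{\sigma_n}-1} \left\| \varphi \left( \prod_{j=1}^{n} U(\sigma_j) P_{i_j}(\sigma_j) \right) U(\sharp) P_0(\sharp) \right\|^2 = p_\Omega(\omega),
\]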
as displayed in (12). So, the qfc A simulates the lqfa Ω using $2qK \le 2q^2$ quantum and $K \le q$ classical states.

Finally, concerning the simulation of mon-qfas via qfcs, we already observed that a mon-qfa is basically a lqfa where all evolutions U(σ) are set to be the identity matrix. This obviously implies that our simulation of lqfas by qfcs may work unaltered for mon-qfas. □

In conclusion, by simulation results in Proposition 9 and Theorem 11, and considering the size lower bound for qfcs in Theorem 8, one immediately gets

Theorem 12. Let L be a language recognized by a μ-state minimal dfa. Then
• any mo-qfa or qra for L needs at least $\frac{1}{2} \left( \frac{\log(\mu)}{\log(\frac{5}{\delta})} \right)^{\frac{4}{9}}$ states,
• any mm-qfa for L needs at least $\frac{1}{3} \left( \frac{\log(\mu)}{\log(\frac{5}{\delta})} \right)^{\frac{4}{9}}$ states,
• any lqfa or mon-qfa for L needs at least $\left( \frac{1}{5} \cdot \frac{\log(\mu)}{\log(\frac{5}{\delta})} \right)^{\frac{4}{27}}$ states.
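The bounds of Theorem 12 follow from Theorem 8 by plugging in the simulation costs of Proposition 9 and Theorem 11. The following Python sketch tabulates them for given μ and δ. It assumes the bounds have the forms $\frac{1}{2}B$, $\frac{1}{3}B$, and $(\frac{1}{5} \log(\mu)/\log(5/\delta))^{4/27}$ with $B = (\log(\mu)/\log(5/\delta))^{4/9}$; the function name and the dictionary keys are hypothetical.

```python
import math

def qfa_size_lower_bounds(mu, delta):
    """Lower bounds on the number of states for several qfa models,
    derived from the qfc bound B = (log(mu)/log(5/delta)) ** (4/9)."""
    if mu < 1 or not (0 < delta <= 0.5):
        raise ValueError("need mu >= 1 and 0 < delta <= 1/2")
    ratio = math.log(mu) / math.log(5 / delta)
    qfc = ratio ** (4 / 9)                  # bound on the product q*k
    latvian = (ratio / 5) ** (4 / 27)       # from 2*q**3 >= B, slightly relaxed
    return {
        "qfc (q*k)": qfc,
        "mo-qfa": qfc / 2,                  # simulated with 2q quantum, 1 classical
        "qra": qfc / 2,                     # simulated with q quantum, 2 classical
        "mm-qfa": qfc / 3,                  # simulated with q quantum, 3 classical
        "lqfa": latvian,                    # simulated with 2q^2 quantum, q classical
        "mon-qfa": latvian,
    }

bounds = qfa_size_lower_bounds(10 ** 6, 0.5)
```

The lqfa entry is a slight relaxation of $(B/2)^{1/3}$, which is what the simulation cost $2q^2 \cdot q$ yields directly, since $2^{9/4} \le 5$.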
5. Some concluding remarks and future research hints

In this paper, we have focused on the model of qfcs, comparing their descriptional power with that of classical automata. We have proved (Theorem 8) that qfcs can be (at most) exponentially smaller than equivalent dfas. This property is not only interesting per se, but transfers to several types of qfas due to qfcs simulation capabilities. The size lower bounds for different models of qfas are summarized in Theorem 12.

It should be noted that a better asymptotically optimal lower bound of log(μ)/(2 log(1 + 2/δ)) is obtained in [7] for mo-qfas. There, however, Rabin's approach has a direct application since on mo-qfas the equivalence relation determining the states of the equivalent dfa is in fact a congruence, and the correctness of the dfa is straightforward. In the case of qfcs, instead, the equivalence relation ∼ is not a congruence, so we have had to ensure that, starting from two different state vectors in the same equivalence class, after the evolution on the same word, the two resulting vectors are either both accepting or both rejecting, even if they belong to different equivalence classes. This has been possible thanks to the property proved in Lemma 5.

As natural open problems, it remains either to witness the optimality of our size lower bound for qfcs, or to improve it, especially for the particular cases of simulated machines such as, e.g., mm-qfas, qras, lqfas, and mon-qfas. Again, it would also be interesting to investigate the simulation of other quantum paradigms (e.g., hybrid systems) within the framework of qfcs, in order to get further size lower bounds. Alternatively, one could study size lower bounds by performing our Rabin-like analysis directly on those paradigms.

Acknowledgements

We thank Alberto Bertoni for all these years of friendship and research. We also wish to thank the anonymous referee for careful reading and suggestions.

References

[1] F. Ablayev, A. Gainutdinova, On the lower bounds for one-way quantum automata, in: Proc. 25th Int. Symposium on Mathematical Foundations of Computer Science (MFCS 2000), in: Lecture Notes in Comput. Sci., vol. 1893, Springer, 2000, pp. 132–140.
[2] A. Ambainis, M. Beaudry, M. Golovkins, A. Kikusts, M. Mercer, D. Thérien, Algebraic results on quantum automata, Theory Comput. Syst. 39 (2006) 165–188.
[3] A. Ambainis, R. Freivalds, 1-Way quantum finite automata: strengths, weaknesses and generalizations, in: Proc. 39th Symposium on Foundations of Computer Science (FOCS 1998), 1998, pp. 332–342.
[4] A. Ambainis, A. Yakaryilmaz, Superiority of exact quantum automata for promise problems, Inform. Process. Lett. 112 (2012) 289–291.
[5] A. Bertoni, C. Mereghetti, B. Palano, Quantum computing: 1-way quantum automata, in: Proc. 7th Conference on Developments in Language Theory (DLT 2003), in: Lecture Notes in Comput. Sci., vol. 2710, Springer, 2003, pp. 1–20.
[6] A. Bertoni, C. Mereghetti, B. Palano, Small size quantum automata recognizing some regular languages, Theoret. Comput. Sci. 340 (2005) 394–407.
[7] A. Bertoni, C. Mereghetti, B. Palano, Some formal tools for analyzing quantum automata, Theoret. Comput. Sci. 356 (2006) 14–25.
[8] A. Bertoni, C. Mereghetti, B. Palano, Trace monoids with idempotent generators and measure-only quantum automata, Nat. Comput. 9 (2010) 383–395.
[9] M.P. Bianchi, B. Palano, Behaviours of unary quantum automata, Fund. Inform. 104 (2010) 1–15.
[10] A. Brodsky, N. Pippenger, Characterizations of 1-way quantum finite automata, SIAM J. Comput. 5 (2002) 1456–1478.
[11] C. Comin, M.P. Bianchi, Algebraic characterization of the class of languages recognized by Measure Only Quantum Automata, in: Proc. 13th Italian Conference on Theoretical Computer Science 2012 (ICTCS 2012), 2012, pp. 90–93.
[12] M. Golovkins, M. Kravtsev, Probabilistic reversible automata and quantum automata, in: Proc. 8th International Computing and Combinatorics Conference, in: Lecture Notes in Comput. Sci., vol. 2387, Springer, 2002, pp. 574–583.
[13] M. Hirvensalo, Quantum automata with open time evolution, Int. J. Natur. Comput. Res. 1 (2010) 70–85.
[14] J.E. Hopcroft, R. Motwani, J.D. Ullman, Introduction to Automata Theory, Languages, and Computation, Addison–Wesley, Reading, MA, 2001.
[15] R.I.G. Hughes, The Structure and Interpretation of Quantum Mechanics, Harvard University Press, Cambridge, MA, 1992.
[16] O. Klíma, L. Polák, On varieties of literally idempotent languages, Theor. Inform. Appl. 42 (2008) 583–598.
[17] A. Kondacs, J. Watrous, On the power of quantum finite state automata, in: Proc. 38th Symposium on Foundations of Computer Science (FOCS 1997), 1997, pp. 66–75.
[18] M. Mercer, Lower bounds for generalized quantum finite automata, in: Proc. 2nd International Conference on Language and Automata Theory and Applications (LATA 2008), in: Lecture Notes in Comput. Sci., vol. 5196, Springer, 2008, pp. 373–384.
[19] C. Mereghetti, B. Palano, Quantum finite automata with control language, Theor. Inform. Appl. 40 (2006) 315–332.
[20] C. Mereghetti, B. Palano, Quantum automata for some multiperiodic languages, Theoret. Comput. Sci. 387 (2007) 177–186.
[21] C. Moore, J. Crutchfield, Quantum automata and quantum grammars, Theoret. Comput. Sci. 237 (2000) 275–306.
[22] A. Nayak, Optimal lower bounds for quantum automata and random access codes, in: Proc. 40th Symposium on Foundations of Computer Science (FOCS 1999), 1999, pp. 369–376.
[23] J.-E. Pin, Varieties of Formal Languages, North Oxford Academic, Plenum, 1986.
[24] M.O. Rabin, Probabilistic automata, Inform. Control 6 (1963) 230–245.
[25] K. Scharnhorst, Angles in complex vector spaces, Acta Appl. Math. 69 (2001) 95–103.
[26] G. Shilov, Linear Algebra, Prentice Hall, 1971. Reprinted by Dover, 1977.
[27] S. Zheng, J. Gruska, D. Qiu, On the state complexity of semi-quantum finite automata, in: Proc. 8th International Conference on Language and Automata Theory and Applications (LATA 2014), in: Lecture Notes in Comput. Sci., vol. 8370, Springer, 2014, pp. 601–612.
[28] S. Zheng, D. Qiu, L. Li, J. Gruska, One-way finite automata with quantum and classical states, in: Languages Alive, in: Lecture Notes in Comput. Sci., vol. 7300, Springer, 2012, pp. 273–290.