Information Processing Letters 113 (2013) 160–164
Exact output rate of Peres's algorithm for random number generation

Sung-il Pae, Hongik University, Seoul, Republic of Korea
Article history: Received 2 August 2012; received in revised form 30 November 2012; accepted 31 December 2012; available online 3 January 2013. Communicated by B. Doerr.

Keywords: Analysis of algorithms; Random number generation; Von Neumann's method; Peres's method; Elias's method

Abstract. An exact computation of the output rate of Peres's algorithm is reported. The algorithm, recursively defined, converts independent flips of a biased coin into unbiased coin flips at rates that approach the information-theoretic upper bound as the input size and the recursion depth tend to infinity. However, only the limiting rate with respect to the input size is known for each recursion depth. We compute the exact output rate for each fixed-length input and compare it with another asymptotically optimal method, due to Elias.
1. Introduction

Suppose that we have a coin that turns up heads (denoted H) with probability p, and thus tails (T) with probability q = 1 − p. In order to obtain an unbiased random bit, von Neumann's trick [1] takes two coin flips: if the result is HT, output 1; if the result is TH, output 0; if it is HH or TT, ignore the result and repeat the process until a 0 or 1 is produced. The obtained bit is unbiased because Pr(HT) = Pr(TH) = pq. By applying this procedure repeatedly to a sequence of independent coin flips, taking two flips at a time, we obtain a sequence of unbiased and independent random bits. Generalizing and formalizing this idea [2–4], a randomizing function f : {0,1}^n → {0,1}* takes as input n independent bits of bias p (called the Bernoulli source of bias p) and returns a string of independent and unbiased random bits. For example, the function Ψ1 : {0,1}^2 → {0,1}*, defined by
Ψ1 (00) = Ψ1 (11) = λ,
Ψ1 (01) = 1,
Ψ1 (10) = 0,
where λ is the empty string, corresponds to von Neumann's procedure. We call it von Neumann's function; it is the simplest non-trivial randomizing function [3]. The rate of a randomizing function (or procedure) is the average number of output bits per input bit, and it is bounded above by the Shannon entropy H(p). When p = 1/3, the rate of von Neumann's procedure is pq = 2/9 ≈ 0.22 (see, for example, Exercise 5.1-3 in [5]), while the entropy bound is H(1/3) ≈ 0.92; the discrepancy is quite large. However, it is known that there are randomizing functions achieving rates arbitrarily close to the entropy bound. Two such methods were proposed by Elias [2] and Peres [6]. Elias proposed a series of randomizing functions E_n : {0,1}^n → {0,1}* whose rates approach the entropy bound as n tends to infinity. Pae and Loui [4,3] showed that Elias's method is in fact optimal, that is, each E_n's rate is maximal among all randomizing functions that take length-n inputs from a Bernoulli source. Peres's method is also asymptotically optimal, but for a fixed input length it is not strictly optimal, unlike Elias's. What makes Peres's method interesting is that it is defined by a simple recursion, so its implementation is
straightforward and runs very fast (in O(n log n) time for input length n) with a very small footprint. On the other hand, Elias's method does not seem to be computable as easily; see Pae and Loui [3] and Ryabko and Matchikina [7] for further discussion of the output rate and the computation of Elias's method. As for the output rate, the rate of Peres's algorithm is known only as a limit as the input length tends to infinity [6], while Elias's is known exactly for each input length [3]. In this letter, we report an exact computation of the rate of Peres's algorithm and compare the result with Elias's algorithm.
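For concreteness, von Neumann's trick described above can be sketched in a few lines of Python; the biased coin is modeled by the hypothetical helper biased_flip, which is our own illustration and not part of the original procedure's description.

import random

def biased_flip(p=1/3):
    # hypothetical biased coin: 'H' with probability p, 'T' with probability 1 - p
    return 'H' if random.random() < p else 'T'

def von_neumann_bit(flip=biased_flip):
    # flip pairs until they differ; HT -> 1, TH -> 0, and Pr(HT) = Pr(TH) = pq
    while True:
        a, b = flip(), flip()
        if a != b:
            return 1 if (a, b) == ('H', 'T') else 0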
2. Peres's algorithm

Consider the functions u and v on {0,1}^2 defined as follows:

u(00) = u(11) = 0,   u(01) = u(10) = 1,
v(00) = 0,   v(11) = 1,   v(01) = v(10) = λ.

The function u is the exclusive-or, and v may be considered as a complement of von Neumann's function. Extend the three functions Ψ1, u, and v to {0,1}*: for an even-length input, let

Ψ1(x1 x2 . . . x2n) = Ψ1(x1 x2) ∗ · · · ∗ Ψ1(x2n−1 x2n),

where ∗ is concatenation (and likewise for u and v); for an odd-length input, drop the last bit and apply the function to the remaining even-length string. Now, for a bit string x, Peres's function Ψν(x), for ν = 1, 2, . . . , is defined recursively as follows [6]:

Ψν(x) = Ψ1(x) ∗ Ψν−1(u(x)) ∗ Ψν−1(v(x)).   (1)

Then Ψν can take a bit string of any length, and for each ν = 1, 2, . . . , Ψν is n-randomizing for every positive n. For example,

Ψ3(0010110101001) = Ψ1(001011010100) ∗ Ψ2(u(001011010100)) ∗ Ψ2(v(001011010100))
                  = 011 ∗ Ψ2(010110) ∗ Ψ2(010)
                  = 011 ∗ Ψ1(010110) ∗ Ψ1(111) ∗ Ψ1(λ) ∗ Ψ1(01) ∗ Ψ1(1) ∗ Ψ1(λ)
                  = 011 ∗ 110 ∗ 1
                  = 0111101.
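As an illustration only (the letter itself gives no code), the definitions of Ψ1, u, v and the recursion (1) translate directly into the following Python sketch; the function names are ours.

def vn(x):                          # Psi_1: 01 -> 1, 10 -> 0, 00 and 11 -> empty
    x = x[:len(x) - len(x) % 2]     # odd length: drop the last bit
    return "".join("1" if a == "0" else "0"
                   for a, b in zip(x[0::2], x[1::2]) if a != b)

def u(x):                           # exclusive-or of each pair
    x = x[:len(x) - len(x) % 2]
    return "".join("0" if a == b else "1" for a, b in zip(x[0::2], x[1::2]))

def v(x):                           # 00 -> 0, 11 -> 1, 01 and 10 -> empty
    x = x[:len(x) - len(x) % 2]
    return "".join(a for a, b in zip(x[0::2], x[1::2]) if a == b)

def peres(x, nu):                   # Psi_nu as defined in (1)
    if nu == 0 or len(x) < 2:
        return ""
    return vn(x) + peres(u(x), nu - 1) + peres(v(x), nu - 1)

assert peres("0010110101001", 3) == "0111101"    # the worked example above

Calling peres(x, nu) with nu at least the number of recursion levels the input can sustain (for instance nu = len(x)) computes the non-truncated function Ψ of Section 3.1.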
Despite this flexibility and the simple definition, and in contrast to Elias's function, whose rate is known exactly [3], the rate of Peres's function is known only asymptotically. Denote by |y| the length of a binary string y, and let

rν(p) = lim_{n→∞} E_{|x|=n} |Ψν(x)| / n.

The rate rν(p) satisfies the recursion (see [6])

rν(p) = r1(p) + (1/2) rν−1(p² + q²) + (1/2)(p² + q²) rν−1(p² / (p² + q²)),   (2)

where r1(p) = pq. Solving this recursion, we obtain exact values of rν(p). For example,

r2(p) = pq + (1/2)(p² + q²)(1 − p² − q²) + (1/2) p² (1 − p² / (p² + q²)).

The asymptotic rate rν(p) is known to approach the entropy bound as ν increases [6]. For a fixed-length input, however, the rate is strictly smaller. Although we can compute this rate by brute force, by running Peres's function on all 2^n inputs, doing so quickly becomes practically impossible once the input size n grows beyond, say, 32.
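As a quick numerical check (ours, not from the paper), the recursion (2) can be iterated directly; with r_0 = 0 it reproduces r_1(p) = pq and lets one tabulate rν(1/3) against the entropy bound H(1/3) ≈ 0.92.

from math import log2

def entropy(p):
    q = 1 - p
    return -p * log2(p) - q * log2(q)

def r(nu, p):
    # asymptotic rate of Psi_nu via recursion (2), with r_0 taken to be 0
    if nu == 0:
        return 0.0
    q = 1 - p
    s = p * p + q * q                       # probability that a pair is 00 or 11
    return p * q + 0.5 * r(nu - 1, s) + 0.5 * s * r(nu - 1, p * p / s)

# r(1, 1/3) = 2/9 ≈ 0.222, and r(nu, 1/3) approaches entropy(1/3) ≈ 0.918 as nu grows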
3. Analysis of Peres's algorithm

3.1. Non-truncated Peres's function Ψ

Drop the subscript ν in (1), and consider the recursion

Ψ(x) = Ψ1(x) ∗ Ψ(u(x)) ∗ Ψ(v(x)).

Then the original Peres function Ψν can be considered as a truncated version of Ψ, in which the recursion stops at the νth level. The functions Ψ1(x), u(x), and v(x) are very simple, and, given an n-bit string x, at each level of the recursive computation they are computed in O(n) time. Since the lengths of u(x) and v(x) are at most n/2, the next level of the recursion takes inputs of length at most n/2. So, if A(n) is the time complexity of computing Ψ(x), it satisfies the recursion

A(n) = O(n) + A(n/2) + A(n/2).

Therefore, A(n) = O(n log n), and Ψν can be computed within the same time bound. The function Ψ is randomizing for each input length, and its rate bounds the rates of Ψν for each ν. For the rest of this letter, we focus on the computation of the exact rates of Ψ and Ψν.

3.2. Total output length on the equiprobable set S_{2n,k}

By S_{n,k} we denote the subset of {0,1}^n that consists of binary strings with k 1's. If we take n samples from a Bernoulli(p) source, then S_{n,k} is a set of equiprobable binary strings, each of probability p^{n−k} q^k. Consider the Peres function Ψ on inputs of length 2n. Define P(2n, k) to be the total number of output bits over S_{2n,k}, that is,

P(2n, k) = Σ_{x ∈ S_{2n,k}} |Ψ(x)|.

Then the rate of Ψ is

r(2n, p) = (1/(2n)) Σ_{x ∈ {0,1}^{2n}} |Ψ(x)| Pr(x) = (1/(2n)) Σ_{k=0}^{2n} P(2n, k) p^{2n−k} q^k.
In the following, we give a recursive formula for P(2n, k) so that its values can be computed, for example, by dynamic programming. Fix k, and assume that k ≤ n.¹ In order to count the outputs over S_{2n,k}, decompose S_{2n,k} into the sets C_l, where

C_l = { x ∈ S_{2n,k} : |Ψ1(x)| = l }.

Depending on the parity of k, the output length of the von Neumann function over S_{2n,k} ranges either over the even values from 0 to k, when k is even, or over the odd values from 1 to k, when k is odd. Then we have the decomposition of S_{2n,k} by the C_l:

S_{2n,k} = C_0 ∪ C_2 ∪ · · · ∪ C_k   if k is even,
S_{2n,k} = C_1 ∪ C_3 ∪ · · · ∪ C_k   if k is odd.   (3)

¹ This assumption that k ≤ n, though not essential, makes the discussion below simpler. If n < k ≤ 2n, then the decomposition (3) becomes, instead, S_{2n,k} = C_0 ∪ C_2 ∪ · · · ∪ C_{2n−k} if k is even, and S_{2n,k} = C_1 ∪ C_3 ∪ · · · ∪ C_{2n−k} if k is odd. However, in view of Lemma 2, we do not have to worry about this case.

Now focus on C_l; the following lemma enables us to achieve our goal.
Lemma 1. The mapping Φ : x ↦ (Ψ1(x), u(x), v(x)) is a one-to-one correspondence between C_l and {0,1}^l × S_{n,l} × S_{n−l,(k−l)/2}.

Proof. Let x ∈ C_l ⊂ S_{2n,k}. Recall that k ≤ n. Among the pairs x1 x2, x3 x4, . . . , x2n−1 x2n, the number of 01's or 10's in x is l, and this implies that there are n − (k + l)/2 00's and (k − l)/2 11's. Conversely, a combination x of l 01's or 10's, n − (k + l)/2 00's, and (k − l)/2 11's belongs to C_l, because Ψ1(x) ∈ {0,1}^l and x ∈ S_{2n,k}. So C_l consists of exactly such strings, and one example is

x0 = 0 · · · 0 1 · · · 1 0101 · · · 01,

consisting of 2n − k − l 0's, followed by k − l 1's, followed by l copies of 01. Then

Ψ1(x0) = 1 · · · 1 (l 1's),
u(x0) = 0 · · · 0 1 · · · 1 (n − l 0's followed by l 1's),
v(x0) = 0 · · · 0 1 · · · 1 (n − (k + l)/2 0's followed by (k − l)/2 1's).

Such bit strings are mapped by Φ into {0,1}^l × S_{n,l} × S_{n−l,(k−l)/2}. To show the one-to-one correspondence, we give an inverse map Φ′ as follows. Given (y, z, w) ∈ {0,1}^l × S_{n,l} × S_{n−l,(k−l)/2}, first make up a bit string a of length 2l out of 01's and 10's so that Ψ1(a) = y; such an a is unique. Also make up a bit string b of length 2n − 2l out of 00's and 11's so that v(b) = w; this b is unique, too. Now, starting from the first bit of z = z1 . . . zn, we construct x, initially the empty string, by repeating the following: if zi = 0, remove the leading pair of b and put it at the tail of x; otherwise, remove the leading pair of a and put it at the tail of x. Because a and b are unique, the resulting string x is unique, and it is clear by construction that Φ′ ◦ Φ and Φ ◦ Φ′ are identity maps. □

By Lemma 1, we have

|C_l| = 2^l C(n, l) C(n − l, (k − l)/2),

where C(a, b) denotes the binomial coefficient.
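The count |C_l| can be checked by brute force for small parameters; the following sketch (ours) enumerates S_{2n,k} directly.

from itertools import combinations
from math import comb

def C_l_size(n, k, l):
    # number of strings of length 2n with k ones having exactly l pairs equal to 01 or 10
    count = 0
    for ones in combinations(range(2 * n), k):
        x = ["0"] * (2 * n)
        for i in ones:
            x[i] = "1"
        mixed = sum(x[2 * i] != x[2 * i + 1] for i in range(n))
        if mixed == l:
            count += 1
    return count

# e.g. C_l_size(4, 3, 1) == 2**1 * comb(4, 1) * comb(3, 1) == 24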
Noting that |Ψ(x)| = |Ψ1(x)| + |Ψ(u(x))| + |Ψ(v(x))|, we obtain the total output length over C_l. First, since |Ψ1(x)| = l for every x ∈ C_l, we have

Σ_{x ∈ C_l} |Ψ1(x)| = 2^l C(n, l) C(n − l, (k − l)/2) l.   (4)

Now, again by the correspondence of Lemma 1, for each z ∈ S_{n,l} there are 2^l C(n − l, (k − l)/2) x's such that u(x) = z, and Σ_{z ∈ S_{n,l}} |Ψ(z)| = P(n, l). So we have

Σ_{x ∈ C_l} |Ψ(u(x))| = 2^l C(n − l, (k − l)/2) P(n, l).   (5)

Similarly, for each w ∈ S_{n−l,(k−l)/2} there are 2^l C(n, l) x's such that v(x) = w, and thus

Σ_{x ∈ C_l} |Ψ(v(x))| = 2^l C(n, l) P(n − l, (k − l)/2).   (6)
Combining (4), (5), and (6), we have

P(2n, k) = Σ_l 2^l [ l C(n, l) C(n − l, (k − l)/2) + C(n − l, (k − l)/2) P(n, l) + C(n, l) P(n − l, (k − l)/2) ],   (7)

where the summation is over the l's given in (3). Recall that we assumed that k ≤ n. If k > n, then we can reduce this case to the previous one using the following lemma.

Lemma 2. Let x̄ be the binary string that results from exchanging the 0's and 1's of x. Then |Ψ(x̄)| = |Ψ(x)|.

Proof. It is easy to see that Ψ1(x̄) is the complement of Ψ1(x), that u(x̄) = u(x), and that v(x̄) is the complement of v(x). Since Ψ(x̄) = Ψ1(x̄) ∗ Ψ(u(x̄)) ∗ Ψ(v(x̄)), the claim follows by induction on the length of x. □

So we have, as a part of our recursive definition,

P(2n, k) = P(2n, 2n − k),   if k > n.   (8)
Fig. 1. Rates of Elias's functions and of Peres's functions Ψ and Ψν, ν = 1, . . . , 5, for source bias p = 1/3 and even input sizes from 2 to 130, together with the entropy bound and the asymptotic Peres rates rν(p) for ν = 1, . . . , 8.
Note also that, on the right-hand side of Eq. (7), the recursive calls to P(n, l) and P(n − l, (k − l)/2) can be made for odd-length inputs, to which the structure lemma does not apply. If x is of odd length, then Ψ(x) = Ψ(x′), where x′ is the string obtained by dropping the last bit of x, hence of even length. If x ∈ S_{n,k}, then x′ is either in S_{n−1,k−1} or in S_{n−1,k}, and the operation x ↦ x′ gives the well-known one-to-one correspondence between S_{n,k} and S_{n−1,k−1} ∪ S_{n−1,k}. So, if n is odd, then

P(n, k) = P(n − 1, k − 1) + P(n − 1, k),   (9)

which provides a means to compute P(n, k) when n is odd. Finally, the initial condition is given as

P(1, k) = 0, for k = 0, 1,   (10)

and the recursive definition of P(2n, k) is completed by formulae (7)–(10). Its values can be computed efficiently, for example, by dynamic programming. For example,

P(2 · 50, 50) = 7 37069 15585 63057 93374 77205 29592.

Corollary 3. The output lengths of Ψ on {0,1}^{2n}, hence the exact rate, can be computed recursively without computing Ψ(x) explicitly.
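The recursion (7)–(10) lends itself to memoization; the following Python sketch (ours, with names of our choosing) computes P(m, k) for any length m and the exact rate of Ψ, and agrees with direct enumeration of Ψ on small inputs.

from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def P(m, k):
    # total output length of Psi over all length-m strings with k ones
    if k < 0 or k > m or m <= 1:
        return 0                                   # includes (10): P(1, k) = 0
    if m % 2 == 1:                                 # odd length: Eq. (9)
        return P(m - 1, k - 1) + P(m - 1, k)
    n = m // 2
    if k > n:                                      # complement symmetry, Eq. (8)
        return P(m, m - k)
    total = 0
    for l in range(k % 2, k + 1, 2):               # l ranges as in the decomposition (3)
        c1, c2 = comb(n, l), comb(n - l, (k - l) // 2)
        total += 2 ** l * (l * c1 * c2             # Eq. (4)
                           + c2 * P(n, l)          # Eq. (5)
                           + c1 * P(n - l, (k - l) // 2))   # Eq. (6)
    return total

def exact_rate(m, p):
    # exact rate of Psi on inputs of length m = 2n, as in Section 3.2
    q = 1 - p
    return sum(P(m, k) * p ** (m - k) * q ** k for k in range(m + 1)) / m

assert P(4, 2) == 10     # matches enumerating Psi over the six strings in S_{4,2}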
3.3. Rate of truncated cases Ψν

The rate of the truncated Peres functions can be computed similarly. The formula (7) now becomes

Pν(2n, k) = Σ_l 2^l [ l C(n, l) C(n − l, (k − l)/2) + C(n − l, (k − l)/2) Pν−1(n, l) + C(n, l) Pν−1(n − l, (k − l)/2) ].

In addition to appropriate versions of the other definitions (8)–(10), we need

P0(2n, k) = 0, for k = 0, . . . , 2n.

Using these definitions, we can compute, for example,

P3(2 · 50, 50) = 5 83373 67766 54131 66698 53078 88024.
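The truncated counts are obtained by threading the depth ν through the same memoized recursion, following the formula above; again this is only a sketch of ours.

from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def P_nu(nu, m, k):
    # total output length of the truncated Psi_nu over length-m strings with k ones
    if nu == 0 or k < 0 or k > m or m <= 1:
        return 0                     # P_0(2n, k) = 0 and P_nu(1, k) = 0
    if m % 2 == 1:
        return P_nu(nu, m - 1, k - 1) + P_nu(nu, m - 1, k)
    n = m // 2
    if k > n:
        return P_nu(nu, m, m - k)
    total = 0
    for l in range(k % 2, k + 1, 2):
        c1, c2 = comb(n, l), comb(n - l, (k - l) // 2)
        total += 2 ** l * (l * c1 * c2
                           + c2 * P_nu(nu - 1, n, l)
                           + c1 * P_nu(nu - 1, n - l, (k - l) // 2))
    return total

Combining P_nu with the rate formula of Section 3.2 gives the finite-length rates of Ψν reported in Section 4.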
4. Results

Fig. 1 shows, for bias p = 1/3, the rates of Peres's function Ψ and of Ψν for ν = 1, . . . , 5, together with the rates of Elias's functions E_n. The rates are computed for even input lengths from 2 to 130. Note that for each input length n a different Elias function E_n is used, while a single Peres function serves all input lengths. Also shown are the asymptotic rates rν(p) of Ψν and the entropy bound. The rates of Ψν and Ψ coincide for input lengths less than 2^ν; for longer inputs, the rate of Ψν becomes strictly smaller and increases toward the asymptotic rate rν(p). The output rate of Peres's function increases much more slowly than Elias's. Elias's method reaches 90 percent of the entropy for input lengths larger than 56, and 95 percent when n = 128. Peres's method reaches 70 percent of the entropy when n = 64, nearly 80 percent when n = 256, and 89 percent when n = 2048.

5. Remarks

Given an input length n, Elias's method produces more unbiased bits than Peres's algorithm, and using our result we now know how much more. However, Elias's method
appears to be computationally more expensive. The fastest known algorithm, by Ryabko and Matchikina, runs in O(n log³ n log log n) time and uses the integer multiplication algorithm of Schönhage and Strassen [8], which becomes practical only when the number of digits is very large, not to mention the complexity of implementation. In comparison, Peres's algorithm runs in O(n log n) time, as shown above, and its implementation is simple. So there is a trade-off between computational cost and output rate when choosing between the two algorithms, and it will take a careful examination of factors such as running time, implementation cost, and memory usage to determine which algorithm to use in practice.

Acknowledgement

This work was supported in part by a Hongik University grant and by the National Research Foundation of Korea (NRF) grant funded by the Korean government (No. 20090077288).
References

[1] J. von Neumann, Various techniques for use in connection with random digits. Notes by G.E. Forsythe, in: Monte Carlo Method, Applied Mathematics Series, vol. 12, U.S. National Bureau of Standards, Washington D.C., 1951, pp. 36–38; reprinted in: Collected Works of von Neumann, vol. 5, Pergamon Press, 1963, pp. 768–770.
[2] P. Elias, The efficient construction of an unbiased random sequence, The Annals of Mathematical Statistics 43 (3) (1972) 865–870.
[3] S. Pae, M.C. Loui, Randomizing functions: Simulation of discrete probability distribution using a source of unknown distribution, IEEE Transactions on Information Theory 52 (11) (2006) 4965–4976.
[4] S. Pae, M.C. Loui, Optimal random number generation from a biased coin, in: Proceedings of the Sixteenth Annual ACM–SIAM Symposium on Discrete Algorithms, 2005, pp. 1079–1088.
[5] C. Stein, T. Cormen, R. Rivest, C. Leiserson, Introduction to Algorithms, 2nd edition, MIT Press, 2001.
[6] Y. Peres, Iterating von Neumann's procedure for extracting random bits, Annals of Statistics 20 (1) (1992) 590–597.
[7] B.Y. Ryabko, E. Matchikina, Fast and efficient construction of an unbiased random sequence, IEEE Transactions on Information Theory 46 (3) (2000) 1090–1093.
[8] A. Schönhage, V. Strassen, Schnelle Multiplikation grosser Zahlen, Computing 7 (1971) 281–292.