Products of random variables and the first digit phenomenon


Available online at www.sciencedirect.com

ScienceDirect

Stochastic Processes and their Applications, www.elsevier.com/locate/spa

Nicolas Chenavier, Bruno Massé ∗, Dominique Schneider

Univ. Littoral Côte d’Opale, EA 2597, Laboratoire de mathématiques pures et appliquées Joseph Liouville, F-62228 Calais, France

Received 28 March 2017; received in revised form 17 July 2017; accepted 7 August 2017. Available online xxxx

Abstract

We provide conditions on dependent and on non-stationary random variables $X_n$ ensuring that the mantissa of the sequence of products $\left(\prod_1^n X_k\right)$ is almost surely distributed following Benford’s law or converges in distribution to Benford’s law. This is achieved through proving new generalizations of Lévy’s and Robbins’s results on distribution modulo 1 of sums of independent random variables.

© 2017 Elsevier B.V. All rights reserved.

MSC 2010: 60B10; 11B05; 11K99

Keywords: Benford’s law; Density; Mantissa; Weak convergence

1. Introduction

Let $b > 1$. Benford’s law in base $b$ is the probability measure $\mu_b$ on the interval $[1, b[$ defined by

$$\mu_b([1, a[) = \log_b a \qquad (1 \le a < b),$$

where $\log_b a$ denotes the logarithm in base $b$ of $a$. The mantissa in base $b$ of a positive real number $x$ is the unique number $M_b(x)$ in $[1, b[$ such that there exists an integer $k$ satisfying $x = M_b(x)\, b^k$.

∗ Corresponding author. E-mail addresses: [email protected] (N. Chenavier), [email protected] (B. Massé), [email protected] (D. Schneider).

http://dx.doi.org/10.1016/j.spa.2017.08.003
0304-4149/© 2017 Elsevier B.V. All rights reserved.

Please cite this article in press as: N. Chenavier, et al., Products of random variables and the first digit phenomenon, Stochastic Processes and their Applications (2017), http://dx.doi.org/10.1016/j.spa.2017.08.003.


When a sequence of positive random variables $(X_n)$ is of a type usually considered by probabilists and statisticians, there is little to be said on $(M_b(X_n))$ (see Remark 2.1 in Section 2.2 for instance) while by contrast there is much to report on $\left(M_b\left(\prod_1^n X_k\right)\right)$, as we will see. Our purpose is therefore to exhibit conditions on $X_n$ ensuring that the sequence $\left(M_b\left(\prod_1^n X_k\right)\right)$ is almost surely distributed following $\mu_b$ (see Definition 2.1) or ensuring that the law of $M_b\left(\prod_1^n X_k\right)$ converges weakly to $\mu_b$ as $n \to +\infty$. We hope that this will enlarge, to a certain extent, the field of applications of Benford’s law (see Section 1.1 for examples of such applications).

To the best of our knowledge, apart from [27], the known results on the asymptotic behaviour of $\left(M_b\left(\prod_1^n X_k\right)\right)$ only deal with the cases where the $X_n$ are independent and identically distributed and the situations where $X_n = X^n$ for $n \ge 1$ and $X$ is some random variable (see Section 1.3 for details).

1.1. The first digit phenomenon

Benford [2] noticed in 1938 that many real-life lists of numbers have a strange property: numbers whose mantissae are small are more numerous than those whose mantissae are large. This fact is called the First Digit Phenomenon. He also noticed that this phenomenon seems independent of the units. This led him to make a scale-invariance hypothesis (more or less satisfied in real life) from which he derived that $\mu_{10}$ can be seen as the (ideal) distribution of digits or mantissae of many real-life numbers. Of course, this ideal distribution is never achieved in practice.

Several mathematicians have been involved in this subject and have provided sequences of positive numbers whose mantissae are (or are approximately) distributed following $\mu_b$ in the sense of the natural density [1,7,9,12,25] (see Definition 2.1), random variables whose mantissa law is or approaches $\mu_b$ [3,11,16,19,22], and sequences of random variables whose mantissae laws converge to $\mu_b$ or whose mantissae are almost surely distributed following $\mu_b$ [24,27,31,36]. Some convergence rates for the mantissa of products of i.i.d. random variables are provided by Schatte in [33,35]. The same author gives in [34] a survey on mantissa distribution and conjectures that extensive computing leads to numbers whose mantissae are close to being distributed following $\mu_b$.

Among the many applications of the First Digit Phenomenon, we can quote: fraud detection [29], computer design [15,20] (data storage and roundoff errors), image processing [37] and data analysis in natural sciences [28,32]. See [5,26] for more details.

1.2. Content

Section 2 is devoted to notation, definitions and tools from Uniform Distribution Theory. Our main results, Theorems 3.3 and 3.9, are presented in Section 3. They both involve

$$T_N^{(h)} := \frac{1}{N}\sum_{n=1}^{N} \exp\left(2i\pi h \sum_{k=1}^{n} \log_b X_k\right).$$

Theorem 3.3 states that, under the assumption that the sequence $(X_n)$ is stationary, the sequence $\left(M_b\left(\prod_1^n X_k\right)\right)$ is almost surely distributed following $\mu_b$ if and only if, for every positive integer $h$, $E\,T_N^{(h)}$ converges to 0 as $N \to +\infty$. Theorem 3.9 states that the following condition, without constraints on the dependence and on the stationarity of the $X_n$, is sufficient: for every positive integer $h$,

$$\sum_{N=1}^{\infty} \frac{E\left|T_N^{(h)}\right|^p}{N} < +\infty \quad \text{for some } p \ge 1.$$

These properties are used in Section 4 to investigate the cases where the random variables $X_n$ are stationary and log-normal, are exchangeable, are stationary and 1-dependent, and the case where they are independent and non-stationary.

We provide in the Appendix a survey of the main known properties of Benford’s law (scale-invariance, power-invariance and invariance under mixtures). We think that this might help, together with Section 1.3, to put our results in perspective.
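The quantities introduced above can be explored numerically. The following is a minimal sketch, not taken from the paper: it assumes base $b = 10$ and a toy model in which the $X_k$ are i.i.d. uniform on $[1, 3]$ (an arbitrary choice), and it computes $T_N^{(h)}$ along one simulated path together with the empirical first-digit frequencies of the products. The helper names `mantissa`, `weyl_mean` and `first_digit_frequencies` are illustrative.

```python
import cmath
import math
import random


def mantissa(x, b=10.0):
    """Mantissa M_b(x) in [1, b) of a positive real x."""
    frac = math.log(x, b) % 1.0          # fractional part of log_b x
    return b ** frac


def weyl_mean(log_steps, h):
    """T_N^(h) = (1/N) sum_n exp(2 i pi h (s_1 + ... + s_n)), with s_k = log_b X_k."""
    total, partial = 0j, 0.0
    for s in log_steps:
        partial = (partial + s) % 1.0    # e_h only sees the fractional part
        total += cmath.exp(2j * math.pi * h * partial)
    return total / len(log_steps)


def first_digit_frequencies(log_steps):
    """Empirical frequencies of the leading decimal digit of Y_n = X_1 ... X_n."""
    counts, partial = [0] * 9, 0.0
    for s in log_steps:
        partial = (partial + s) % 1.0
        digit = int(10.0 ** partial)     # mantissa in [1, 10) gives a digit 1..9
        counts[min(digit, 9) - 1] += 1
    return [c / len(log_steps) for c in counts]


rng = random.Random(0)
# Assumed toy model: X_k i.i.d. uniform on [1, 3], base 10.
steps = [math.log10(rng.uniform(1.0, 3.0)) for _ in range(50_000)]
freqs = first_digit_frequencies(steps)
benford = [math.log10(1 + 1 / d) for d in range(1, 10)]
t1 = abs(weyl_mean(steps, 1))
```

On such a path one would expect `t1` to be small and `freqs` to be close to `benford`, in line with the i.i.d. results recalled in Section 1.3; the sketch only illustrates the definitions and proves nothing.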


1.3. Known results on random product sequences

When the $X_n$ are independent and identically distributed,

• the sequence $\left(M_b\left(\prod_1^n X_k\right)\right)$ is almost surely distributed following $\mu_b$ if and only if, for every positive integer $h$, $E(\exp(2i\pi h \log_b X_1)) \ne 1$, that is to say if and only if the common law of the $X_n$ is not supported by any set $\{b^{z/h} : z \text{ integer}\}$ ($h$ positive integer);

• the law of $M_b\left(\prod_1^n X_k\right)$ converges weakly to $\mu_b$ if and only if, for every positive integer $h$, $|E(\exp(2i\pi h \log_b X_1))| \ne 1$, that is to say if and only if the common law of the $X_n$ is not supported by any set $\{b^{a + z/h} : z \text{ integer}\}$ ($a \in [0, 1[$, $h$ positive integer).

A proof of the first statement is available in [31]. The second statement is a direct consequence of Lemma 2.4 (see also [23]). Moreover

• the sequence $(M_b(X^n))$ is almost surely distributed following $\mu_b$ if and only if $P(X \in \{b^r : r \text{ rational}\}) = 0$;

• the law of $M_b(X^n)$ converges weakly to $\mu_b$ if and only if, for every positive integer $h$,

$$\lim_{n \to \infty} E(\exp(2i\pi n h \log_b X)) = 0.$$

The two above conditions are fulfilled when $X$ (and hence $\log_b X$) admits a density. The first statement derives from the fact that the sequence $(M_b(c^n))$ is distributed following $\mu_b$ if and only if $\log_b c$ is irrational (see [21, p. 8] and Section 2.2). Again the second statement is a direct consequence of Lemma 2.4.

It is worth noting that, in the situations discussed above, the law of $M_b\left(\prod_1^n X_k\right)$ converges weakly to $\mu_b$ in most cases for every value of $b$, while there does not exist any random variable $Z$ such that the law of $M_b(Z)$ is $\mu_b$ for every value of $b$ (see Appendix A.2.1). The following example shows the kind of difficulties that can arise when the $X_n$ are neither independent nor stationary.

Example 1.1. Consider an i.i.d. sequence of random variables $(Z_m)$ with common law $\mu_b$. Set $X_{2m-1} = Z_m$ and $X_{2m} = b/Z_m$ ($m \ge 1$). Then the random variables $X_n$ are identically distributed following $\mu_b$ because, for $1 \le a < b$,

$$P\left(\frac{b}{Z_m} < a\right) = P\left(Z_m > \frac{b}{a}\right) = 1 - \log_b \frac{b}{a} = \log_b a.$$
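The construction of Example 1.1 is easy to check numerically. The sketch below is an illustration under assumptions that are not in the paper (base $b = 10$, a fixed random seed, and sampling $Z_m$ as $b^U$ with $U$ uniform on $[0, 1[$). It checks that the even-indexed terms $X_{2m} = b/Z_m$ have an empirical distribution function close to $\log_b a$, and that $X_{2m-1} X_{2m} = b$ forces $\log_b Y_{2m} = m$.

```python
import math
import random

b = 10.0
rng = random.Random(1)

# Z_m i.i.d. with law mu_b: Z = b**U with U uniform on [0, 1).
z = [b ** rng.random() for _ in range(20_000)]

# X_{2m-1} = Z_m and X_{2m} = b / Z_m.
x = []
for zm in z:
    x.extend((zm, b / zm))


def empirical_cdf(values, a):
    """Fraction of the values that are strictly below a."""
    return sum(1 for v in values if v < a) / len(values)


even_terms = x[1::2]                       # the X_{2m}
cdf_gap = max(abs(empirical_cdf(even_terms, a) - math.log(a, b))
              for a in (1.5, 2.0, 3.0, 5.0, 8.0))

# log_b Y_{2m} = m up to rounding, since X_{2m-1} * X_{2m} = b.
log_y, even_logs = 0.0, []
for n, xn in enumerate(x, start=1):
    log_y += math.log(xn, b)
    if n % 2 == 0:
        even_logs.append(log_y)
max_drift = max(abs(v - m) for m, v in enumerate(even_logs, start=1))
```

In particular the even-indexed products $Y_{2m} = b^m$ all have mantissa 1, although every factor $X_n$ is Benford in base $b$.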


2. Preliminaries

We present here some useful notation and definitions and the relationship between the study of Benford’s law and Uniform Distribution Theory.

2.1. Other notation and definitions

We shall consistently use the following notation throughout this paper: whenever $(X_n)$ is a given sequence of positive random variables, we set

$$Y_n = \prod_{k=1}^{n} X_k \quad \text{and} \quad Y_{m,n} = \prod_{k=m}^{n} X_k \quad (1 \le m \le n).$$

Here is some other notation used in this article: the natural logarithm is denoted by $\log$; for any real $x$ and any integer $h$, we set $e_h(x) = \exp(2i\pi h x)$ where $i^2 = -1$; the symbol $\{x\}$ stands for the fractional part of a real $x$; we write $\mathbb{Z}^+$ for the set of positive integers; the standard abbreviations a.s., r.v. and i.i.d. stand respectively for almost surely (or almost sure), random variable and independent and identically distributed; all the r.v.’s in consideration are supposed to be defined on the same probability space $(\Omega, \mathcal{T}, P)$ and the law of a r.v. $Z$ is denoted $P_Z$.

Definition 2.1. A sequence $(U_n)$ of real numbers in $[1, b[$ is called Benford in base $b$ if it is distributed as $\mu_b$, that is to say if

$$\lim_{N \to +\infty} \frac{1}{N} \sum_{n=1}^{N} 1_{[1, a[}(U_n) = \log_b a \qquad (1 \le a < b).$$

A sequence $(u_n)$ of positive numbers is also called Benford in base $b$ if the sequence of mantissae $(M_b(u_n))$ is Benford in base $b$.

For instance, the sequences $(n!)$, $(n^n)$ and $(c^n)$ (with $\log_b c$ irrational) are Benford in base $b$ [30]. The sequences $(n)$ and $(\log n)$ and the sequence of prime numbers are not [13]. See [1,25] for more examples of Benford sequences.

Definition 2.2. We say that a positive random variable $Z$ is Benford in base $b$ when $P_{M_b(Z)} = \mu_b$, that a sequence of positive random variables $(Z_n)$ is a.s. Benford in base $b$ when

$$P(\{\omega : (Z_n(\omega)) \text{ is a Benford sequence in base } b\}) = 1,$$

and that $Z_n$ tends to be Benford in base $b$ when the sequence $(P_{M_b(Z_n)})$ converges weakly to $\mu_b$.

These notions are connected (see Section 1.3, Remark 2.1 and Theorem 3.3 and its corollaries) but are nevertheless significantly different from one another. Indeed, suppose that $Z_n = n!$ a.s. for $n \ge 1$. Then the sequence $(P_{M_b(Z_n)})$ does not converge weakly while the sequence $(Z_n)$ is a.s. Benford in base $b$. Conversely, suppose that $Z_n = T$ ($n \ge 1$) where $T$ is Benford in base $b$. Then $Z_n$ tends to be Benford in base $b$, but the sequence $(Z_n)$ is a.s. not Benford.

2.2. Benford law and uniform distribution modulo 1

It is well known and easy to verify that a sequence $(u_n)$ of positive numbers is Benford in base $b$ if and only if the sequence of fractional parts $(\{\log_b u_n\})$ is uniformly distributed in $[0, 1[$, that is to say

$$\lim_{N \to +\infty} \frac{1}{N} \sum_{n=1}^{N} 1_{[0, c[}(\{\log_b u_n\}) = c \qquad (0 \le c < 1),$$

that a positive random variable $Z$ is Benford in base $b$ if and only if $P_{\{\log_b Z\}}$ is the uniform probability on $[0, 1[$ and that a sequence $(Z_n)$ of positive random variables tends to be Benford in base $b$ if and only if the sequence $(P_{\{\log_b Z_n\}})$ converges weakly to the uniform distribution in $[0, 1[$. Combining this with the celebrated Weyl’s Criterion [21, p. 7] yields the following lemma.

Lemma 2.3. Consider a sequence $(u_n)$ of positive numbers and a sequence $(Z_n)$ of positive r.v.’s. Then $(u_n)$ is Benford in base $b$ if and only if

$$\forall h \in \mathbb{Z}^+, \quad \lim_{N \to +\infty} \frac{1}{N} \sum_{n=1}^{N} e_h(\log_b u_n) = 0,$$

and $(Z_n)$ is a.s. Benford in base $b$ if and only if

$$\forall h \in \mathbb{Z}^+, \quad \lim_{N \to +\infty} \frac{1}{N} \sum_{n=1}^{N} e_h(\log_b Z_n) = 0 \quad \text{a.s.}$$
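The criterion of Lemma 2.3 can be tried on two of the deterministic sequences mentioned after Definition 2.1. The sketch below assumes base 10 and a finite horizon $N$, so it only suggests the limiting behaviour: the Weyl means of $(2^n)$ should be tiny because $\log_{10} 2$ is irrational, while the mean of $(n)$ for $h = 1$ stays away from 0.

```python
import cmath
import math


def weyl_mean(logs, h):
    """(1/N) * sum_n e_h(log_b u_n) for a list of values log_b u_n."""
    return sum(cmath.exp(2j * math.pi * h * t) for t in logs) / len(logs)


N = 100_000
log_powers_of_two = [n * math.log10(2.0) for n in range(1, N + 1)]   # u_n = 2^n
log_integers = [math.log10(n) for n in range(1, N + 1)]              # u_n = n

benford_case = [abs(weyl_mean(log_powers_of_two, h)) for h in (1, 2, 3)]
non_benford_case = abs(weyl_mean(log_integers, 1))
```

Only three values of $h$ are inspected here, whereas the lemma requires every positive integer $h$.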

Lévy’s Theorem states that the weak convergence of a sequence $(\mu_n)$ of probability measures to a probability measure $\mu$ is equivalent to the pointwise convergence of the characteristic function of $\mu_n$ to that of $\mu$. On the torus $\mathbb{R}/\mathbb{Z}$, the convergence of the Fourier coefficients suffices [6, p. 363]. Since for every $x > 0$ and every $h \in \mathbb{Z}^+$, $e_h(\{\log_b x\}) = e_h(\log_b x)$, we get the following characterizations.

Lemma 2.4. Consider a positive r.v. $Z$ and a sequence $(Z_n)$ of positive r.v.’s. Then $Z$ is Benford in base $b$ if and only if

$$\forall h \in \mathbb{Z}^+, \quad E(e_h(\log_b Z)) = 0,$$

and $Z_n$ tends to be Benford in base $b$ if and only if

$$\forall h \in \mathbb{Z}^+, \quad \lim_{n \to +\infty} E[e_h(\log_b Z_n)] = 0.$$
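As an illustration of Lemma 2.4, the Fourier coefficients are explicit when $\log Z$ is Gaussian. The sketch below uses only the classical characteristic function of a Gaussian variable; the function name and the parameter values are illustrative assumptions.

```python
import cmath
import math


def fourier_coefficient_lognormal(h, mean, std, b=10.0):
    """E e_h(log_b Z) when log Z (natural logarithm) is Gaussian N(mean, std**2)."""
    mu = mean / math.log(b)          # mean of log_b Z
    sigma = std / math.log(b)        # standard deviation of log_b Z
    # Characteristic function of N(mu, sigma^2) evaluated at 2 * pi * h.
    return cmath.exp(2j * math.pi * h * mu - 2.0 * (math.pi * h * sigma) ** 2)


moduli = [abs(fourier_coefficient_lognormal(1, 0.3, s)) for s in (0.5, 1.0, 2.0, 4.0)]
```

The moduli never vanish, so a log-normal variable is not exactly Benford, but they decay very fast as the standard deviation grows.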

We are now able to treat the remark mentioned at the beginning of Section 1.

Remark 2.1. Let $Z_1, Z_2, \dots$ be identically distributed positive random variables (independent or not, stationary or not). Of course $Z_n$ tends to be Benford in base $b$ if and only if each $Z_n$ is Benford in base $b$. More interesting is the fact that, if $(Z_n)$ is a.s. Benford in base $b$, then each $Z_n$ must be Benford in base $b$. Indeed, for every $h \in \mathbb{Z}^+$ and $N \ge 1$,

$$E(e_h(\log_b Z_1)) = E\left(\frac{1}{N} \sum_{n=1}^{N} e_h(\log_b Z_n)\right).$$

So, by Lebesgue’s Dominated Convergence Theorem, $E(e_h(\log_b Z_1)) = 0$ when $(1/N) \sum_{n=1}^{N} e_h(\log_b Z_n)$ converges a.s. to 0. This situation is unlikely to occur since very few r.v.’s are Benford (see the Appendix).

3. General conditions

We present in this section the two main results of our paper: Theorems 3.3 and 3.9. Theorem 3.3 gives a necessary and sufficient condition ensuring that $(Y_n)$ is a.s. Benford under the assumption that the sequence $(X_n)$ is stationary. Theorem 3.9 gives a sufficient condition without constraints on the $X_n$.


3.1. The first main result

The proof of Theorem 3.3 will use two lemmas. The first one below is a simple application of Riesz’s summation methods. It is also connected with the Nörlund summation methods.

Lemma 3.1. Let $(a_l)$ be a sequence of complex numbers. Then

$$\left(\lim_{L \to +\infty} \frac{1}{L} \sum_{l=1}^{L} a_l = 0\right) \Rightarrow \left(\lim_{L \to +\infty} \frac{1}{L} \sum_{l=1}^{L} \frac{L-l}{L}\, a_l = 0\right).$$

Proof. For every $L \ge 1$,

$$\frac{1}{L} \sum_{l=1}^{L} \frac{L-l}{L}\, a_l = \frac{1}{L} \sum_{l=1}^{L} a_l - \frac{1}{L^2} \sum_{l=1}^{L} l\, a_l.$$

But the sequences $\left(\frac{1}{L} \sum_{l=1}^{L} a_l\right)$ and $\left(\frac{2}{L^2} \sum_{l=1}^{L} l\, a_l\right)$ are simultaneously convergent and have the same limit when they converge [21, p. 63]. This concludes the proof. □
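The algebraic identity used in the proof of Lemma 3.1 can be checked on an example. The sketch below takes $a_l = e^{0.7\, i\, l}$, an arbitrary choice whose Cesàro means tend to 0, and compares both sides; all names are illustrative.

```python
import cmath


def cesaro(a):
    """(1/L) * sum_{l=1}^{L} a_l."""
    return sum(a) / len(a)


def weighted(a):
    """(1/L) * sum_{l=1}^{L} ((L - l)/L) * a_l."""
    L = len(a)
    return sum((L - l) / L * al for l, al in enumerate(a, start=1)) / L


def first_moment(a):
    """(1/L^2) * sum_{l=1}^{L} l * a_l."""
    L = len(a)
    return sum(l * al for l, al in enumerate(a, start=1)) / L ** 2


a = [cmath.exp(1j * 0.7 * l) for l in range(1, 5001)]
identity_gap = abs(weighted(a) - (cesaro(a) - first_moment(a)))
```

For this sequence both `cesaro(a)` and `weighted(a)` are already small at $L = 5000$, as the lemma predicts in the limit.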

The first statement in Lemma 3.2 is known as the van der Corput Fundamental Inequality [21, p. 25]. The second statement is a direct consequence of the first one.

Lemma 3.2. Let $z_n$ be a complex number for $1 \le n \le N$. Then, for $1 \le L \le N$,

$$\left|\sum_{n=1}^{N} z_n\right|^2 \le \frac{N+L-1}{L} \sum_{n=1}^{N} |z_n|^2 + 2\, \frac{N+L-1}{L^2}\, \mathrm{Re}\left(\sum_{l=1}^{L-1} (L-l) \sum_{j=1}^{N-l} z_{j+l}\, \overline{z_j}\right).$$

In particular, if $x_1, \dots, x_N$ are real numbers,

$$\left|\frac{1}{N} \sum_{n=1}^{N} e^{i(x_1 + \dots + x_n)}\right|^2 \le \frac{2}{L} + 2\, \frac{N+L-1}{L^2 N^2}\, \mathrm{Re}\left(\sum_{l=1}^{L-1} (L-l) \sum_{j=1}^{N-l} e^{i(x_{j+1} + \dots + x_{j+l})}\right)$$

for $1 \le L \le N$.

We are now prepared to prove our first main result.

Theorem 3.3. Suppose that $(X_n)$ is stationary. Then the sequence $(Y_n)$ is a.s. Benford in base $b$ if and only if

$$\forall h \in \mathbb{Z}^+, \quad \lim_{L \to +\infty} \frac{1}{L} \sum_{l=1}^{L} E(e_h(\log_b Y_l)) = 0.$$

Proof. The direct part derives from the Dominated Convergence Theorem. It remains to prove the converse part. Consider some $h \in \mathbb{Z}^+$. By Lemma 3.2, for $1 \le L \le N$,

$$\left|\frac{1}{N} \sum_{n=1}^{N} e_h(\log_b Y_n)\right|^2 \le \frac{2}{L} + 2\, \frac{N+L-1}{LN}\, \mathrm{Re}\left(\sum_{l=1}^{L-1} \frac{L-l}{LN} \sum_{j=1}^{N-l} e_h(\log_b Y_{j+1,\, j+l})\right). \tag{1}$$


Fix $L \ge 1$ and $l \le L$, set $Z_j = e_h(\log_b Y_{j+1,\, j+l})$ ($j \ge 0$) and suppose that $(X_n)$ is stationary. Then $(Z_j)$ is stationary and by Birkhoff’s Theorem

$$\lim_{N \to +\infty} \frac{1}{N} \sum_{j=1}^{N-l} Z_j = \lim_{N \to +\infty} \frac{1}{N-l} \sum_{j=1}^{N-l} Z_j = E^{\mathcal{B}_l}(Z_0) = E^{\mathcal{B}_l}(e_h(\log_b Y_l)) \quad \text{a.s.}$$

where $\mathcal{B}_l$ stands for the σ-algebra of invariant sets. Hence (1) yields for every $L \ge 1$

$$\limsup_{N \to +\infty} \left|\frac{1}{N} \sum_{n=1}^{N} e_h(\log_b Y_n)\right|^2 \le \frac{2}{L} + \frac{2}{L}\, \mathrm{Re}\left(\sum_{l=1}^{L} \frac{L-l}{L}\, E^{\mathcal{B}_l}(e_h(\log_b Y_l))\right) \quad \text{a.s.}$$

But $E\left(E^{\mathcal{B}_l}(e_h(\log_b Y_l))\right) = E(e_h(\log_b Y_l))$. So, for every $L \ge 1$,

$$E\left(\limsup_{N \to +\infty} \left|\frac{1}{N} \sum_{n=1}^{N} e_h(\log_b Y_n)\right|^2\right) \le \frac{2}{L} + \left|\frac{2}{L} \sum_{l=1}^{L} \frac{L-l}{L}\, E(e_h(\log_b Y_l))\right|. \tag{2}$$

Suppose now that

$$\lim_{L \to +\infty} \frac{1}{L} \sum_{l=1}^{L} E(e_h(\log_b Y_l)) = 0.$$

Letting $L$ tend to $+\infty$ in (2) and applying Lemma 3.1 with $a_l = E(e_h(\log_b Y_l))$ give

$$E\left(\limsup_{N \to +\infty} \left|\frac{1}{N} \sum_{n=1}^{N} e_h(\log_b Y_n)\right|^2\right) = 0.$$

This proves that

$$\limsup_{N \to +\infty} \left|\frac{1}{N} \sum_{n=1}^{N} e_h(\log_b Y_n)\right|^2 = 0 \quad \text{a.s.}$$

So, according to Lemma 2.3, our proof is completed. □

Strangely, the above necessary and sufficient condition appears in a seemingly completely different context in [18]. The two following corollaries are direct consequences of Theorem 3.3 and Lemma 2.4.

Corollary 3.4. Suppose that $(X_n)$ is stationary and that $Y_n$ tends to be Benford in base $b$. Then $(Y_n)$ is a.s. Benford in base $b$.

Corollary 3.5. Suppose that $(X_n)$ is stationary, that the sequence $(Y_n)$ is a.s. Benford in base $b$ and that $(P_{M(Y_n)})$ converges weakly to some probability measure $\nu$. Then $\nu$ equals $\mu_b$.

Remark 3.1. Corollary 3.4 cannot be extended to independent non-stationary r.v.’s $X_n$. Indeed consider a Benford r.v. $X_1$ and set $X_n = 1$ a.s. for $n \ge 2$. Then $Y_n = X_1$ is a Benford r.v., but $(Y_n)$ is not a.s. Benford since it is a.s. constant. Moreover, it is easy to find some stationary sequences $(X_n)$ such that $(Y_n)$ is a.s. Benford and $Y_n$ does not tend to be Benford. For example, we can consider i.i.d. r.v.’s $X_n$ such that $P_{X_1} = (1/2)\delta_{b^x} + (1/2)\delta_{b^{x+1/2}}$ where $x \in\, ]0, 1[$ is irrational. Then $E(e_h(\log_b Y_n)) = e^{2i\pi h n x}$ when $h$ is even and $E(e_h(\log_b Y_n)) = 0$ when $h$ is odd. So $(Y_n)$ is a.s. Benford in base $b$ by Theorem 3.3 and $Y_n$ does not tend to be Benford in base $b$ since $|E(e_h(\log_b Y_n))| = 1$ when $h$ is even.
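The second example of Remark 3.1 lends itself to a quick experiment. The sketch below assumes $x = \sqrt{2} - 1$ as the irrational number and a fixed random seed (both arbitrary choices); it computes the Weyl means along one simulated path and the exact Fourier coefficients $E(e_h(\log_b Y_n))$ for $n = 50$.

```python
import cmath
import math
import random

x = math.sqrt(2.0) - 1.0            # an irrational number in ]0, 1[ (assumed choice)
rng = random.Random(2)
N = 20_000

# log_b X_k equals x or x + 1/2, each with probability 1/2.
steps = [x + 0.5 * rng.randrange(2) for _ in range(N)]


def weyl_mean(log_steps, h):
    """(1/N) * sum_n e_h(log_b Y_n) along one path."""
    total, partial = 0j, 0.0
    for s in log_steps:
        partial = (partial + s) % 1.0
        total += cmath.exp(2j * math.pi * h * partial)
    return total / len(log_steps)


def expected_coefficient(h, n):
    """E e_h(log_b Y_n) = (E e_h(log_b X_1))**n for i.i.d. factors."""
    phi = 0.5 * (cmath.exp(2j * math.pi * h * x)
                 + cmath.exp(2j * math.pi * h * (x + 0.5)))
    return phi ** n


path_means = [abs(weyl_mean(steps, h)) for h in (1, 2, 3, 4)]
even_modulus = abs(expected_coefficient(2, 50))
odd_modulus = abs(expected_coefficient(1, 50))
```

The path averages are small for each $h$ tried, while the coefficient for even $h$ keeps modulus 1, which is the gap between being a.s. Benford and tending to be Benford.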


3.2. The second main result

Theorem 3.9 is a direct consequence of Proposition 3.8 which is a slight generalization of a (surprisingly little known) result of Davenport, Erdős and LeVeque [8]. Proposition 3.8 gives a general condition (involving the $L^p$-norm) ensuring that the arithmetic mean of bounded random variables converges almost surely to 0. It will be used in Section 4.2.2. The result in [8] would have been enough to derive the results featuring in Section 4.2.2, but we think that our somewhat more general version may be of interest. We will use the two following lemmas.

Lemma 3.6. Let $(u_N)$ and $(v_N)$ be two sequences of positive numbers and suppose that $v_N > 1$ ($N \ge 1$) and that $(v_N)$ is non-decreasing. Consider an increasing sequence $(M_m)$ of positive integers such that

$$M_{m+1} \ge \frac{v_{M_m}}{v_{M_m} - 1}\, M_m \tag{3}$$

and denote by $U_m$ the arithmetic mean of the numbers $u_{M_m}, u_{M_m+1}, \dots, u_{M_{m+1}-1}$. Then

$$\left(\sum_{N=1}^{\infty} \frac{u_N v_N}{N} < \infty\right) \Rightarrow \left(\sum_{m=1}^{\infty} U_m < \infty\right).$$

Proof. Fix $m \ge 1$. Then, under the above conditions,

$$U_m \le \frac{M_{m+1}}{M_{m+1} - M_m} \sum_{N=M_m}^{M_{m+1}-1} \frac{u_N}{N} \le v_{M_m} \sum_{N=M_m}^{M_{m+1}-1} \frac{u_N}{N} \le \sum_{N=M_m}^{M_{m+1}-1} \frac{u_N v_N}{N}.$$

This concludes the proof since the numbers $\frac{u_N v_N}{N}$ are positive. □
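The chain of inequalities in the proof of Lemma 3.6 can be checked block by block. The sketch below is only an illustration: the sequences $v_N = 2 + \log N$ and $u_N = N^{-1/2}$ are arbitrary assumptions, and the blocks are built with $M_1 = 1$ and $M_{m+1}$ the smallest integer satisfying (3).

```python
import math


def v(n):
    """Assumed choice of a non-decreasing sequence with v_N > 1."""
    return 2.0 + math.log(n)


def u(n):
    """Assumed choice of a positive sequence u_N."""
    return 1.0 / math.sqrt(n)


def block_boundaries(count):
    """M_1 = 1 and M_{m+1} = ceil(v(M_m) / (v(M_m) - 1) * M_m), which meets (3)."""
    bounds = [1]
    while len(bounds) < count:
        m = bounds[-1]
        bounds.append(math.ceil(v(m) / (v(m) - 1.0) * m))
    return bounds


bounds = block_boundaries(40)
checks = []
for lo, hi in zip(bounds, bounds[1:]):
    block = range(lo, hi)                                   # N = M_m, ..., M_{m+1} - 1
    U_m = sum(u(n) for n in block) / (hi - lo)              # arithmetic mean of the u_N
    middle = v(lo) * sum(u(n) / n for n in block)           # v_{M_m} * sum u_N / N
    upper = sum(u(n) * v(n) / n for n in block)             # sum u_N v_N / N
    checks.append(U_m <= middle + 1e-12 <= upper + 2e-12)   # small slack for rounding
```

Every entry of `checks` should be true, whatever positive $u_N$ is chosen, since the inequalities only use (3) and the monotonicity of $(v_N)$.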

Lemma 3.7. Let $(a_n)$ be a sequence of complex numbers satisfying $|a_n| \le 1$ ($n \ge 1$). Set $b_N = (1/N) \sum_{n=1}^{N} a_n$ and consider an increasing sequence $(M_m)$ of positive integers such that $\lim_m (M_{m+1}/M_m) = 1$. Denote by $c_m$ the arithmetic mean of the numbers $b_{M_m}, b_{M_m+1}, \dots, b_{M_{m+1}-1}$. Then

$$\lim_{m \to +\infty}\ \max_{M_m \le N < M_{m+1}} |b_N - c_m| = 0.$$

In particular the sequences $(b_N)$ and $(c_m)$ are simultaneously convergent and have the same limit when they converge.

Proof. Fix $N_0$ such that $M_m \le N_0 < M_{m+1}$. Then

$$|b_{N_0} - c_m| \le \frac{1}{M_{m+1} - M_m} \sum_{N=M_m}^{M_{m+1}-1} |b_{N_0} - b_N|.$$

But, for any $N$ such that $M_m \le N < M_{m+1}$,

$$|b_{N_0} - b_N| = \frac{1}{N_0} \left|\sum_{n=1}^{N_0} a_n - \frac{N_0}{N} \sum_{n=1}^{N} a_n\right| \le \frac{1}{N_0} \left(\left|\sum_{n=1}^{N_0} a_n - \sum_{n=1}^{N} a_n\right| + \left|\sum_{n=1}^{N} a_n - \frac{N_0}{N} \sum_{n=1}^{N} a_n\right|\right) \le \frac{2 |N_0 - N|}{N_0}$$

since $0 \le |a_n| \le 1$. Hence

$$|b_{N_0} - c_m| \le \frac{2}{N_0}\, \frac{1}{M_{m+1} - M_m} \sum_{N=M_m}^{M_{m+1}-1} |N_0 - N| \le 2\, \frac{M_{m+1} - M_m}{N_0}.$$

This concludes the proof because $\lim_m (M_{m+1}/M_m) = 1$. □

Proposition 3.8. Let $(Z_n)$ be a sequence of complex valued r.v.’s such that $|Z_n| \le 1$ ($n \ge 1$) and, for $N \ge 1$, set $T_N = (1/N) \sum_{n=1}^{N} Z_n$. Then

$$\left(\sum_{N=1}^{\infty} \frac{E|T_N|^p}{N} < +\infty \text{ for some } p \ge 1\right) \Rightarrow \left(\lim_{N \to +\infty} T_N = 0 \text{ a.s.}\right).$$

Proof. Fix $p \ge 1$, set $u_N = E|T_N|^p$ and suppose that $\sum_{N=1}^{\infty} \frac{u_N}{N} < \infty$. Consider a non-decreasing sequence $(v_N)$ of real numbers such that $v_N > 1$ ($N \ge 1$), $\lim_N v_N = \infty$ and $\sum_{N=1}^{\infty} \frac{u_N v_N}{N} < \infty$ (as mentioned in [8], if $a_N > 0$ and $\sum_N a_N < \infty$, then $\sum_N a_N v_N < \infty$ and $\lim_N v_N = \infty$ when $a_N v_N = \phi(r_N) - \phi(r_{N+1})$, $r_N = \sum_{k \ge N} a_k$, and, for example, $\phi(x) = x^{\alpha}$ with $0 < \alpha < 1$). Set $M_1 = 1$ and for $m \ge 1$ define $M_{m+1}$ as the lowest integer greater than or equal to $\frac{v_{M_m}}{v_{M_m} - 1} M_m$. So $(M_m)$ satisfies the hypotheses of Lemmas 3.6 and 3.7. Define the numbers $U_m$ as in Lemma 3.6 and set

$$V_m = \frac{1}{M_{m+1} - M_m} \sum_{N=M_m}^{M_{m+1}-1} |T_N|^p \quad \text{and} \quad W_m = \frac{1}{M_{m+1} - M_m} \sum_{N=M_m}^{M_{m+1}-1} |T_N|.$$

By Lemma 3.6, the series $\sum_m U_m$ converges. But here $U_m = E(V_m)$ and the r.v.’s $V_m$ are nonnegative. So the convergence of $\sum_m U_m$ and Beppo Levi’s Monotone Convergence Theorem imply the a.s. convergence of $\sum_m V_m$ and this proves that $(V_m)$ converges a.s. to 0. Hence, by Jensen’s Inequality ($W_m^p \le V_m$), $(W_m)$ converges a.s. to 0 too. Applying Lemma 3.7 with $c_m = W_m(\omega)$ and $b_N = |T_N(\omega)|$ yields the a.s. convergence to 0 of $(|T_N|)$. The proof is completed. □

Taking $Z_n = e_h(\log_b Y_n) = e_h\left(\sum_{k=1}^{n} \log_b X_k\right)$ and combining Section 2 with Proposition 3.8 yields the following theorem.

Theorem 3.9. Set $T_N^{(h)} = (1/N) \sum_{n=1}^{N} e_h\left(\sum_{k=1}^{n} \log_b X_k\right)$ and suppose that for all $h \ne 0$ there exists some $p \ge 1$ such that

$$\sum_{N=1}^{\infty} \frac{E\left|T_N^{(h)}\right|^p}{N} < +\infty.$$

Then $(Y_n)$ is a.s. Benford in base $b$.

4. Applications of Theorems 3.3 and 3.9

This section is devoted to some applications of Theorems 3.3 and 3.9.

4.1. When $(X_n)$ is stationary

In this section, we investigate the cases of stationary log-normal r.v.’s, exchangeable r.v.’s and stationary 1-dependent r.v.’s.


4.1.1. The case of stationary log-normal r.v.’s

Suppose that $(\log_b X_n)$ (or equivalently $(\log X_n)$) is a Gaussian sequence. Then, according to Lemma 2.4 and to the shape of the characteristic functions of Gaussian r.v.’s, $Y_n$ tends to be Benford if and only if

$$\lim_{n \to +\infty} \sum_{1 \le k,\, l \le n} \mathrm{Cov}(\log X_k, \log X_l) = +\infty.$$

This is very often the case, for example when $\log X_n = W_{t_n}$, where $(W_t)_t$ is a Brownian motion or a Brownian bridge and $(t_n)$ is any sequence of indexes. If we suppose in addition that $(\log X_n)$ is stationary, then the above necessary and sufficient condition becomes

$$\lim_{n \to +\infty} \left(n\gamma(1) + 2(n-1)\gamma(2) + \dots + 2\gamma(n)\right) = +\infty \tag{4}$$

where $\gamma(k) = \mathrm{Cov}(\log X_1, \log X_k)$ ($k = 1, \dots, n$). Therefore Theorem 3.3 proves the following.

Proposition 4.1. If $(\log X_n)$ is a stationary Gaussian sequence and satisfies (4), then $Y_n$ tends to be Benford and $(Y_n)$ is a.s. Benford.

The sequence $(X_n)$ satisfies Condition (4) especially when $\gamma(n) \ge 0$ for $n \ge 2$ (for instance when $\log X_n = O_{t_n}$ and $(O_t)_t$ is an Ornstein–Uhlenbeck process) and also when the numbers $\gamma(k)$ are summable [10, p. 215].

4.1.2. The case of exchangeable r.v.’s

In this section, we set

$$X_n = g(U, Z_n) \tag{5}$$

where the r.v.’s $Z_n$ are i.i.d., the r.v. $U$ is independent of the sequence $(Z_n)$ and $g$ is any positive measurable function.

Proposition 4.2. If the r.v.’s $X_n$ satisfy (5) and if for every positive integer $h$

$$P_U\left(\left\{u : \left|E(e_h(\log_b g(u, Z_1)))\right| < 1\right\}\right) = 1, \tag{6}$$

then $Y_n$ tends to be Benford in base $b$ and $(Y_n)$ is a.s. Benford in base $b$.

Proof. By Theorem 3.3 and Lemma 2.4 we only need to prove that $\lim_n E(e_h(\log_b Y_n)) = 0$ for all $h \ne 0$. Fix $h \ne 0$ and $n \ge 1$ and suppose that $X_1, \dots, X_n$ satisfy (5). According to the conditions on $U$ and $(Z_n)$,

$$E(e_h(\log_b Y_n)) = E[e_h(\log_b g(U, Z_1)) \times \dots \times e_h(\log_b g(U, Z_n))]$$
$$= \int_{\mathbb{R}} E[e_h(\log_b g(U, Z_1)) \times \dots \times e_h(\log_b g(U, Z_n)) \mid U = u]\, dP_U(u)$$
$$= \int_{\mathbb{R}} E[e_h(\log_b g(u, Z_1))]^n\, dP_U(u).$$

Therefore

$$|E(e_h(\log_b Y_n))| \le \int_{\mathbb{R}} |E[e_h(\log_b g(u, Z_1))]|^n\, dP_U(u).$$

But $|E[e_h(\log_b g(\cdot, Z_1))]|^n$ converges $P_U$-a.s. to 0, as $n \to +\infty$, when (6) holds. This and the Dominated Convergence Theorem conclude the proof. □
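The formula $E(e_h(\log_b Y_n)) = \int E[e_h(\log_b g(u, Z_1))]^n\, dP_U(u)$ from the proof above can be evaluated exactly in a toy case. The sketch below assumes, purely for illustration, $g(u, z) = b^{uz}$, $Z_1$ uniform on $[0, 1[$ and $U$ uniform on a three-point set, so that the integral reduces to a finite sum; none of these choices comes from the paper.

```python
import cmath
import math

# Assumed toy model: g(u, z) = b**(u * z), Z_1 uniform on [0, 1), and
# U uniform on a finite set, so that the integral over P_U is a finite sum.
U_VALUES = (0.5, 1.5, 2.5)


def phi(h, u):
    """E e_h(log_b g(u, Z_1)) = E exp(2 i pi h u Z_1) for Z_1 uniform on [0, 1)."""
    w = 2j * math.pi * h * u
    return (cmath.exp(w) - 1.0) / w


def coefficient(h, n):
    """E e_h(log_b Y_n), the average of phi(h, u)**n over the values of U."""
    return sum(phi(h, u) ** n for u in U_VALUES) / len(U_VALUES)


# Condition (6), checked here only for h = 1, ..., 5.
condition_6 = all(abs(phi(h, u)) < 1.0 for h in range(1, 6) for u in U_VALUES)
decay = [abs(coefficient(1, n)) for n in (1, 5, 20, 80)]
```

In this model every $|\phi(h, u)|$ is below 1, so the coefficients decay geometrically in $n$, as the dominated convergence argument predicts.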


Remark 4.1. Condition (6) in Proposition 4.2 is fulfilled in particular when $g(u, Z_1)$ admits a density for $P_U$-almost-all $u$. For example when $Z_1$ admits a density, $g$ is of class $C^1$ and, for $P_U$-almost-all $u$, the set of zeros of $\frac{\partial g}{\partial z}(u, \cdot)$ is finite.

4.1.3. The case of stationary 1-dependent r.v.’s

For the sake of brevity, we only deal with the case where the r.v.’s $X_n$ are 1-dependent. However, our results can be extended to general m-dependence. In this section we suppose that

$$X_n = g(Z_n, Z_{n+1}) \tag{7}$$

where the r.v.’s $Z_n$ are i.i.d. and $g$ is any positive measurable function.

Proposition 4.3. If the r.v.’s $X_n$ satisfy (7) and if for every positive integer $h$

$$P_{(Z_1, Z_3)}\left(\left\{(z_1, z_3) : \left|E[e_h(\log_b(g(z_1, Z_2)\, g(Z_2, z_3)))]\right| < 1\right\}\right) > 0, \tag{8}$$

then $Y_n$ tends to be Benford in base $b$ and $(Y_n)$ is a.s. Benford in base $b$.

Proof. Again we only need to prove that $\lim_n E(e_h(\log_b Y_n)) = 0$ for all $h \ne 0$. Fix $h \ne 0$ and $n \ge 1$ and suppose that $X_1, \dots, X_n$ satisfy (7). To begin with, assume that $n = 4m$ where $m$ is some positive integer. Set $V := (Z_1, Z_3, \dots, Z_{4m+1})$, $v := (z_1, z_3, \dots, z_{4m+1})$, $\varphi := e_h(\log_b g)$ and

$$\psi(x, z, y) := e_h(\log_b(g(x, z)\, g(z, y))) = \varphi(x, z)\, \varphi(z, y) = e_h(\log_b G(x, y, z)),$$

where $G(x, y, z) := g(x, z)\, g(z, y)$. Since the $Z_n$ are i.i.d.,

$$E(e_h(\log_b Y_n)) = \int_{\mathbb{R}^{2m+1}} E[\varphi(Z_1, Z_2) \times \varphi(Z_2, Z_3) \times \dots \times \varphi(Z_{4m}, Z_{4m+1}) \mid V = v]\, dP_V(v)$$
$$= \int_{\mathbb{R}^{2m+1}} E[\psi(z_1, Z_2, z_3) \times \dots \times \psi(z_{4m-1}, Z_{4m}, z_{4m+1})]\, dP_V(v)$$
$$= \int_{\mathbb{R}^{2m+1}} E[\psi(z_1, Z_2, z_3)] \times \dots \times E[\psi(z_{4m-1}, Z_{4m}, z_{4m+1})]\, dP_V(v).$$

The expectations appearing in the last integral are bounded in modulus by 1 and the $Z_n$ are i.i.d. Hence Fubini’s Theorem leads to

$$|E(e_h(\log_b Y_n))| \le \int_{\mathbb{R}^{2m+1}} |E[\psi(z_1, Z_2, z_3)]| \times |E[\psi(z_5, Z_6, z_7)]| \times \dots \times |E[\psi(z_{4m-3}, Z_{4m-2}, z_{4m-1})]|\, dP_{(Z_1, Z_3, \dots, Z_{4m+1})}(z_1, z_3, \dots, z_{4m+1})$$
$$\le \left(\int_{\mathbb{R}^2} |E[\psi(z_1, Z_2, z_3)]|\, dP_{(Z_1, Z_3)}(z_1, z_3)\right)^m.$$

Suitably modified, the above calculations still yield

$$|E(e_h(\log_b Y_n))| \le \left(\int_{\mathbb{R}^2} |E[\psi(z_1, Z_2, z_3)]|\, dP_{(Z_1, Z_3)}(z_1, z_3)\right)^m$$

when $n = 4m + k$ with $k \in \{1, 2, 3\}$. To complete our proof we now demonstrate that

$$\int_{\mathbb{R}^2} |E[\psi(z_1, Z_2, z_3)]|\, dP_{(Z_1, Z_3)}(z_1, z_3) < 1$$


when (8) is fulfilled. Set

$$A = \{(z_1, z_3) : |E[\psi(z_1, Z_2, z_3)]| < 1\} \quad \text{and} \quad A_a = \{(z_1, z_3) : |E[\psi(z_1, Z_2, z_3)]| \le a\}.$$

The sequence $(A_{1 - 1/n})$ is non-decreasing and $A = \bigcup_n A_{1 - 1/n}$. So

$$P_{(Z_1, Z_3)}(A) = \lim_{n \to +\infty} P_{(Z_1, Z_3)}(A_{1 - 1/n}).$$

If $P_{(Z_1, Z_3)}(A) > 0$, there exists therefore $a < 1$ such that $A_a$ satisfies $P_{(Z_1, Z_3)}(A_a) > 0$. Hence

$$\int_{\mathbb{R}^2} |E[\psi(z_1, Z_2, z_3)]|\, dP_{(Z_1, Z_3)}(z_1, z_3) = \int_{\mathbb{R}^2 \setminus A_a} |E[\psi(z_1, Z_2, z_3)]|\, dP_{(Z_1, Z_3)}(z_1, z_3) + \int_{A_a} |E[\psi(z_1, Z_2, z_3)]|\, dP_{(Z_1, Z_3)}(z_1, z_3)$$
$$\le P_{(Z_1, Z_3)}(\mathbb{R}^2 \setminus A_a) + a\, P_{(Z_1, Z_3)}(A_a)$$

is strictly less than 1. □

Remark 4.2. The r.v.’s $X_n$ satisfy Condition (8) in Proposition 4.3 in particular when $G(x, y, Z_2) := g(x, Z_2)\, g(Z_2, y)$ admits a density for $(x, y)$ in a set of positive $P_{(Z_1, Z_3)}$-measure. For example when $Z_2$ admits a density, $g$ is of class $C^1$ and, for $(x, y)$ in a set of positive $P_{(Z_1, Z_3)}$-measure, the set of zeros of $\frac{\partial G}{\partial z}(x, y, \cdot)$ is finite.

4.2. When the $X_n$ are independent

All the results of this section rely on the following proposition which is a direct consequence of Lemma 2.4 (see also [27]).

Proposition 4.4. If the $X_n$ are independent, then $Y_n$ tends to be Benford in base $b$ if and only if

$$\forall h \in \mathbb{Z}^+, \quad \lim_{n \to +\infty} \prod_{k=1}^{n} \left|E(e_h(\log_b X_k))\right| = 0.$$

Moreover, if $X_{n_0}$ is a Benford r.v. for some $n_0 \ge 1$, then $Y_n$ is a Benford r.v. for all $n \ge n_0$.

4.2.1. A general criterion ensuring that $Y_n$ tends to be Benford

The following proposition is not an application of Theorem 3.9. It shows that $Y_n$ tends to be Benford in most cases when the $X_n$ are independent (see [23] for quite similar results). When a sequence $(Z_n)$ of r.v.’s is such that $(P_{Z_n})$ is tight, we will say that $(Z_n)$ itself is tight. Moreover, we will say that the sequence $(Z_n)$ satisfies

• Condition $(C_1)$ if $(Z_n)$ does not admit any subsequence which converges in distribution to a r.v. supported by some set $\{b^{a + z/h} : z \text{ integer}\}$ ($a \in [0, 1[$, $h \in \mathbb{Z}^+$);

• Condition $(C_2)$ if $(M(Z_n))$ does not admit any subsequence which converges in distribution to a r.v. supported by some set $\{b^{a + z/h} : z = 0, 1, \dots, h-1\}$ ($a \in [0, 1[$, $h \in \mathbb{Z}^+$).
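The product criterion of Proposition 4.4 can be explored on a simple family of independent, non-stationary factors. The sketch below assumes that $\log_b X_k$ is uniform on $[0, w_k]$, an illustrative model not taken from the paper, for which $|E(e_h(\log_b X_k))| = |\sin(\pi h w_k) / (\pi h w_k)|$, and compares two rates of shrinkage of $w_k$ for $h = 1$.

```python
import math


def modulus(h, w):
    """|E e_h(log_b X)| when log_b X is uniform on [0, w]."""
    t = math.pi * h * w
    return abs(math.sin(t) / t)


def product_of_moduli(h, widths):
    """Finite product of the moduli over the given interval widths."""
    prod = 1.0
    for w in widths:
        prod *= modulus(h, w)
    return prod


n = 200_000
slow_shrink = product_of_moduli(1, (k ** -0.5 for k in range(2, n + 2)))   # w_k = k^(-1/2)
fast_shrink = product_of_moduli(1, (1.0 / k for k in range(2, n + 2)))     # w_k = 1/k
```

With $w_k = k^{-1/2}$ the partial products keep decreasing towards 0, while with $w_k = 1/k$ they stabilise at a positive value, which suggests that $Y_n$ tends to be Benford in the first case only.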


Proposition 4.5. If the $X_n$ are independent and $(X_n)$ possesses a tight subsequence satisfying Condition $(C_1)$, then $Y_n$ tends to be Benford. The same is true if the $X_n$ are independent and $(X_n)$ possesses a subsequence satisfying Condition $(C_2)$.

Proof. Suppose that the subsequence $(X_{n_l})_l$ is tight. By the Helly Selection Theorem, some subsequence $(X_{n_j})_j$ of $(X_{n_l})_l$ converges in distribution to some r.v. $Z$ and this leads to the convergence in distribution of $(\log_b X_{n_j})_j$ to $\log_b Z$. Suppose now that $(X_{n_l})_l$ satisfies Condition $(C_1)$. Then $|E(e_h(\log_b Z))| < 1$ for all $h \in \mathbb{Z}^+$. Fix $h \in \mathbb{Z}^+$. Since

$$\lim_{j \to +\infty} |E(e_h(\log_b X_{n_j}))| = |E(e_h(\log_b Z))|,$$

there exist $\varepsilon > 0$ and $j_0 \ge 1$ such that $|E(e_h(\log_b X_{n_j}))| \le 1 - \varepsilon$ for $j \ge j_0$. This yields

$$\lim_{m \to +\infty} \prod_{j=1}^{m} \left|E(e_h(\log_b X_{n_j}))\right| = 0$$

which implies

$$\lim_{n \to +\infty} \prod_{k=1}^{n} \left|E(e_h(\log_b X_k))\right| = 0.$$

Thus, when the $X_n$ are independent, Proposition 4.4 completes the proof of the first assertion.

Consider again a subsequence $(X_{n_l})_l$ of $(X_n)$. The sequence $(M(X_{n_l}))_l$ is tight by nature since it is uniformly bounded. Thus some subsequence $(M(X_{n_j}))_j$ of $(M(X_{n_l}))_l$ converges in distribution to some r.v. $Z$ with values in $[1, b[$ and this leads to the convergence in distribution of $(\{\log_b X_{n_j}\})_j$ to $\log_b Z$. If $(X_{n_l})_l$ satisfies Condition $(C_2)$, we can conclude with the same arguments as above. □

In what follows, we consider r.v.’s $X_n$ having densities. This will permit us to get explicit bounds for the Fourier coefficients of the r.v.’s $\log_b X_n$ and thus to make use of both Proposition 4.4 and Theorem 3.9.

4.2.2. A general bound on Fourier coefficients and applications

Several bounds of the characteristic function are available in the literature, but we are only interested in Fourier coefficients which are easier to investigate. We give below a simple bound which has the advantage of being uniform in $h$.

Lemma 4.6. If a r.v. $Z$ admits a density supported by the interval $[0, 1[$ and bounded from above by a real $a \ge 1$, then $|E(e_h(Z))| \le \sqrt{1 - 1/(4a^2)}$ for all $h \in \mathbb{Z}^+$.

Proof. Let $Z$ be such a r.v. Fix $h \in \mathbb{Z}^+$. If $Z_1$ and $Z_2$ denote two independent r.v.’s having the same density as $Z$, then

$$0 \le |E(e_h(Z))|^2 = E(e_h(Z_1 - Z_2)) = \int_{[-1; 1]} f(x) \cos(2h\pi x)\, dx$$

where $f$ is the density of $Z_1 - Z_2$. Note that $f \le a$ too and that $a \ge 1$. For every $l \in [-1, 1]$ set $B_l = \{x \in [-1; 1] : \cos(2h\pi x) \ge l\}$.


Let $L$ be such that the Lebesgue measure of $B_L$ is $1/a$. Then

$$\int_{B_L^c} f(x)\,dx = \int_{B_L} (a - f(x))\,dx.$$

Hence, if we set $g = a\mathbf{1}_{B_L}$,

$$\begin{aligned}
\int_{[-1,1]} f(x)\cos(2h\pi x)\,dx
&\le \int_{B_L} f(x)\cos(2h\pi x)\,dx + L \int_{B_L^c} f(x)\,dx \\
&= \int_{B_L} f(x)\cos(2h\pi x)\,dx + L \int_{B_L} (a - f(x))\,dx \\
&\le \int_{B_L} f(x)\cos(2h\pi x)\,dx + \int_{B_L} (a - f(x))\cos(2h\pi x)\,dx \\
&= \int_{[-1,1]} g(x)\cos(2h\pi x)\,dx.
\end{aligned}$$
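For readers who wish to check the evaluation of this last integral, the computation can be sketched as follows. The symbols $\theta = \arccos L$ and $\lambda$ (Lebesgue measure) are auxiliary notations introduced here and do not appear in the original text. On $[-1, 1]$ the function $x \mapsto \cos(2h\pi x)$ runs through $2h$ periods, and on each of them the set where it is at least $L$ has length $\theta/(h\pi)$, so that:

```latex
\begin{align*}
\lambda(B_L) &= 2h \cdot \frac{\theta}{h\pi} = \frac{2\theta}{\pi} = \frac{1}{a}
  \quad\Longrightarrow\quad \theta = \frac{\pi}{2a}, \\
\int_{[-1,1]} g(x)\cos(2h\pi x)\,dx
  &= a \cdot 2h \cdot \frac{1}{2h\pi} \int_{-\theta}^{\theta} \cos u \,du
   = \frac{2a}{\pi}\,\sin\frac{\pi}{2a}, \\
\frac{2a}{\pi}\,\sin\frac{\pi}{2a}
  &\le \frac{2a}{\pi}\left( \frac{\pi}{2a} - \frac{1}{\pi^{2}}\Bigl(\frac{\pi}{2a}\Bigr)^{3} \right)
   = 1 - \frac{1}{4a^{2}} .
\end{align*}
```

The last line uses $\sin x \le x - x^3/\pi^2$ at $x = \pi/(2a)$, which lies in $]0, \pi/2]$ because $a \ge 1$.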

Direct calculations and the inequality $\sin x \le x - x^3/\pi^2$ ($0 \le x \le \pi$) give

$$\int_{[-1,1]} g(x)\cos(2h\pi x)\,dx = \frac{2a}{\pi}\sin\frac{\pi}{2a} \le 1 - \frac{1}{4a^2}.$$

The proof is completed. □

One can replace “$\sin x \le x - x^3/\pi^2$ ($0 \le x \le \pi$)” with “$\sin x \le x - x^3/7$ ($0 \le x \le \pi/2$)” in the above proof. This leads to a better (but less simple) bound for $|E(e_h(Z))|$, but has no impact on the statement of Proposition 4.8, which is an application of Lemma 4.6 in situations where $M(X_n)$ admits a density. Its proof uses the following lemma whose proof is elementary.

Lemma 4.7. Let $x_1, \dots, x_N$ be real numbers. Then

$$\left| \frac{1}{N} \sum_{n=1}^{N} e^{i(x_1 + \cdots + x_n)} \right|^2 = \frac{1}{N} + \frac{2}{N^2} \sum_{1 \le k < n \le N} \mathrm{Re}\left( e^{i(x_{k+1} + \cdots + x_n)} \right).$$
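Both lemmas lend themselves to a quick numerical sanity check. The sketch below is only an illustration under stated assumptions: the function names and the triangular test density $2x$ on $[0, 1[$ (bounded above by $a = 2$) are hypothetical choices made here, not taken from the paper, and the Fourier coefficient is approximated by a midpoint rule rather than computed exactly.

```python
import cmath
import math
import random


def fourier_coefficient(density, h, grid_size=20000):
    """Midpoint-rule approximation of E(e_h(Z)) = E(exp(2*pi*i*h*Z)) for a density on [0, 1[."""
    total, mass = 0j, 0.0
    for j in range(grid_size):
        x = (j + 0.5) / grid_size
        w = density(x)
        mass += w
        total += w * cmath.exp(2j * math.pi * h * x)
    return total / mass  # dividing by the mass renormalises the discretised density


def lemma_4_6_bound(a):
    """Right-hand side sqrt(1 - 1/(4 a^2)) of Lemma 4.6."""
    return math.sqrt(1.0 - 1.0 / (4.0 * a * a))


def lemma_4_7_sides(xs):
    """Return (left-hand side, right-hand side) of the identity of Lemma 4.7."""
    n_terms = len(xs)
    partial, running = [], 0.0
    for x in xs:
        running += x
        partial.append(running)  # partial[j] = x_1 + ... + x_{j+1}
    lhs = abs(sum(cmath.exp(1j * p) for p in partial) / n_terms) ** 2
    # Re(exp(i(x_{k+1} + ... + x_n))) = cos(S_n - S_k) for 1 <= k < n <= N
    cross = sum(
        math.cos(partial[n] - partial[k])
        for k in range(n_terms)
        for n in range(k + 1, n_terms)
    )
    return lhs, 1.0 / n_terms + 2.0 * cross / n_terms ** 2


def triangular_density(x):
    """Illustrative density 2x on [0, 1[, bounded above by a = 2 (an assumption of this sketch)."""
    return 2.0 * x


for h in (1, 2, 5):
    modulus = abs(fourier_coefficient(triangular_density, h))
    print(h, round(modulus, 4), "<=", round(lemma_4_6_bound(2.0), 4))

rng = random.Random(0)
lhs, rhs = lemma_4_7_sides([rng.uniform(-3.0, 3.0) for _ in range(15)])
print(abs(lhs - rhs) < 1e-12)
```

For this particular density the exact modulus is $1/(\pi h)$, well below the uniform bound of Lemma 4.6, which illustrates that the lemma trades sharpness for uniformity in $h$.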

Proposition 4.8. Suppose that each r.v. $M(X_n)$ admits a bounded density $f_n$. Set $c_n = \sup_{1 \le x < b} f_n(x)$ and $C_N = \max_{1 \le n \le N} c_n$ ($n \ge 1$, $N \ge 1$). Then $Y_n$ tends to be Benford if $\sum_n (1/c_n^2) = +\infty$ […]
If $c_k$ does not tend to infinity as $k \to +\infty$, then a subsequence of $(c_k)$ is bounded and so $\sum_k \log\left(1 - \frac{1}{4(b \log b)^2 c_k^2}\right) = -\infty$. If $\lim_k c_k = +\infty$, then $\log\left(1 - \frac{1}{4(b \log b)^2 c_k^2}\right)$ is equivalent to $-\frac{1}{4(b \log b)^2 c_k^2}$ as $k \to +\infty$. Thus, in this case, $\sum_k \log\left(1 - \frac{1}{4(b \log b)^2 c_k^2}\right) = -\infty$ if and only if $\sum_k (1/c_k^2) = +\infty$. The proof of the first assertion is completed.

To prove the second assertion, we will make use of Theorem 3.9 with $p = 2$. Set $T_N = (1/N) \sum_{n=1}^{N} e_h(\log_b X_n)$ and $A_N = \max_{1 \le n \le N} |E(e_h(\log_b X_n))|$ ($N \ge 1$). By Lemma 4.7, for every $N \ge 1$,

$$E|T_N|^2 \le \frac{1}{N} + \frac{2}{N^2} \sum_{1 \le k < n \le N} |E(e_h(\log_b X_{k+1}))| \cdots |E(e_h(\log_b X_n))|$$

[…]
Remark 4.3. There is at least one alternative way to prevent $|E(e_h(\log_b X_n))|$ from being too close to 1: bounding the density of $M(X_n)$ from below. A mild adaptation of the arguments used in the proof of Lemma 4.6 and of Proposition 4.8 yields that if we replace $c_n = \sup_{1 \le x < b} f_n(x)$ […]
4.2.3. When the r.v. $X_n$ or $\log_b X_n$ is unimodal

In the next corollary, which is a consequence of Proposition 4.8, we get a bound for the density of $M(X_n)$ by assuming that $\log_b X_n$ or the positive r.v. $X_n$ itself admits a unimodal density (that is to say, a density having a single local maximum). In the event that $\log_b X_n$ admits a unimodal density, we can choose the law of $X_n$ among all the log-stable distributions, and many others since we do not impose the symmetry of the densities. The case where $X_n$ itself admits a unimodal density concerns many usual distributions supported by $]0, +\infty[$: exponential, Fisher–Snedecor, gamma, chi-squared, beta (some of them), Weibull, and so on. Note that Corollary 4.9 does not require any hypothesis on the value or the existence of moments.

Corollary 4.9. Suppose that each r.v. $\log_b X_n$ or each $X_n$ admits a unimodal density $g_n$ and set $d_n = \sup_x g_n(x)$ and $D_N = \max_{1 \le n \le N} d_n$. Then $Y_n$ tends to be Benford when $\sum (1/d_n^2) = +\infty$ and $(Y_n)$ is a.s. Benford when $\sum (D_N^2/N^2) < +\infty$. This is the case in particular when the densities $g_n$ are uniformly bounded.

Proof. Fix $n \ge 1$ and $h \in \mathbb{Z}^+$. Assume that $\log_b X_n$ admits a unimodal density $g_n$ bounded above by $d_n$. Since we are concerned solely with $|E(e_h(\log_b X_n))|$ and since $|E(e_h(a + \log_b X_n))| = |E(e_h(\log_b X_n))|$ ($a \in \mathbb{R}$), we can assume, without loss of generality, that the mode of $\log_b X_n$ is


an integer $k_0$. Fix $x \in [0, 1[$. The r.v. $\{\log_b X_n\}$ admits a density $g_n^*$ given by

$$g_n^*(x) = \sum_{m=-\infty}^{+\infty} g_n(m + x).$$

Moreover

$$g_n(m) \le g_n(m + x) \le g_n(m + 1) \quad \text{if } m < k_0$$

and

$$g_n(m + 1) \le g_n(m + x) \le g_n(m) \quad \text{if } m \ge k_0.$$

Hence

$$\sum_{m=-\infty}^{k_0 - 1} g_n(m) + \sum_{m=k_0}^{+\infty} g_n(m + 1) \le g_n^*(x) \le \sum_{m=-\infty}^{k_0} g_n(m) + \sum_{m=k_0}^{+\infty} g_n(m)$$

which yields $g_n^*(0) - d_n \le g_n^*(x) \le g_n^*(0) + d_n$. Integrating over $[0, 1[$ the three members of the above formula gives $g_n^*(0) - d_n \le 1 \le g_n^*(0) + d_n$. Thus $g_n^*(x) \le 1 + 2d_n$ and this implies that $M(X_n)$ admits a density bounded by $(1 + 2d_n)/\log b$. By Proposition 4.8, $Y_n$ tends to be Benford when $\sum (1/(1 + 2d_n)^2) = +\infty$ and $(Y_n)$ is a.s. Benford when $\sum ((1 + 2D_N^2)/N^2) < +\infty$. This is the case when $\sum (1/d_n^2) = +\infty$ and when $\sum (D_N^2/N^2) < +\infty$, respectively (whether $\lim_n d_n = +\infty$ or not).

Assume now that $X_n$ itself admits a unimodal density $g_n$ bounded above by $d_n$. We will assume, without loss of generality, that the mode of $X_n$ is $b^{k_0}$ where $k_0$ is an integer. Fix $x \in [1, b[$. The r.v. $M(X_n)$ admits a density $g_n^*$ given by $g_n^*(x) =$

$\sum_{m=-\infty}^{+\infty} b^m\, g_n(b^m x)$

and we conclude following the same lines as above. □

Note that the r.v.’s involved in Corollary 4.9 are allowed to converge in distribution to a r.v. supported by some set $\{b^{a + z/h} : z \text{ integer}\}$ ($a \in [0, 1[$, $h \in \mathbb{Z}^+$). Hence Corollary 4.9 is not a consequence of Proposition 4.5.

Appendix

We present here a survey of the main known results on Benford r.v.’s and of results which may be new but are easily deduced from known techniques. All the proofs below use Fourier analysis and several of them are simpler and shorter than the original ones (see [4] for more basic facts on Benford’s law).

Roughly speaking, there exist very few Benford r.v.’s (see Section 2.2 and Appendix A.4), but many r.v.’s are close to being Benford [11,22] because many r.v.’s are close to being u.d. (1). Indeed, if $\lim_{t \to \infty} E(\exp(2i\pi t Z)) = 0$ (this holds in particular when the law of $Z$ is absolutely continuous), then $\lim_{\sigma \to \infty} E(e_h(\sigma Z)) = \lim_{\sigma \to \infty} E(e_{h\sigma}(Z)) = 0$ for every $h \in \mathbb{Z}^+$. This shows that $X = b^{\sigma Z}$ is close to being Benford in base $b$ for sufficiently large $\sigma$. For example, $e^Z$ is close to being Benford in


any base when $Z$ is an exponential or a Weibull r.v. with sufficiently small parameter $\lambda$. Besides, $Z$ itself is close to being Benford in any base when, among many other situations, $Z$ is a log-normal or log-Cauchy r.v. provided that the dispersion parameter of the corresponding normal or Cauchy distribution is sufficiently large (see also Section 1.3).

A.1. Scale-invariance

The scale-invariance of the law of the mantissa of a random variable is a distinctive property of $\mu_b$ and, historically, it is the reason why $\mu_b$ has been chosen to depict the First Digit Phenomenon. This property is equivalent to the invariance by translation of the Lebesgue measure on the circle. The following property has been stated, sometimes in a less precise form, by several authors and is proved, as stated below, by Hill [17] via techniques involving the σ-algebra generated by the mantissa function. We give a short and original proof using Fourier analysis.

Proposition A.1. Let $X$ be a positive random variable. The following three conditions are equivalent:
1. $X$ is Benford in base $b$;
2. for every $\lambda > 0$, $P_{M_b(X)} = P_{M_b(\lambda X)}$;
3. for some $\lambda > 0$ different from any root of $b$, $P_{M_b(X)} = P_{M_b(\lambda X)}$.

Proof. Let $X$ be a positive random variable and $\lambda$ be a positive real number. Then, for every $h \in \mathbb{Z}^+$,

$$E(e_h(\log_b(\lambda X))) = e_h(\log_b \lambda)\, E(e_h(\log_b X)).$$

So, by Lemma 2.4, Condition 1 implies Condition 2. Moreover, the above formula and Condition 3 imply

$$E(e_h(\log_b X)) = e_h(\log_b \lambda)\, E(e_h(\log_b X)).$$

Since $e_h(\log_b \lambda) \ne 1$ when $h \in \mathbb{Z}^+$ and $\lambda$ is not any root of $b$, this implies Condition 1. □

A.2. Base-invariance and power-invariance

We must distinguish the notion of base-invariance considered in [20] (called base-invariance in the sequel) from the one studied in [17] (called Hill b-base-invariance in the sequel). The first one is defined by

$$\forall\, b' > 1,\ \forall\, b'' > 1, \quad P_{M_{b'}(X)} = P_{M_{b''}(X)}.$$

The second one is defined by

$$\forall\, n \in \mathbb{Z}^+, \quad P_{M_{b^{1/n}}(X)} = P_{M_b(X)}$$

where $b > 1$ is fixed.

A.2.1. Base-invariance

Knuth [20, Exercise 7 pp. 248, 576] has proved by skilled calculations that scale-invariance and base-invariance are incompatible. Since the scale-invariance characterizes the Benford


random variables, this implies that the Benford random variables cannot satisfy the base-invariance property. The following proposition is slightly more precise than Knuth’s and its proof is simple.

Proposition A.2. If $X$ is base-invariant, then $P_X = \delta_1$ and so $X$ cannot be Benford in any base.

Proof. Suppose that $X$ is base-invariant and fix $h \in \mathbb{Z}^+$ and $b' > 1$. Lemma 2.4 gives

$$\forall\, b'' > 1, \quad E(e_h(\log_{b'} X)) = E(e_h(\log_{b''} X)) = \varphi(h/\log b'')$$

where $\varphi$ is the characteristic function of $\log X$. Besides, $\varphi$ is continuous and satisfies $\varphi(0) = 1$. Hence, letting $b''$ tend to infinity, we get $E(e_h(\log_{b'} X)) = 1$, and this is true for any $h \in \mathbb{Z}^+$. According to Lévy’s Theorem on the torus (see Section 2.2), this implies that $P_{\{\log_{b'} X\}} = \delta_0$ and then $P_{M_{b'}(X)} = \delta_1$. So $P_X$ is supported by the set $\{1, b', (b')^2, \dots\}$ and, since $X$ is supposed to be base-invariant, this must be true for every $b' > 1$. This is impossible unless $X = 1$ a.s. □

A.2.2. Hill b-base-invariance and power-invariance

The following proposition has been proved by Hill [17]. We give below an original and short proof.

Proposition A.3. A positive absolutely continuous random variable is Hill b-base-invariant if and only if it is Benford in base $b$.

Proof. Let $X$ be a positive random variable. Then, for every $h \in \mathbb{Z}^+$ and $n \in \mathbb{Z}^+$,

$$E(e_h(\log_{b^{1/n}} X)) = E(e_{hn}(\log_b X)).$$

So, Lemma 2.4 shows that, if $X$ is Benford in base $b$, it is also Benford in base $b^{1/n}$ for every $n \in \mathbb{N}^*$ and this implies that $P_{M_{b^{1/n}}(X)} = P_{M_b(X)}$. Conversely, if we suppose that $X$ is Hill b-base-invariant, the above formula gives

$$E(e_h(\log_b X)) = E(e_{hn}(\log_b X)) \qquad (n \ge 1,\ h \in \mathbb{Z}^+).$$

If we assume, in addition, that $X$ is absolutely continuous, then the Riemann–Lebesgue Theorem says that

$$\lim_{n} E(e_{hn}(\log_b X)) = 0 \qquad (h \in \mathbb{Z}^+).$$

Together with Lemma 2.4, this proves that $X$ is Benford in base $b$. □

Due to Lemma 2.4, it is easy to verify that $X$ is Benford in base $b$ if and only if the same fact holds for $1/X$. So, since $\log_{b^{1/n}} x = \log_b x^n$ ($x > 0$ and $n \in \mathbb{Z}^+$), we can rewrite the above proposition as follows.

Proposition A.4. If a positive random variable $X$ is Benford in base $b$, then, for every $m \in \mathbb{Z}^+$, $X^m$ is also Benford in base $b$. Conversely, if an absolutely continuous positive random variable $X$ satisfies

$$\forall\, n \in \mathbb{N}^*, \quad P_{M_b(X^n)} = P_{M_b(X)},$$

then $X$ is Benford in base $b$.
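The invariance properties of Propositions A.1 and A.4 can be illustrated by simulation. The following is a minimal sketch, not part of the original paper: it assumes base $b = 10$, builds a Benford sample as $X = b^U$ with $U$ uniform on $[0, 1[$, and uses the vanishing of the Fourier coefficients $E(e_h(\log_b X))$ as the test for being Benford. The function names, the sample size, the scale factor 2.7 and the exponent 3 are arbitrary choices made for illustration.

```python
import cmath
import math
import random


def empirical_coefficient(sample, h, b=10.0):
    """Empirical counterpart of E(e_h(log_b X)), with e_h(t) = exp(2*pi*i*h*t)."""
    total = sum(cmath.exp(2j * math.pi * h * math.log(x, b)) for x in sample)
    return total / len(sample)


def max_modulus(sample, b=10.0, frequencies=(1, 2, 3)):
    """Largest modulus among the first few empirical Fourier coefficients."""
    return max(abs(empirical_coefficient(sample, h, b)) for h in frequencies)


rng = random.Random(2017)
size = 20000
# X = b^U with U uniform on [0, 1[ is Benford in base b (here b = 10).
benford = [10.0 ** rng.random() for _ in range(size)]
# A uniform r.v. on [1, 10[ is not Benford; it is included for contrast.
uniform = [1.0 + 9.0 * rng.random() for _ in range(size)]

print("X        ", max_modulus(benford))
print("2.7 * X  ", max_modulus([2.7 * x for x in benford]))
print("X ** 3   ", max_modulus([x ** 3 for x in benford]))
print("1 / X    ", max_modulus([1.0 / x for x in benford]))
print("uniform  ", max_modulus(uniform))
```

For the four samples derived from $X$ the printed moduli should be of the order of $1/\sqrt{\text{size}}$, whereas for the uniform sample the coefficient at $h = 1$ stays close to its theoretical modulus $\log 10 / \sqrt{(\log 10)^2 + 4\pi^2} \approx 0.34$.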


A.3. Product-invariance

The following proposition generalizes the scale-invariance because the constant $\lambda$ appearing in Appendix A.1 can be viewed as a random variable independent of $X$. It slightly generalizes Theorem 2.3 in [14]. Note that the authors of [14] suppose, in their abstract, that $P_X$ is supported by a finite interval, but they do not use this hypothesis in the proof of their theorem, which follows the same lines as the following one.

Proposition A.5. Let $X$ and $Y$ be two independent positive random variables. If $X$ (or $Y$) is Benford in base $b$, then $XY$ is Benford in base $b$ too. Conversely, if $P_X$ is not supported by any set $\{b^{z/h} : z \text{ integer}\}$ ($h$ positive integer) and if $P_{M_b(Y)} = P_{M_b(XY)}$, then $Y$ is Benford in base $b$.

Proof. Let $h \in \mathbb{Z}^+$ and suppose that $X$ and $Y$ are independent. Then

$$E(e_h(\log_b(XY))) = E(e_h(\log_b X))\, E(e_h(\log_b Y)).$$

If $X$ is Benford in base $b$, Lemma 2.4 implies $E(e_h(\log_b X)) = 0$ and this gives the first part of the proposition. Conversely, if $P_X$ is not supported by any set $\{b^{z/h} : z \text{ integer}\}$ ($h$ positive integer), then $E(e_h(\log_b X)) \ne 1$ and so $E(e_h(\log_b(XY)))$ and $E(e_h(\log_b Y))$ cannot be equal unless they are equal to zero. □

A.4. Mixtures

When $X$ and $Y$ are independent, the conditional law of $XY$ given $(Y = a)$ is the law of $aX$. So $P_{XY}$ can be viewed as a mixture of the laws $P_{aX}$ ($a > 0$). Theorem 2.3 in [14] states that, if $X$ is continuous, $P_Y$ and the mixture $P_{XY}$ lead to the same mantissa law in base $b$ if and only if $P_{M_b(Y)} = \mu_b$ (this is the converse part of Proposition A.5). Hence such a mixture is rarely Benford’s law (see [16] for a similar and more sophisticated property). But Proposition A.5 also shows that, whatever $P_Y$ is, $P_{M(XY)} = \mu_b$ when $X$ is Benford in base $b$ and $X$ and $Y$ are independent. In other words, any mixture (following the above procedure) of laws of Benford random variables in base $b$ is the law of a Benford random variable in base $b$. This property extends to general mixtures.

References

[1] T.C. Anderson, L. Rolen, R. Stoehr, Benford’s law for coefficients of modular forms and partition functions, Proc. Amer. Math. Soc. 139–5 (2011) 1533–1541.
[2] F. Benford, The law of anomalous numbers, Proc. Am. Phil. Soc. 78 (1938) 551–572.
[3] A. Berger, L.A. Bunimovich, T.P. Hill, One-dimensional dynamical systems and Benford’s law, Trans. Amer. Math. Soc. 357 (1) (2005) 197–219.
[4] A. Berger, T.P. Hill, A basic theory of Benford’s law, Probab. Surv. 8 (2011) 1–126.
[5] A. Berger, T.P. Hill, An Introduction To Benford’s Law, Princeton University Press, Princeton, 2015.
[6] P. Billingsley, Probability and Measure, Wiley, New-York, 1979.
[7] D.I. Cohen, T.M. Katz, Prime numbers and the first digit phenomenon, J. Number Theory 18–3 (1984) 261–268.
[8] H. Davenport, P. Erdős, W.J. Le Veque, On Weyl’s criterion for uniform distribution, Michigan Math. J. 10 (1963) 311–314.
[9] P. Diaconis, The Distribution of Leading Digits and Uniform Distribution mod 1, Ann. Probab. 5–1 (1977) 72–81.
[10] P. Doukhan, Probabilistic and Statistical Tools for Modeling Time Series, in: Colóquio Brasiliero de Matemàtica, Publicações Matemàticas, 2015.
[11] L. Dümbgen, C. Leuenberger, Explicit bounds for the approximation error in Benford’s law, Electron. Comm. Probab. 13 (2008) 99–112.
[12] S. Eliahou, B. Massé, D. Schneider, On the mantissa distribution of powers of natural and prime numbers, Acta Math. Hungar. 139 (1–2) (2013) 49–63.
[13] A. Fuchs, G. Letta, Le problème du premier chiffre décimal pour les nombres premiers, Electron. J. Combin. 3–2 (1996) R25 (in French) [The first digit problem for primes].
[14] R. Giuliano, E. Janvresse, A unifying probabilistic interpretation of Benford’s law, Unif. Distrib. Theory 2 (2010) 169–182.
[15] R. Hamming, On the distribution of numbers, Bell Syst. Tech. J. 49 (1976) 1609–1625.
[16] T.P. Hill, A statistical derivation of the significant digit law, Statist. Sci. 10 (4) (1995a) 354–363.
[17] T.P. Hill, Base-invariance implies Benford’s law, Proc. Amer. Math. Soc. 123 (1995b) 887–895.
[18] P. Holewijn, Note on Weyl criterion and the uniform distribution of independent random variables, Ann. Math. Stat. 40–3 (1969) 1124–1125.
[19] E. Janvresse, T. de la Rue, From uniform distributions to Benford’s law, J. Appl. Probab. 41 (2004) 1203–1210.
[20] D.E. Knuth, The Art of Computer Programming, Vol. 2, Addison-Wesley, Reading, Massachusetts, 1968.
[21] L. Kuipers, H. Niederreiter, Uniform Distribution of Sequences, Dover Publications, New-York, 2006.
[22] L.M. Leemis, B.W. Schmeiser, D.L. Evans, Survival distributions satisfying Benford’s law, Amer. Statist. 54 (2000) 1–6.
[23] P. Lévy, L’addition des variables aléatoires définies sur une circonférence, Bull. Soc. Math. France 67 (1939) 1–41.
[24] B. Massé, D. Schneider, Random number sequences and the first digit phenomenon, Electron. J. Probab. 17 (2012) 1–17.
[25] B. Massé, D. Schneider, Fast growing sequences of numbers and the first digit phenomenon, Int. J. Number Theory 11–3 (2015) 705–719.
[26] S.J. Miller (Ed.), Theory and Applications of Benford’s Law, Princeton University Press, 2015.
[27] S.J. Miller, M.J. Nigrini, The modulo one Central Limit Theorem and Benford’s law for products, Internat. J. Algebra 2–3 (2008) 119–130.
[28] M.J. Nigrini, S.J. Miller, Benford’s law applied to hydrology data - Results and relevance to other geophysical data, Math. Geol. 39–5 (2007) 469–490.
[29] M.J. Nigrini, L.J. Mittermaier, The use of Benford’s law as an aid in analytical procedures, Audit.-J. Pract. Theory 16 (2) (1997) 52–57.
[30] P.N. Posch, A survey on sequences and distribution functions satisfying the first-digit-law, Int. J. Stat. Manag. Syst. 11–1 (2008) 1–19.
[31] H. Robbins, On the equidistribution of sums of independent random variables, Proc. Amer. Math. Soc. 4 (1953) 786–799.
[32] M. Sambridge, H. Tkalčić, A. Jackson, Benford’s law in the natural sciences, Geophys. Res. Lett. 37 (2010) L22301.
[33] P. Schatte, On a law of the iterated logarithm for sums mod 1 with application to Benford’s law, Probab. Theory Related Fields 77 (2) (1988a) 167–178.
[34] P. Schatte, On mantissa distributions in computing and Benford’s law (German, Russian summary), J. Inf. Process. Cybern. 24 (1988b) 443–455.
[35] P. Schatte, On a uniform law of the iterated logarithm for sums mod 1 and Benford’s law (Russian, Lithuanian summary), Litov. Matematicheskii Sb. 31–1 (1991) 205–217; translation in Lithuanian Math. J. 31–1, 133–142.
[36] M.J. Sharpe, Limit laws and mantissa distributions, Probab. Math. Statist. 26–1 (2006) 175–185.
[37] B. Xu, J. Wang, G. Liu, Y. Dai, Photorealistic computer graphics forensics based on leading digit law, J. Electron. 28 (1) (2011) 95–100.
