Solving discrete logarithm problems faster with the aid of pre-computation

Jin Hong^a,∗, Hyeonmi Lee^b

a Department of Mathematical Sciences and RIM, Seoul National University, Seoul 08826, Republic of Korea
b Department of Mathematics and Research Institute for Natural Sciences, Hanyang University, Seoul 04763, Republic of Korea
Article history: Received 4 August 2017; Received in revised form 8 January 2019; Accepted 24 March 2019.

Keywords: Discrete logarithm problem; Pollard's rho algorithm; Distinguished point; Trapdoor discrete logarithm group; Fuzzy Hellman
Abstract. A trapdoor discrete logarithm group is an algebraic structure in which the feasibility of solving discrete logarithm problems depends on the possession of some trapdoor information, and this primitive has been used in many cryptographic schemes. The current designs and applications of this primitive are such that the practicality of its use is greatly increased by methods that allow for discrete logarithm problems of sizes that are barely solvable to be solved faster. In this article, we propose an algorithm that can reduce the time taken to solve discrete logarithm problems through a one-time pre-computation process. We also provide a careful complexity analysis of the algorithm and compare its performance with those of existing algorithms for solving discrete logarithm problems with the aid of pre-computation. Our new method performs much better than the most widely known algorithm and is advantageous over a more recently proposed method in view of pre-computation cost.
1. Introduction

Pollard's rho algorithm [33] enables one to solve a discrete logarithm problem (DLP) in a cyclic group of size q with computational complexity of Θ(√q) order. There are also algorithms which can solve a DLP with online complexity smaller than Θ(√q), after a pre-computation phase of complexity larger than Θ(√q). This paper presents a new pre-computation aided DLP solving algorithm and a theoretical analysis of its performance.

The advantage of our algorithm over existing pre-computation aided DLP solving algorithms will be by a small multiplicative factor. However, this seemingly small advantage will nevertheless be of great value in practice. For example, reducing the storage requirement by half (without affecting the pre-computation cost or the online time) could be critical in the use of these algorithms on resource-constrained devices such as smartphones, while reducing the online time by half could have quite an impact on user experience. Furthermore, the possibility of reducing the pre-computation cost by half with no negative effect on online performance would be of great value, as the pre-computation phases of these algorithms typically require extreme resources.

Our theoretical analysis of the algorithm performance will also serve as a valuable tool in making parameter choices for our algorithm. This is very important, since the costly pre-computation phase precludes the trial-and-error approach to making specific design choices for a system that uses the algorithm.

∗ Corresponding author. E-mail addresses: [email protected] (J. Hong), [email protected] (H. Lee).
Motivation. There are many cryptographic schemes [13,17,25,27,32,43] that rely on a primitive referred to as the trapdoor discrete logarithm group, which is an algebraic structure in which the feasibility of solving DLPs depends on the possession of some trapdoor information. There are multiple approaches [7,10,11,17,31,43,44] to the construction of a trapdoor discrete logarithm group, but let us focus on the RSA setting [24,25,32], which deals with DLPs in the group Z_n^×, for a composite integer n. Below, we will assume an RSA-type modulus of n = uv and use q to denote the largest among the prime factors of u − 1 and v − 1.

Given a specific DLP in Z_n^×, someone with knowledge of u and v can make standard uses of the Chinese Remainder Theorem and the Pohlig–Hellman algorithm to reduce the DLP to multiple DLPs in smaller prime order groups and solve the given problem with essentially Θ(√q)-many modular multiplications. Hence, the trapdoor discrete logarithm group becomes more practical to use as q is made smaller. On the other hand, since Pollard's p − 1 factorization algorithm can be used to obtain u and v from n with computations of very roughly Θ(q) order, the need for security prevents one from letting q become too small.

For the trapdoor discrete logarithm group of the RSA setting to be meaningful, one must be able to choose q in such a way that computations of Θ(√q) order are within reach of the intended user, while computations of Θ(q) order are infeasible for the adversary. For example, q ≈ 2^80 could be large enough for certain applications in terms of security requirements, but computations of order √q ≈ 2^40 may still be too time consuming to be carried out on a regular basis by the user in practice. With a pre-computation aided DLP solving algorithm, the cost of solving each DLP could be reduced to something much more comfortable. For example, the possibility of solving each DLP with computations of 2^30 order, after a one-time pre-computation phase of 2^50 order, would make the RSA setting trapdoor discrete logarithm group much more attractive.

Note that a pre-computation aided DLP solving algorithm will only reduce the online time complexity of solving DLPs and that the cost of pre-computation is larger than the online cost per DLP. The pre-computation aided DLP solving algorithms are mainly useful in situations, such as in relation to the trapdoor discrete logarithm groups, where the DLP in the group can already be solved at least once, possibly with resources larger than those available on the intended DLP solving platform, and where multiple DLPs in the same group need to be solved.

Existing works. Let us review some of the previous developments related to this work. Experts of the field would currently point to Pollard's rho algorithm [33] as the most practical method for solving generic DLPs. To be more precise, one would refer to the variant that utilizes the r-adding walk iteration function, as was done by [41], to create chains of group elements, and which relies on the distinguished point (DP) technique, as was done by [30], to detect collisions. When multiple DLPs in the same group need to be solved, computations done for any previously solved DLPs can be used to reduce the effort of solving subsequent DLPs [8,39]. In such a setting, one would be interested in the computational complexity of solving a given number of DLPs [21].
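To make the trapdoor reduction sketched in the Motivation paragraph concrete, the following toy example (our own illustration, not the construction of any cited scheme) recovers a discrete logarithm modulo n = uv from the trapdoor (u, v). Baby-step giant-step stands in for the Θ(√q) generic solver, the safe-prime shape of u and v is an assumption made for simplicity, and all numeric values are hypothetical.

```python
# Toy sketch: trapdoor DLP solving via CRT, with baby-step giant-step (BSGS)
# standing in for the generic sqrt-time subgroup solver.
from math import isqrt

def bsgs(g, h, p, order):
    """Solve g^x = h (mod p) with 0 <= x < order in O(sqrt(order)) steps."""
    mm = isqrt(order) + 1
    baby = {pow(g, j, p): j for j in range(mm)}        # baby steps g^j
    giant = pow(g, -mm, p)                             # g^(-mm) mod p
    y = h % p
    for i in range(mm):                                # giant steps h*g^(-i*mm)
        if y in baby:
            return (i * mm + baby[y]) % order
        y = (y * giant) % p
    return None

def crt(r1, m1, r2, m2):
    """Combine x = r1 (mod m1) and x = r2 (mod m2) for coprime m1, m2."""
    return (r1 + m1 * (((r2 - r1) * pow(m1, -1, m2)) % m2)) % (m1 * m2)

u, v = 1019, 2027                  # safe primes: qu = 509, qv = 1013 (coprime)
qu, qv = (u - 1) // 2, (v - 1) // 2
n = u * v
g = pow(5, 2, n)                   # a square, so g has odd order qu*qv
x = 123456
h = pow(g, x, n)                   # the DLP instance: recover x from h

xu = bsgs(g % u, h % u, u, qu)     # DLP in the order-qu subgroup mod u
xv = bsgs(g % v, h % v, v, qv)     # DLP in the order-qv subgroup mod v
print(crt(xu, qu, xv, qv) == x % (qu * qv))   # True
```

Without the factors u and v, an attacker facing n directly is left with the generic √(order) workload in the full group, which is exactly the asymmetry the trapdoor exploits.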
Looking at this sequential approach to solving multiple DLPs from a slightly different point of view, one can interpret the solving of all preceding DLPs as a pre-computation stage for the final DLP. From this viewpoint, it becomes more natural to state the expected cost of solving a DLP as a function of the number of iterations computed during all previous DLP solving attempts [14]. Now, note that the use of the previously computed data in solving the final DLP does not require the data to have been obtained through the solving of DLPs provided by a third party. In fact, as long as the same iterated walks are computed, they need not even be associated with the successful solving of DLPs.

In the remainder of this paper, we will refer to the DLP solving algorithm we have been discussing (i.e., a Pollard's rho variant with DP collision detection), understood to be an algorithm consisting of a clearly separate pre-computation phase, in which a table of useful information is prepared, and an online phase, in which a non-determined number of independent DLPs are to be solved without further contributing to the table, as the distinguished point method or the DP method. The accurate pre-computation and online time complexities of the DP method have been expressed [22] as functions of the implementation setup parameters. The cited work also discussed the relationship between the storage space size required to record the pre-computed data and the online time complexity of the algorithm, and a very similar discussion has also appeared [4] in a completely different context.

A slightly modified version of the DP method was introduced more recently in [3]. Taking the initials of its authors, we will refer to this algorithm as the BL method throughout this paper. Their idea was to be selective in delivering the pre-computed data to the online phase. By choosing to store only the more useful data after producing a sufficiently larger pool of pre-computed data, one can make more efficient use of the storage space, hence reducing the online time corresponding to the same storage space and increasing the online efficiency. However, this advantage in online efficiency comes at a higher pre-computation cost.

The authors of the current paper have recently discussed [15] the possibility of adopting the pre-computation matrix structures of various time memory tradeoff algorithms (unrelated to DLP solving) to create better pre-computation aided DLP solving algorithms. Although none of the structures considered there led to an improved pre-computation aided DLP solving algorithm that is practically meaningful, one of the conclusions made there was that an attempt at combining the pre-computation matrix structures of the classical Hellman tradeoff and the DP tradeoff in some manner could lead to a better pre-computation aided DLP solving algorithm, and this paper introduces precisely such a method.
Organization. In Section 2, we fix the basic setting to be used in the rest of the paper. Our new algorithm is introduced in Section 3 and its execution complexities are fully analyzed in Section 4. This is followed by a section containing experimental results that support the correctness of our theoretical analyses. In Section 6, the performance of our method is compared with those of the DP and BL methods, and the paper is concluded in Section 7. The appendices contain some supporting material. In particular, a quick review of Pollard's rho algorithm and the DP method are given by the first two appendices, and Appendix D illustrates how the theoretical results obtained in this paper can be used to choose parameters for our algorithm that are most appropriate for a given situation.

2. Preliminary settings

Let us fix the notation and basic setting to be used throughout this paper. In the remainder of this paper, unless explicitly stated otherwise, any reference to a DLP algorithm is to be understood as referring to a generic method for solving DLPs with the aid of pre-computation. This work will introduce the fuzzy Hellman method in its non-perfect and perfect versions. Later in this work, we will also be discussing the existing DP and BL methods.

Any reader who is not fully comfortable with the basic theory of generic DLP solving algorithms should refer to Appendices A and B before reading any further. A review of Pollard's rho algorithm and its variants, in the classical form that does not involve pre-computation, is given in Appendix A. The DP method, in the form that involves separate pre-computation and online phases, is explained in Appendix B. Some care must be given to the choice of the iteration function when working with pre-computation aided DLP solving methods, and this is explained there also.

Throughout this work, we assume that a cyclic group G = ⟨g⟩ in which DLPs need to be solved has been fixed, along with a generator g for the group and an encoding scheme for its elements. The size of G will always be taken to be q, regardless of the DLP algorithm being considered. All DLP methods considered in this work require the use of DPs, and we will always assume that the distinguishing property has been defined so that a randomly chosen element of the cyclic group has probability 1/t of being a DP. When discussing the DP or fuzzy Hellman methods, the number of starting points used in creating the pre-computation table will be set to m. Since duplicates among the ending points are removed before they are written to the pre-computation table, the number of entries in the pre-computation table is likely to be smaller than m. When dealing with the DP method, we assume that the parameters m and t have been chosen to satisfy the relation mt² ≈ q, which is widely referred to as the matrix stopping rule. We assume that a common iteration function F : G → G is used by all algorithms considered in this paper.

We will make frequent use of one notion that is not yet a standard terminology in discussions of the DLP methods. The collection of all chains that were generated while creating a pre-computation table, or the gathering of all group elements that appeared in these chains, together with imaginary arrows between them describing the iterated walks, will be referred to as a pre-computation matrix. An appropriate gathering of DP chains created during pre-computation would similarly be referred to as a DP matrix.
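For readers who want something concrete before reaching Appendix B, here is a toy instantiation (the particular distinguisher and index function are our own choices, not mandated by the text) of the two ingredients just fixed: a distinguishing property of probability 1/t, and an r-adding walk iteration function over the explicit group that will be used in Section 5.

```python
# Toy instantiation of a 1/t distinguishing property and an r-adding walk
# on G = <2> inside F_p^*, with p and q as in Section 5.
import random

p = 177259890077927          # 48-bit prime
q = 88629945038963           # group order, q = (p - 1) / 2
t = 2**14                    # a random element is a DP with probability ~1/t
r = 64
alphas = [random.randrange(q) for _ in range(r)]   # multiplier exponents
multipliers = [pow(2, a, p) for a in alphas]       # m_i = g^{alpha_i}

def is_dp(y):
    """Distinguishing property: the low-order bits of y are all zero."""
    return y % t == 0

def step(y, c):
    """One r-adding walk step; c tracks the known exponent relation."""
    i = y % r                                      # index function iota(y)
    return (y * multipliers[i]) % p, (c + alphas[i]) % q
```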
The reader is cautioned to distinguish a pre-computation matrix from a pre-computation table, the latter being a collection of just the chain ending points, together with their corresponding discrete logarithm values.

In comparing the performances of the various DLP methods, we will be interested in the online time complexity T, the storage complexity M, and the pre-computation time complexity P of each method. We clarify that all complexities considered in this work are average case complexities rather than worst case complexities. It is easy to argue that the complexities of the DP method satisfy two approximate tradeoff curves: with the matrix stopping rule mt² ≈ q, one has P ≈ mt, T ≈ t, and M ≈ m, so that

P · T ≈ q   and   T² · M ≈ q,   (1)
and we will later see how the same may be said of the other DLP methods discussed in this paper. In particular, as was briefly mentioned in the introduction section, the effort of solving each DLP in a group of order q ≈ 2^80 can be reduced from 2^40 to T ≈ 2^30, after a pre-computation effort of P ≈ 2^50, using a table of size M ≈ 2^20. Even though this reduction in online time is made possible by all DLP methods considered in this paper, our interest lies in the constant factors hidden behind the approximations. This is because even small differences in the constant factors are of great practical importance when dealing with large groups, where one is forced to utilize pre-computation resources as efficiently as possible.

3. Fuzzy Hellman method

Two new algorithms for solving DLPs with the aid of pre-computation are introduced in this section. We will refer to them as the non-perfect and perfect versions of the fuzzy Hellman method.

3.1. Overall structure and parameters

Our two new algorithms share much in common with the existing DP method. All three methods use an iteration function to generate multiple pre-computation chains and store their terminal points together with the corresponding discrete logarithm values as the pre-computation table. They all generate online chain(s) until a merge of the online chain with a pre-computation chain produces a solution to the given DLP instance. The largest difference between the three algorithms is in how their pre-computation matrices are structured.
There is a relatively recent time memory (and data) tradeoff algorithm referred to as the fuzzy rainbow tradeoff [1,2], which is not as widely known as the classical Hellman [12] or rainbow [29] time memory tradeoff algorithms. Its pre-computation matrix may be described roughly as what one obtains by replacing each iteration within a rainbow matrix with a DP chain. The pre-computation matrix structure of the new DLP algorithms we will be introducing below may roughly be described as what one obtains by replacing each iteration within a classical Hellman matrix with a DP chain, and this is the source of the name fuzzy Hellman. Note that the use of such a construction would seem counter-intuitive to those working on time memory tradeoffs, as one would instinctively expect an increased possibility of chain merges.

Both the non-perfect and perfect fuzzy Hellman algorithms require the choice of positive integer parameters m, t, and s satisfying the matrix stopping rule mt²s² ≈ q. For now, the reader may take the unfamiliar parameter s to be a small integer, such as 5 or 10. In short, s-many DP matrices that are serially connected to each other are generated during the pre-computation phase. Although an inaccurate description, we might state that the non-perfect version removes duplicates among the DPs found within each separate DP matrix and that the perfect version ensures that no duplicates are found among the DPs belonging to the collection of all s DP matrices. The terms non-perfect and perfect are being adopted from the field of time memory tradeoffs, where they are used to reflect whether duplicates have been removed from the pre-computation matrices in certain manners.

3.2. Details of the algorithms

To begin the pre-computation phase of the non-perfect fuzzy Hellman method, one first chooses m elements of the cyclic group G at random. The set of these points will be denoted by SP_1. After generating a usual DP matrix from the starting points SP_1, through a typical procedure similar to Algorithm 3 of Appendix B, the ending points (with any duplicates among them removed) are gathered into the set EP_1. These distinct DPs are then taken to be the starting points SP_2 = EP_1 for another DP matrix generation, and the (distinct) ending points of the second DP matrix are gathered into EP_2. This process is repeated s times by iteratively setting SP_{i+1} = EP_i, until one arrives at the set EP_s. The terminal points EP_s, each paired with its discrete logarithm value, are recorded as the pre-computation table.

Algorithm 1: Pre-computation phase of the non-perfect fuzzy Hellman DLP solving method

    Randomly choose m integers 0 ≤ c_1, c_2, ..., c_m < q;
    EP_0 ← {(g^{c_i}, c_i)}_{i=1}^{m};
    for i = 1, ..., s do
        SP_i ← EP_{i−1};
        EP_i ← empty list;
        for each (y, c) ∈ SP_i do
            repeat
                c ← c + α_{ι(y)} (mod q);
                y ← F_r(y) = y · m_{ι(y)};
            until y is a DP;
            Add (y, c) to EP_i;
        end
        Remove EP_i of any duplicates;
    end
    Sort EP_s with respect to the y-component;
    Return EP_s as the pre-computation table;

The pre-computation process that has just been described is formalized as Algorithm 1 in a slightly specialized form. Namely, to add concreteness, we chose to restrict the iteration function to the r-adding walk iteration function F_r : G → G. The symbols m_i = g^{α_i} (i = 1, ..., r) are used to denote the multipliers and ι : G → {1, ..., r} represents the index function for the r-adding walk. If any part of this notation description is unclear to the reader, she should reference Appendix B at this point.
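A compact executable rendering of Algorithm 1 follows; it is a schematic sketch only, reusing is_dp and step from the toy snippet in Section 2, and is not the implementation behind the experiments of Section 5.

```python
# Sketch of Algorithm 1: s serially connected DP matrices.  Dictionary keys
# y = g^c realize the "Remove EP_i of any duplicates" step for free.
def dp_chain(y, c):
    """Extend a chain until the next distinguished point is reached."""
    while True:
        y, c = step(y, c)
        if is_dp(y):
            return y, c

def precompute_nonperfect(m, s):
    ep = {}
    while len(ep) < m:                  # EP_0: m random pairs (g^c, c)
        c = random.randrange(q)
        ep[pow(2, c, p)] = c
    for _ in range(s):                  # SP_i <- EP_{i-1}; build EP_i
        sp, ep = ep, {}
        for y, c in sp.items():
            y_end, c_end = dp_chain(y, c)
            ep[y_end] = c_end           # merging chains collapse to one entry
    return sorted(ep.items())           # the pre-computation table EP_s
```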
Elements of SP_i and EP_i will be referred to as starting and ending points, respectively. To reduce confusion, elements of SP_1 and EP_s will be referred to as the initial and terminal nodes, respectively, when we wish to emphasize their status as the true extremes of the complete concatenation of the s-many DP chain segments. The ith DP matrix, i.e., the DP matrix generated from SP_i and ending with EP_i, will be denoted by DM_i. Recall that one visualizes DM_i as a collection of chains with arrows between nodes indicating the action of F_r on them. As the DP chains are of different lengths, their collection is not rectangular, as a classical Hellman pre-computation matrix would be. Furthermore, the pre-computation DP chains often merge into each other, and we visualize such a situation as two (or more) chains of arrows starting out as being parallel and then merging at some point into one sequence of arrows. Although such a collection of chains cannot be seen as taking rectangular shape, we shall still use the term matrix, as is being done in the field of time memory tradeoffs. In keeping with our mental visualization of a DP matrix, we will treat each DM_i as having no duplicates.

Let us next describe the pre-computation phase for the perfect fuzzy Hellman method. As before, we first take S̄P_1, a set of m random elements, to be the initial starting points.
The distinct ending points of the usual DP matrix, generated from S̄P_1, are temporarily collected as the set ẼP_1, and this is further perfectized to ĒP_1 = ẼP_1 \ S̄P_1, by removing any existing initial nodes found among them. The second DP matrix is generated from S̄P_2 = ĒP_1. Iteratively, a usual DP matrix is generated from S̄P_i, its distinct ending points are temporarily collected as ẼP_i, and the smaller set

S̄P_{i+1} = ĒP_i = ẼP_i \ (S̄P_1 ∪ ··· ∪ S̄P_i),   (2)

removed of any previous starting points, is taken to be the next set of starting points. Information concerning the terminal nodes ĒP_s is recorded as the pre-computation table. A summary of this procedure, specialized to the r-adding walk iteration function case, is given as Algorithm 2.

Algorithm 2: Pre-computation phase of the perfect fuzzy Hellman DLP solving method

    Randomly choose m integers 0 ≤ c_1, c_2, ..., c_m < q;
    ĒP_0 ← {(g^{c_i}, c_i)}_{i=1}^{m};
    for i = 1, ..., s do
        S̄P_i ← ĒP_{i−1};
        ẼP_i ← empty list;
        for each (y, c) ∈ S̄P_i do
            repeat
                c ← c + α_{ι(y)} (mod q);
                y ← F_r(y) = y · m_{ι(y)};
            until y is a DP;
            Add (y, c) to ẼP_i;
        end
        Remove ẼP_i of any duplicates;
        ĒP_i ← ẼP_i \ (S̄P_1 ∪ ··· ∪ S̄P_i);
    end
    Sort ĒP_s with respect to the y-component;
    Return ĒP_s as the pre-computation table;

Note that, unlike the non-perfect case, one must be slightly careful in referring to the ith DP matrix in the perfect case, as there are at least two possible definitions. The obvious definition that should first come to mind is the collection of all chains starting from S̄P_i (and ending in ẼP_i). A more meaningful DP matrix for the perfect case might be the smaller collection D̄M_i of all chains ending in ĒP_i (among those chains that started from some element of S̄P_i). Note that, if D̄M_i ∩ D̄M_j is non-empty for some i < j, then the common node leads to a common ending point and implies that ĒP_i ∩ ĒP_j is non-empty, so that we have the contradiction of ĒP_j containing an element of S̄P_{i+1} with i + 1 ≤ j. Hence, the extra pruning of chains that are to be passed onto the next DP matrix done by the perfect version ensures that D̄M_i ∩ D̄M_j = ∅ for i ≠ j. This is in contrast with the non-perfect case, where there is no guarantee that DM_i ∩ DM_j is empty. In other words, the extra step (2) taken by the perfect version may be interpreted as the removal of inter-DP-matrix duplicates. Despite what we have explained in this paragraph, neither of the two DP matrix definitions is useful for our complexity analysis, and the symbol D̄M_i will only be used again in the next subsection.

The online phases for the non-perfect and perfect fuzzy Hellman methods are identical to that of the existing DP method, except in one small detail. We specify that, when an online DP chain segment fails to solve the given DLP instance, the next online DP chain segment is to be started from the most recently obtained ending point. That is, one does not start the next online DP chain segment from a newly randomized starting point. Making the DP chain segments independent of each other defeats the purpose of the pre-computation matrix design and invalidates the complexity analyses to be given in the next section. Algorithm 4 of Appendix B is a summary of the online phase specialized to the r-adding walk case. The appropriate pre-computation table (containing either EP_s or ĒP_s) should be used in place of the pre-computation table EP appearing there, and Option-(b) should be taken at the bottom part there.

3.3. Initial observations

The perfect fuzzy Hellman method requires a record of the accumulated starting points S̄P_1 ∪ ··· ∪ S̄P_i to be maintained for the duration of the pre-computation phase, and this is likely to grow to a size of about s times that of the final pre-computation table. However, since the parameters will be chosen so that the pre-computation table fits on the machine that is to handle the online phase, the requirement for a larger temporary storage on the (usually larger) pre-computation machine should not bring about any practical difficulties. Furthermore, we will later see that s itself need not be large.

The removal of known starting points from ẼP_i carried out during the perfect case pre-computation phase requires computational efforts, such as sorting or hashing and table searching. However, the cost of these activities would be very small in comparison to that required for the generation of all chains.
Hence, the auxiliary cost of perfectizing the fuzzy Hellman matrix will be ignored during our analysis of the algorithm.

Let us briefly present extremely rough complexity analyses of the original DP method and the fuzzy Hellman method. The parameter choice constraint mt² ≈ q for the DP method and the birthday paradox imply that the DP matrix contains Θ(mt) distinct entries and that the DP method pre-computation table is of M_DP = Θ(m) size. The same ingredients also imply that an online DP chain of expected length t has a reasonable chance of merging into the DP matrix. Hence, one needs function iterations of T_DP = Θ(t) order to solve a DLP instance with the DP method. The argument presented so far is a well known very basic one in the field of time memory tradeoffs, and the same argument is also applicable to the fuzzy Hellman situation. The constraint mt²s² ≈ q and the birthday paradox imply that the fuzzy Hellman pre-computation table is of M_FH = Θ(m) size and that each DLP instance is expected to require T_FH = Θ(ts) function iterations, i.e., the creation of Θ(s) online DP chain segments. This shows that, if a DP method parameter set (m_DP, t_DP) and a fuzzy Hellman method parameter set (m_FH, t_FH, s_FH) satisfy the relations m_DP ≈ m_FH and t_DP ≈ t_FH·s_FH, then the two algorithms should behave comparably in terms of both storage size M and online execution time T.

The above rough analysis does not preclude the possibility of still discovering small, but meaningful, performance differences. In fact, we can already state three high-level differences between the DP and fuzzy Hellman approaches. The first is that the fuzzy Hellman method requires about s times more lookups to the pre-computation table than the DP method. This is clearly a disadvantage of the fuzzy Hellman method, and how large an s can be used without the table lookups becoming the bottleneck will depend highly on the implementation platform and parameter choices. However, we will later see that the use of even small s values, such as s = 2, provides significant advantages. Unless the implementation environment is such that the table lookups already account for a significant portion of the online DLP solving time taken by the original DP method, the slightly higher frequency of table lookups required by the fuzzy Hellman method should have a negligible effect on performance.

The second difference we note is that the fuzzy Hellman method allows for a more fine-grained termination of the online phase. Let us fix a positive integer t′ and consider parameters t_DP = t′ and t_FH = t′/s for the DP and fuzzy Hellman methods, respectively. For the sake of simplicity, let us assume that the length of every DP chain is exactly the length we expect as their average (i.e., t′ and t′/s) and that a t′-length DP method DP chain and a series of s-many (t′/s)-length fuzzy Hellman method DP chains have equal probability of merging into their respective pre-computation matrices. Now, with the fuzzy Hellman method, there is the possibility for the online phase to terminate before the online chain reaches its sth DP, with an online chain of length smaller than t′, but the first chance for the online phase of the DP method to terminate appears only when the online chain reaches length t′. Furthermore, if the online phase does not terminate by the time the online chain reaches length t′, the next chance for termination of the fuzzy Hellman method is at length ((s+1)/s) × t′, whereas the same for the DP method is at length 2t′. For both the DP and fuzzy Hellman methods, the online phase can only be terminated when the online chain reaches the end of a DP chain segment, and the shorter segments of the fuzzy Hellman method allow for more timely exits from the online phase. One could hope for this to translate to a smaller online time complexity for the fuzzy Hellman method.

The final difference we note concerns the pre-computation complexity P. In the fuzzy Hellman method, iterations for some of the pre-computation chains are stopped (or merged into one) before the chain reaches the terminal nodes, with the perfect version being more aggressive in this respect than the non-perfect version. This could imply that a smaller pre-computation effort is required by the fuzzy Hellman method than the DP method to achieve comparable online tradeoff efficiencies.

Let us next comment on a difference between the two versions of the fuzzy Hellman method. In the non-perfect case, we have no guarantee that two DP sub-matrices DM_i and DM_j contain no common elements. Hence, it is possible for an online chain to merge into two different pre-computation DP matrices DM_i and DM_j (i < j) simultaneously. Such a merge would be detected and lead to the DLP solution when the merge into DM_j, which is closer to the terminal points, is further iterated and reaches the terminal points. However, due to the additional pruning of pre-computation chains, the corresponding perfect fuzzy Hellman matrix (if such an object could be defined properly) would be missing the merging pre-computation chain that existed in DM_j. In some sense, one might say that the online chain still merges into DM_i (or D̄M_i), but no longer into DM_j. The detection of this merge would be delayed until the merge into DM_i (or D̄M_i), which is further away from the terminal nodes than DM_j, reaches the terminal nodes. Thus, the perfect version is expected to be slower than the non-perfect version in such a case. At this point, one cannot be sure as to whether the longer online time of the perfect version can be compensated for by its smaller pre-computation and storage requirements.

The final remark we make in this section concerns loops. As was already explained, the length of a full online chain for the fuzzy Hellman method is expected to be of T = Θ(ts) order, and the condition mt²s² ≈ q forces this to be much smaller than √q for any reasonable set of parameters. Hence, it is rare for an online chain to loop onto itself, and such possibilities will be ignored during our analysis. However, in practice, looping online chains must be detected to prevent the program from falling into an infinite loop, and a careful implementation would extract the DLP solution from even the self collision of the online chain.

4. Complexity analysis

The average case online time, storage size, and pre-computation time complexities T, M, and P for the DP method and the two fuzzy Hellman methods will be presented in this section. Our theoretical analyses of the DLP methods will
always treat the iteration function as a random function, as is typically done in such analyses. Accurate approximations of 1 + O(1/t) order multiplicative factor will be silently written as equalities.

In the remainder of this paper, we will be using the shorthand notation

Φ(□) = 2□ / (1 + √(1 + 2□ · mt²/q)) = (√(1 + 2□ · mt²/q) − 1) / (mt²/q)   and   Φ_+(□) = Φ(□) + 1   (3)

frequently. The two symbols Φ and Φ_+ should be treated as text-expansion macros, with every right-hand side box mechanically replaced with whatever is placed within the left-hand side parentheses. The symbols m, t, and q were fixed in Section 2 to represent certain values, and these same symbols are being used in the above. In particular, one should not replace the m appearing in the above with what could be seen as "the number of starting points" where these symbols are used (typically the sizes of SP_i and S̄P_i). We will write Φ^k = Φ ∘ ··· ∘ Φ to denote the composition of k-many Φ's. For example, we have

Φ²(1/3) = Φ(Φ(1/3)) = (√(1 + 2Φ(1/3) · mt²/q) − 1) / (mt²/q) = (√(2√(1 + (2/3) · mt²/q) − 1) − 1) / (mt²/q).   (4)

The symbol Φ_+^k should be interpreted similarly. Although we are introducing the symbols Φ and Φ_+ to use them strictly as shorthand notation, readers that must have some meaning attached to them will find Lemma 2 given below helpful. In short, a DP matrix created from γm distinct starting points is expected to have Φ(γ)m distinct ending points.
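Since Φ and Φ_+ will be composed and evaluated repeatedly below, a numeric rendering (our own, intended only for later parameter work) may help attach meaning to the macros; for instance, with mt² = q one finds Φ(1) = √3 − 1 ≈ 0.732.

```python
# Numeric form of the shorthand (3).
from math import sqrt

def make_phi(m, t, q):
    a = m * t * t / q                            # the recurring ratio mt^2/q
    phi = lambda x: (sqrt(1 + 2 * x * a) - 1) / a
    phi_plus = lambda x: phi(x) + 1              # Phi_+ of (3)
    return phi, phi_plus
```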
4.1. DP method

The three complexities associated with the original DP method follow easily from existing results concerning the distinguished point time memory tradeoff algorithm. As the two lemmas recalled in this subsection will also be used later in the situation where the number of starting points is not necessarily m, we will state these with the number of starting points written as m̄.

The online time complexity of the DP method is a direct consequence of the following known lemma. Since this result is being taken from a manuscript that is unlikely to appear as a formal publication, we provide an independent proof in Appendix C.

Lemma 1 ([22, Theorem 1]). Consider a DP matrix, created from m̄ distinct starting points, under a distinguishing property of probability 1/t. When m̄t² = O(q), the probabilities for a randomly created DP chain to merge into and not to merge into this DP matrix are

Φ(m̄/m) · (mt²/q) / (1 + Φ(m̄/m) · (mt²/q))   and   1 / (1 + Φ(m̄/m) · (mt²/q)),

respectively.

Since the inverse of the merge probability is the expected number of online DP chains that need to be generated until a merge occurs, as was previously given by [22, Eq. (6)], the online time complexity of the DP method (running with parameters m and t) must be

T_DP = ( 1 + 1/(Φ(1) · mt²/q) ) × t   (5)

applications of the iteration function.

To discuss the storage complexity, we recall the following result, which is almost a direct consequence of [16, Prop. 10].

Lemma 2 ([23, Lemma 2], [18, Eq. (7)]). When m̄t² = O(q), a DP matrix, created from m̄ distinct starting points, under a distinguishing property of probability 1/t, is expected to have Φ(m̄/m) × m distinct ending points.

In other words, the storage complexity of the DP method is

M_DP = Φ(1) × m   (6)

table entries, where each entry consists of a group element and the corresponding discrete logarithm value. Finally, the pre-computation complexity of the DP method may trivially be claimed to be

P_DP = mt   (7)

applications of the iteration function.
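As a quick numeric check of (5) and (6) with the helper defined after (3), using the running example q ≈ 2^80, m ≈ 2^20, t ≈ 2^30 of Section 2 (so that mt² = q):

```python
# Checking (5) and (6) at mt^2 = q.
phi, _ = make_phi(2**20, 2**30, 2**80)
print(1 + 1 / phi(1))      # T_DP / t = 2.366..., from (5)
print(phi(1))              # M_DP / m = 0.732..., from (6)
```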
4.2. Non-perfect fuzzy Hellman

The non-perfect matrix fuzzy Hellman method is treated next. Recall that we have no guarantee of two DP matrices DM_i and DM_j (i < j) containing no elements in common.

The fuzzy Hellman matrix, when viewed as a serial concatenation of DP matrices, is quite difficult to handle with random function arguments, because the iterations appearing within an earlier DP matrix DM_i could determine the walks to be taken by some chain in a later DP matrix DM_j. However, one can obtain a simpler manageable view by mentally cutting the serial connections between DP matrices and aligning the starting points of different DP matrices into a single column. The proofs given below are easier to understand with this re-structuring in mind.

The online time complexity of the fuzzy Hellman method is related to the probability of chain merges, and this is connected to the number of starting points by Lemma 1. Our analysis starts with a claim concerning the effective number of starting points.

Lemma 3. The gathering DM_1 ∪ ··· ∪ DM_k of the first k DP matrices is expected to contain Φ_+^k(0) × m distinct starting points.

Proof. The probability for a randomly chosen element of the group G to belong to the set SP_2 ∪ ··· ∪ SP_{j+1} is of O(mj/q) = O(1/(t²s)) order, which is very small for any reasonable choice of parameters. Since the initial points SP_1 were chosen at random from the group, the fraction of these starting points that also appear among SP_2 ∪ ··· ∪ SP_{j+1} will be negligible. In other words, the intersection SP_1 ∩ (SP_2 ∪ ··· ∪ SP_{j+1}) is likely to be only a negligible part of SP_1.

Let us temporarily use asp_{i,j} = (1/m)|SP_i ∪ SP_{i+1} ∪ ··· ∪ SP_j| to denote the number of distinct accumulated starting points, counted in multiples of m. Combining the above discussion with the observation SP_2 ∪ ··· ∪ SP_{j+1} = EP_1 ∪ ··· ∪ EP_j and Lemma 2, we can write

asp_{1,j+1} = 1 + asp_{2,j+1} = 1 + Φ(asp_{1,j}) = Φ_+(asp_{1,j}).

Hence, starting from asp_{1,1} = 1 = Φ_+(0), we can apply this relation iteratively to claim asp_{1,k} = Φ_+^k(0). □

This lemma will be used later in analyzing the perfect case. Analysis of the non-perfect case requires the following similar claim.

Lemma 4. The gathering DM_{s−k+1} ∪ ··· ∪ DM_s of the last k DP matrices is expected to contain Φ^{s−k}(Φ_+^k(0)) × m distinct starting points.

Proof. Let us adopt the notation asp_{i,j} from the previous proof. Applying Lemma 2 to the observation SP_{i+1} ∪ ··· ∪ SP_{j+1} = EP_i ∪ ··· ∪ EP_j, we can claim the relation asp_{i+1,j+1} = Φ(asp_{i,j}). Since we know from Lemma 3 that Φ_+^k(0) = asp_{1,k}, we can apply the previous relation s − k times to state that asp_{s−k+1,s} = Φ^{s−k}(Φ_+^k(0)). □

The online time complexity of the non-perfect fuzzy Hellman method is a corollary to this lemma.

Theorem 5. The non-perfect matrix version of the fuzzy Hellman algorithm is expected to require

T_NFH = { Σ_{i=0}^{s−1} ∏_{k=1}^{i} 1/(1 + Φ^{s−k+1}(Φ_+^k(0)) · mt²/q)
        + (1/(Φ(Φ_+^s(0)) · mt²/q)) · ∏_{k=1}^{s−1} 1/(1 + Φ^{s−k+1}(Φ_+^k(0)) · mt²/q) } × t

applications of the iteration function in solving each DLP instance.

Proof. The first DP chain segment of the online chain is always generated. (The only possible exception is when the DLP target is one of the pre-computation table entries, which can be ignored, as it happens with negligible probability Θ(m/q).) The second DP chain segment is generated if and only if the first segment did not merge into DM_s. The third segment is generated if and only if the first segment did not merge into DM_{s−1} ∪ DM_s and the second segment did not merge into DM_s. Similar statements may be made for subsequent DP chain segments.

The probabilities for each of the events described above may be computed by combining Lemmas 1 and 4. More specifically, for each 0 ≤ i ≤ s, the probability for the (i + 1)th DP chain segment of the online chain to be generated is

∏_{k=1}^{i} 1/(1 + Φ(Φ^{s−k}(Φ_+^k(0))) · mt²/q).

The i ≥ s cases require only a little more attention. The (i + 1)th DP chain segment of the online chain is generated if and only if each of the first i − s + 1 segments did not merge into DM_1 ∪ ··· ∪ DM_s, the (i − s + 2)th segment did not merge into DM_2 ∪ ··· ∪ DM_s, the (i − s + 3)th segment did not merge into DM_3 ∪ ··· ∪ DM_s, ..., and the ith segment did not merge into DM_s. The probability for such an event to occur is

( 1/(1 + Φ(Φ_+^s(0)) · mt²/q) )^{i−s} · ∏_{k=1}^{s} 1/(1 + Φ(Φ^{s−k}(Φ_+^k(0))) · mt²/q).

The claimed online time complexity is the sum of the above mentioned probabilities multiplied by t, the expected length of each online DP chain segment. □

It is clear from the iterative use of Lemma 2 that the storage requirement of the non-perfect fuzzy Hellman method, running with parameters m, t, and s, is

M_NFH = Φ^s(1) × m   (8)

pre-computation table entries, and that its cost of pre-computation is

P_NFH = ( Σ_{k=0}^{s−1} Φ^k(1) ) × mt   (9)

applications of the iteration function.
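Theorem 5, (8), and (9) are straightforward to evaluate numerically. The helper below (our code, building on make_phi defined after (3)) returns the normalized triple (P_NFH/(mts), T_NFH/(ts), M_NFH/m); these are the normalizations that reappear as the "Theory" rows of Table 1 in Section 5.

```python
# Evaluating Theorem 5, (8), and (9) for a parameter set (m, t, s).
from math import prod

def nonperfect_predictions(m, t, s, q=88629945038963):
    phi, _ = make_phi(m, t, q)
    a = m * t * t / q
    def phik(k, x):                     # Phi^k(x)
        for _ in range(k):
            x = phi(x)
        return x
    def pplus(k):                       # Phi_+^k(0)
        x = 0.0
        for _ in range(k):
            x = phi(x) + 1
        return x
    miss = lambda i: prod(1 / (1 + phik(s - k + 1, pplus(k)) * a)
                          for k in range(1, i + 1))
    T = sum(miss(i) for i in range(s)) + miss(s - 1) / (phi(pplus(s)) * a)
    M = phik(s, 1)                                      # (8), per m
    P = sum(phik(k, 1) for k in range(s))               # (9), per mt
    return P / s, T / s, M

print(nonperfect_predictions(11248, 2**15, 5))   # ~(0.8920, 0.9668, 0.7563)
```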
4.3. Perfect fuzzy Hellman

Arguments of the previous subsection that led to the time complexity of the non-perfect fuzzy Hellman method were centered on the probabilities for the online chain to merge into each DP matrix. In treating this, since the non-perfectness of the pre-computation matrix makes it possible for the online chain to merge into multiple DP matrices simultaneously, we were careful to focus on the merge that was closest to the terminal points. However, once the position of effective merge of the online chain into the pre-computation matrix was fixed, further iterations of the chain were clearly foreseeable, and the number of additional DP chain segments that needed to be traveled through before arriving at a terminal node was evident.

The situation is quite different with the perfect fuzzy Hellman method. The perfectness of the pre-computation matrix, i.e., the discussion given below (2), makes it possible to create a meaningful definition of the position of merge between the online chain and the pre-computation matrix. However, the discussion given near the end of Section 3.3 shows that iterations of the online chain after it merges into the pre-computation matrix are not as predictable and traceable as with the non-perfect case.

As successive DP chain segments that extend out from the position of merge are generated, it may happen that the ending point of one generated segment becomes one of the DP matrix ending points that were removed during the perfectization process described by (2). In such a case, the path followed by the online chain jumps backwards to a DP matrix that sits closer to the initial points and takes a longer route to one of the pre-computation matrix terminal nodes. For example, suppose that x ∈ ẼP_j ∩ S̄P_i with i ≤ j. If an online chain merges into the pre-computation matrix and its subsequent iterations eventually reach x, then the creations of (s − j)-many further DP chain segments will not produce an element of ĒP_s, the pre-computation table, because x does not belong to S̄P_{j+1}. To trace the evolution of the merged online chain after it reaches x, since (what might be referred to as) the (j + 1)th DP matrix is not relevant in any way, one can only jump back to the ith DP matrix, which has x ∈ S̄P_i as one of its starting points. In fact, this bouncing back operation could happen several times before the online chain reaches a terminal node and produces a solution to the DLP instance. Strictly logically, there is even the possibility for a closed loop to exist within the perfect pre-computation matrix, which could prevent an unfortunate online chain from ever reaching a terminal point, but we have already explained that such an event can be ignored during our complexity analysis.

To deal with this unpredictable and complex situation, we define PrTm_i(n), for each 1 ≤ i ≤ s and n ∈ Z_{≥0}, to be the probability for a series of DP chains that starts from an element of ĒP_i to reach the terminal nodes after passing through exactly n DP chain segments. For example, the boundary values

PrTm_i(0) = { 0 (for 1 ≤ i < s), 1 (for i = s) }   and   PrTm_s(n) = { 1 (for n = 0), 0 (for n ≥ 1) }   (10)
are direct consequences of the definition. It is also clear that PrTm_i(n) = 0 whenever i + n < s, and one would intuitively expect the PrTm_i(s − i) value to dominate all other PrTm_i(n) values, for each fixed i. We will find a way to compute these termination probabilities iteratively, and then connect these values to the time complexity of the perfect fuzzy Hellman method.

Our first lemma that prepares for this task should be intuitively reasonable, even though it is slightly tedious to prove.

Lemma 6. Let us consider both the perfect and non-perfect fuzzy Hellman matrices generated from the same set of starting points SP_1 = S̄P_1. With the notation extended to EP_0 = ĒP_0 = SP_1, we have

ĒP_i = (EP_0 ∪ ··· ∪ EP_i) \ (EP_0 ∪ ··· ∪ EP_{i−1}),

for every 0 ≤ i ≤ s.

Proof. The claim is trivially true for i = 0. Let us take the i = 0, ..., k cases of the claim as our induction hypotheses. Noting that the claim is equivalent to ĒP_i = EP_i \ (EP_0 ∪ ··· ∪ EP_{i−1}), we can easily verify that the induction hypotheses imply

ĒP_0 ∪ ··· ∪ ĒP_k = EP_0 ∪ ··· ∪ EP_k.   (11)

After interpreting this relation as S̄P_1 ∪ ··· ∪ S̄P_{k+1} = SP_1 ∪ ··· ∪ SP_{k+1}, one can generate DP chains from elements of both sides and collect their ending point DPs to claim

ẼP_1 ∪ ··· ∪ ẼP_{k+1} = EP_1 ∪ ··· ∪ EP_{k+1}.   (12)

Now, by definition and (11), we have

ĒP_{k+1} = ẼP_{k+1} \ (S̄P_1 ∪ ··· ∪ S̄P_{k+1}) = ẼP_{k+1} \ (ĒP_0 ∪ ··· ∪ ĒP_k) = ẼP_{k+1} \ (EP_0 ∪ ··· ∪ EP_k),

and the simple observation ẼP_i ⊂ EP_i allows us to extend this series of equalities to

ĒP_{k+1} = (ẼP_1 ∪ ··· ∪ ẼP_k ∪ ẼP_{k+1}) \ (EP_0 ∪ ··· ∪ EP_k).

Finally, we can substitute (12) into the above to find

ĒP_{k+1} = (EP_1 ∪ ··· ∪ EP_{k+1}) \ (EP_0 ∪ ··· ∪ EP_k) = (EP_0 ∪ EP_1 ∪ ··· ∪ EP_{k+1}) \ (EP_0 ∪ ··· ∪ EP_k),

completing the induction step. □

The next lemma translates the above claim concerning sets into a claim concerning probabilities.

Lemma 7. The probability for a random DP chain to end in ĒP_i is

1/(1 + Φ(Φ_+^{i−1}(0)) · mt²/q) − 1/(1 + Φ(Φ_+^i(0)) · mt²/q),

for every 1 ≤ i ≤ s.

Proof. Lemma 6 states that a DP chain will end in ĒP_i if and only if it ends in EP_0 ∪ ··· ∪ EP_i, but does not end in EP_0 ∪ ··· ∪ EP_{i−1}.

The probability for a random DP chain to end in EP_0 is m/q = Θ(1/(t²s²)). In comparison, a very rough analysis implies that a random DP chain will end in EP_1 ∪ ··· ∪ EP_i with a probability of Θ(1 − (1 − imt/q)^t) = Θ(i/s²) order. Since the former probability is much smaller than the latter probability, we may safely ignore the EP_0 part and (almost) claim that a random chain will end in ĒP_i if and only if it ends in EP_1 ∪ ··· ∪ EP_i, but does not end in EP_1 ∪ ··· ∪ EP_{i−1}.

Since one of the two sets being considered contains the other, the probability in question can be stated as a difference of probabilities provided by Lemma 1, where the correct m̄ to be used is provided by Lemma 3. The explicit probability is

Φ(Φ_+^i(0)) · (mt²/q) / (1 + Φ(Φ_+^i(0)) · mt²/q) − Φ(Φ_+^{i−1}(0)) · (mt²/q) / (1 + Φ(Φ_+^{i−1}(0)) · mt²/q),

and this is equal to what is claimed. □
As a corollary to this lemma, we can state the following relation that is satisfied by the termination probabilities.

Proposition 8. For each 1 ≤ i < s, the probability PrTm_i(n) for a string of DP chain segments that starts from ĒP_i = S̄P_{i+1} to terminate at ĒP_s after generating exactly n DP chain segments satisfies the relation

PrTm_i(n + 1) = Σ_{k=1}^{i} ( 1/(1 + Φ(Φ_+^{k−1}(0)) · mt²/q) − 1/(1 + Φ(Φ_+^k(0)) · mt²/q) ) · PrTm_k(n)
              + ( 1/(1 + Φ(Φ_+^i(0)) · mt²/q) ) · PrTm_{i+1}(n).

Proof. A DP chain that starts from ĒP_i has the possibility of merging into one of the DP matrices that are located between ĒP_i and the initial points S̄P_1. The probabilities for such events are stated by Lemma 7, for each DP matrix, in a manner that deals with disjoint sets of events. If the DP chain that starts from ĒP_i does not merge into any of the previous DP matrices, it must reach ĒP_{i+1}. Furthermore, by definition of the ending points ĒP_k, the chain cannot simultaneously reach any of the ĒP_k with k > i + 1. It now suffices to combine what has been discussed into one relation.

Note that we have silently ignored the very small possibility of DP chain segments ending in ĒP_0 = S̄P_1. This is justified by the discussion given within the proof of Lemma 7. □
This proposition makes it possible, in practice, to obtain each PrTm_i(n) value, for any given set of parameters, through the recursive computations that start from the boundary values (10). Even though these can then be used to compute the time complexity of the perfect fuzzy Hellman method, we wish to eliminate the need for iterative computations. To do this, we take a closer look at the expected number of DP chain segments that must be traversed before a random chain that starts from ĒP_i terminates in ĒP_s. More specifically, we wish to express

SC_i = Σ_n n × PrTm_i(n),   (13)

in a manner that does not involve any iterative computations, for each 1 ≤ i ≤ s.

The last term SC_s is trivially zero. As for the other terms, one can combine Proposition 8 with the observation that Σ_{n=0}^{∞} PrTm_i(n) = 1 to obtain the relation

SC_i = Σ_{k=1}^{i} ( 1/(1 + Φ(Φ_+^{k−1}(0)) · mt²/q) − 1/(1 + Φ(Φ_+^k(0)) · mt²/q) ) SC_k
     + ( 1/(1 + Φ(Φ_+^i(0)) · mt²/q) ) SC_{i+1} + 1,   (14)

for each 1 ≤ i < s. When SC_s = 0 is substituted, this becomes a system of s − 1 linear equations in the s − 1 unknowns SC_i (1 ≤ i < s). The explicit solution to this linear system, which can be solved by hand, is

SC_i = Σ_{j=1}^{s−i} ∏_{k=1}^{s−j} ( 1 + Φ(Φ_+^k(0)) · mt²/q ).   (15)

Writing down the time complexity of the perfect fuzzy Hellman method is now a matter of appropriately combining the various concepts we have discussed so far.

Theorem 9. The perfect matrix version of the fuzzy Hellman algorithm is expected to require

T_PFH = ( 1/(Φ(Φ_+^s(0)) · mt²/q) ) · ∏_{k=1}^{s} ( 1 + Φ(Φ_+^k(0)) · mt²/q ) × t

applications of the iteration function in solving each DLP instance.

Proof. A series of online DP chains will begin with a few, say i-many, segments that completely miss hitting the pre-computation matrix. It will then merge into the pre-computation matrix and directly reach a well-defined specific ĒP_j. Finally, it will travel through the pre-computation DP matrices, creating more DP chain segments, until a terminal node is reached.

The probability for the first series of missing events to occur is given by a combination of Lemma 1 and Lemma 3. The probability for the merging event is given by Lemma 7. Finally, the work factor associated with such a situation is expected to be (i + 1 + SC_j) × t random walk iterations. The time complexity may hence be written as

Σ_{i=0}^{∞} Σ_{j=1}^{s} ( 1/(1 + Φ(Φ_+^s(0)) · mt²/q) )^i · ( 1/(1 + Φ(Φ_+^{j−1}(0)) · mt²/q) − 1/(1 + Φ(Φ_+^j(0)) · mt²/q) ) · (i + 1 + SC_j) × t.

This infinite sum simplifies to what is claimed. Knowing the relation

SC_i = SC_{i+1} + ∏_{k=1}^{i} ( 1 + Φ(Φ_+^k(0)) · mt²/q )

can be helpful in carrying out the simplification. □

The remaining two complexities of interest are relatively easy to state. One can argue from Lemmas 3 and 6 that the storage requirement of the perfect fuzzy Hellman method, running with parameters m, t, and s, is

M_PFH = ( Φ_+^{s+1}(0) − Φ_+^s(0) ) × m   (16)

pre-computation table entries. Since Lemma 6 implies that the disjoint accumulation of the starting points S̄P_1 ∪ ··· ∪ S̄P_{k+1} = ĒP_0 ∪ ··· ∪ ĒP_k is in fact equal to the usual union EP_0 ∪ ··· ∪ EP_k = SP_1 ∪ ··· ∪ SP_{k+1}, one can relate to Lemma 3 once more to claim that the cost of pre-computation is

P_PFH = Φ_+^s(0) × mt   (17)

applications of the iteration function.
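Theorem 9, (16), and (17) likewise reduce to a few lines of arithmetic. The helper below (our code, same conventions as the non-perfect helper above) returns (P_PFH/(mts), T_PFH/(ts), M_PFH/m), the normalizations used in the "Theory" rows of Table 2.

```python
# Evaluating Theorem 9, (16), and (17) for a parameter set (m, t, s).
from math import prod

def perfect_predictions(m, t, s, q=88629945038963):
    phi, _ = make_phi(m, t, q)
    a = m * t * t / q
    def pplus(k):                       # Phi_+^k(0)
        x = 0.0
        for _ in range(k):
            x = phi(x) + 1
        return x
    T = prod(1 + phi(pplus(k)) * a for k in range(1, s + 1)) \
        / (phi(pplus(s)) * a)                           # Theorem 9, per t
    M = pplus(s + 1) - pplus(s)                         # (16), per m
    P = pplus(s)                                        # (17), per mt
    return P / s, T / s, M

print(perfect_predictions(17300, 2**13, 10))    # ~(0.8539, 1.739, 0.5694)
```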
Table 1. Theoretical and experimental complexities for the non-perfect fuzzy Hellman method (P: pre-computation time; T: online time; M: storage size).

m        t      s   mt²s²/q            P_NFH/(mts)   T_NFH/(ts)   M_NFH/m   Failures
11248    2^15   5   3.407     Theory   0.8920        0.9668       0.7563
                              Test     0.8929        0.9738       0.7587    0.0074%
36350    2^14   5   2.752     Theory   0.9092        1.053        0.7920
                              Test     0.9102        1.061        0.7945    0.0020%
110960   2^13   5   2.100     Theory   0.9278        1.185        0.8316
                              Test     0.9290        1.195        0.8338    0.0014%
19150    2^14   5   1.450     Theory   0.9479        1.423        0.8763
                              Test     0.9486        1.435        0.8785    0.0092%
169500   2^12   5   0.802     Theory   0.9698        2.011        0.9268
                              Test     0.9704        2.028        0.9279    0.0010%
15668    2^16   2   3.037     Theory   0.8866        1.207        0.6249
                              Test     0.8881        1.216        0.6283    0.0138%
41280    2^15   2   2.000     Theory   0.9142        1.412        0.7043
                              Test     0.9161        1.424        0.7072    0.0058%
82550    2^14   2   1.000     Theory   0.9495        1.963        0.8158
                              Test     0.9504        1.982        0.8178    0.0048%
Table 2. Theoretical and experimental complexities for the perfect fuzzy Hellman method (P: pre-computation time; T: online time; M: storage size).

m        t      s    mt²s²/q            P_PFH/(mts)   T_PFH/(ts)   M_PFH/m   Failures
8940     2^12   30   1.523     Theory   0.8158        1.632        0.5131
                               Test     0.8182        1.648        0.5187    0.0238%
25900    2^11   30   1.103     Theory   0.8573        1.797        0.6069
                               Test     0.8592        1.811        0.6112    0.0078%
65800    2^10   30   0.701     Theory   0.9029        2.241        0.7203
                               Test     0.9041        2.261        0.7236    0.0040%
17300    2^13   10   1.310     Theory   0.8539        1.739        0.5694
                               Test     0.8557        1.751        0.5743    0.0092%
47600    2^12   10   0.901     Theory   0.8923        2.009        0.6675
                               Test     0.8937        2.025        0.6712    0.0032%
5. Experimental verification

The theoretical findings of the previous section are verified experimentally in this section.

To fix the group on which to carry out the DLP experiments, we first generated random 48-bit primes p until the 47-bit integer q = (p − 1)/2 turned out also to be a prime and 2 ∈ F_p belonged to the corresponding multiplicative subgroup of order q. The explicit primes we obtained were p = 177259890077927 and q = 88629945038963. All experiments described below, other than the ones which we later explicitly state as involving the AES block cipher, were carried out in the prime order q cyclic group ⟨2⟩ ≤ F_p^×, with the 64-adding walk iteration function.

The fuzzy Hellman method was executed and the cost of pre-computation, the number of entries in the resulting pre-computation table, and the cost of solving randomly generated DLP instances with the pre-computation table were recorded. For each set of parameters m, t, and s, we generated 100 pre-computation tables, and attempted to solve 5000 randomly generated DLP instances with each pre-computation table. A fresh set of 64 multiplier exponents was chosen at random before the creation of each pre-computation table. Any pre-computation or online chain that did not produce a DP for 20·t consecutive iterations was discarded. Any online chain that did not reach one of the pre-computation table entries within its first 50·s DP chain segments was also discarded.

The results of our tests are summarized in Tables 1 and 2 for the non-perfect and perfect fuzzy Hellman methods, respectively. The theoretical values in the tables were computed from Theorem 5, (8), (9), Theorem 9, (16), and (17). The wasted cost of generating the discarded chains is contained in the experimentally obtained pre-computation and online time complexities. Every discarded online chain is treated as a failure to solve the given DLP instance. The failure counts divided by 5 × 10^5, the number of attempted DLP instances for each parameter set, are given in the rightmost columns of the tables. As should be expected, those parameter sets for which the theoretical online time complexities T_NFH or T_PFH are close to √q exhibit relatively higher failure rates. Our test implementation simply discarded the infrequent online chains falling into loops, but a practical implementation would solve many of these failed DLP instances, at almost no extra cost, by uncovering the Pollard's rho style self-collision of the online chain.
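As a cross-check, the "Theory" entries of Tables 1 and 2 (Table 3 shares the theoretical values of Table 2) can be reproduced with the two prediction helpers defined in Section 4:

```python
# Reproducing sample "Theory" rows of Tables 1 and 2.
for row in [(11248, 2**15, 5), (82550, 2**14, 2)]:
    print(row, nonperfect_predictions(*row))   # Table 1: rows 1 and 8
for row in [(17300, 2**13, 10), (47600, 2**12, 10)]:
    print(row, perfect_predictions(*row))      # Table 2: rows 4 and 5
```

Each call agrees with the corresponding table row to the printed precision.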
Table 3
Theoretical and experimental complexities of the perfect fuzzy Hellman method, executed with an iteration function devised from AES (P: pre-computation time; T: online time; M: storage size).

m      t    s   mt²s²/q   P_PFH/mts (Theory / Test)   T_PFH/ts (Theory / Test)   M_PFH/m (Theory / Test)   Failures
8940   2¹²  30  1.523     0.8158 / 0.8154             1.632 / 1.647              0.5131 / 0.5125           0.0246%
25900  2¹¹  30  1.103     0.8573 / 0.8572             1.797 / 1.806              0.6069 / 0.6066           0.0096%
65800  2¹⁰  30  0.701     0.9029 / 0.9030             2.241 / 2.243              0.7203 / 0.7203           0.0040%
17300  2¹³  10  1.310     0.8539 / 0.8541             1.739 / 1.740              0.5694 / 0.5700           0.0082%
47600  2¹²  10  0.901     0.8923 / 0.8920             2.009 / 2.014              0.6675 / 0.6671           0.0052%
It is clear that our experimental data and theoretical predictions are in very good agreement. In fact, most of the rates of difference between theory and test in the two tables are below 1%, the only exception being (0.5187 − 0.5131)/0.5131 = 1.091%, calculated for the data in the top right part of Table 2. Nevertheless, it is easily noticed that each experimental datum is always larger than the corresponding theoretical prediction. Thus, the differences, although very small, are more likely to be signs of true differences existing between theory and reality than a display of experimental noise.

Recall that the behavior of the r-adding walk iteration function is known to be slightly different from that of a random function. To test whether this was the reason behind the small differences, we ran another batch of experiments, with the 64-adding walk iteration function replaced by another function. The function we used was, in short, the key-to-ciphertext mapping, under a fixed plaintext, of the AES block cipher.

Let us describe this function in slightly more detail. The function maps a non-negative integer smaller than q to another such integer. First, a 128-bit plaintext is randomly chosen and fixed. The choice of this plaintext may be seen as corresponding to the choice of r-adding walk multipliers. Given an integer input, one writes it in binary form and zero-extends it to a 128-bit AES key. Then, this key is used to encrypt the fixed plaintext. Finally, the lower 64 bits of the ciphertext are reduced modulo the 47-bit prime q to be taken as the output of the function.
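As a sketch only, this mapping can be realized in Python as follows, assuming the pycryptodome package; the variable names are ours and not part of the original test implementation.

import os
from Crypto.Cipher import AES

q = 88629945038963            # the 47-bit prime fixed earlier
PLAINTEXT = os.urandom(16)    # randomly chosen once, then kept fixed

def aes_walk(x: int) -> int:
    # Zero-extend the input to a 128-bit AES key and encrypt the fixed plaintext.
    key = x.to_bytes(16, "big")
    ct = AES.new(key, AES.MODE_ECB).encrypt(PLAINTEXT)
    # Take the lower 64 bits of the ciphertext, reduced modulo q.
    return int.from_bytes(ct[-8:], "big") % q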
Results of tests carried out with this function are summarized in Table 3. The perfect fuzzy Hellman method was executed, except that the r-adding walk iteration function was replaced by the function constructed from AES. All other details of the experiment, such as the number of repetitions, were kept identical to those of the previous tests. The modified algorithm can no longer solve DLPs, but a run was deemed successful if the online chain reached a point in the pre-computation table.

The agreement between the theoretical and experimental values in Table 3 is much tighter than in Table 2. Furthermore, the experimental values are no longer consistently greater than the theoretical values. The non-random behavior of the r-adding walk iteration function was indeed the main reason behind the peculiar trends observed in Tables 1 and 2. The very small differences observed between the experimental data and our theoretical predictions are inevitable consequences of treating the r-adding walk iteration function as a random function.

6. Performance discussion

Let us gather the results of our analyses to compare the performance of the fuzzy Hellman method with those of the original DP method and the BL method [3].

6.1. Tradeoff curves

We start by writing down two relations satisfied by the complexities P, T, and M, for each of the DLP methods.

DP method. The two complexity claims (5) and (6) for the original DP method can be combined into the tradeoff curve

\[
\frac{T_{\mathrm{DP}}^2 M_{\mathrm{DP}}}{q} = \Phi(1)\,\frac{mt^2}{q}\left(1+\frac{1}{\Phi(1)\,\frac{mt^2}{q}}\right)^{2}. \tag{18}
\]

Note that the right-hand side may be seen as a function of mt²/q, rather than of m and t separately. This implies that one can decide on a value of mt²/q, so that the right-hand side is held constant, and still trade off T_DP and M_DP against each other through further choices of either m or t. The above equation is not just a relation that happens to be satisfied by the complexities T_DP and M_DP, but a display of the T_DP-M_DP tradeoff possibilities of the DP method.
The combination of time complexities (5) and (7) gives the second tradeoff curve

\[
\frac{P_{\mathrm{DP}} T_{\mathrm{DP}}}{q} = \frac{mt^2}{q} + \frac{1}{\Phi(1)} \tag{19}
\]
for the DP method, which appeared previously in [22, Eq. (7)]. Once again, this shows that tradeoffs between P_DP and T_DP are possible. We remark that the work [22] could not present the tradeoff curve (18), because that work did not have access to the expression (6) for M_DP.

Non-perfect fuzzy Hellman method. The tradeoff curves for the non-perfect fuzzy Hellman method are direct consequences of Theorem 5, (8), and (9).

\[
\frac{T_{\mathrm{NFH}}^2 M_{\mathrm{NFH}}}{q}
= \Phi(1)\,\frac{mt^2}{q}
\Biggl\{\sum_{i=0}^{s-1}\prod_{k=1}^{i}\Bigl(1+\tfrac{1}{s-k+1}\,\Phi\bigl(\Phi_+^{k}(0)\bigr)\,\tfrac{mt^2}{q}\Bigr)\Biggr\}^{2}
\Biggl(1+\frac{\prod_{k=1}^{s-1}\bigl(1+\tfrac{1}{s-k+1}\,\Phi\bigl(\Phi_+^{k}(0)\bigr)\,\tfrac{mt^2}{q}\bigr)}{\Phi\bigl(\Phi_+^{s}(0)\bigr)\,\tfrac{mt^2}{q}}\Biggr)^{2} \tag{20}
\]

\[
\frac{P_{\mathrm{NFH}} T_{\mathrm{NFH}}}{q}
= \Biggl\{\sum_{i=0}^{s-1}\prod_{k=1}^{i}\Bigl(1+\tfrac{1}{s-k+1}\,\Phi\bigl(\Phi_+^{k}(0)\bigr)\,\tfrac{mt^2}{q}\Bigr)\Biggr\}\,\frac{mt^2}{q}
+ \frac{1}{\Phi\bigl(\Phi_+^{s}(0)\bigr)}\prod_{k=1}^{s-1}\Bigl(1+\tfrac{1}{s-k+1}\,\Phi\bigl(\Phi_+^{k}(0)\bigr)\,\tfrac{mt^2}{q}\Bigr). \tag{21}
\]

For each fixed s, the right-hand sides may be viewed as functions of mt²/q, rather than of the separate variables m and t.
Since the constraints of fixing s and mt²/q still leave a single degree of freedom concerning the parameters m and t, this freedom can be used to trade off P_NFH, T_NFH, and M_NFH against each other. Note that the s = 1 case of the non-perfect fuzzy Hellman method is precisely the DP method. The above tradeoff curves reduce to the previous DP method curves (18) and (19) when s = 1 is substituted.

Perfect fuzzy Hellman method. The tradeoff curves for the perfect fuzzy Hellman method may be obtained from Theorem 9, (16), and (17).

\[
\frac{T_{\mathrm{PFH}}^2 M_{\mathrm{PFH}}}{q}
= \frac{\Phi_+^{s+1}(0)-\Phi_+^{s}(0)}{\Bigl\{\Phi\bigl(\Phi_+^{s}(0)\bigr)\,\tfrac{mt^2}{q}\Bigr\}^{2}}\,
\frac{mt^2}{q}\,
\prod_{k=1}^{s}\Bigl(1+\Phi\bigl(\Phi_+^{k}(0)\bigr)\,\tfrac{mt^2}{q}\Bigr)^{2}. \tag{22}
\]

\[
\frac{P_{\mathrm{PFH}} T_{\mathrm{PFH}}}{q}
= \frac{\Phi_+^{s}(0)}{\Phi\bigl(\Phi_+^{s}(0)\bigr)}\,
\frac{mt^2}{q}\,
\prod_{k=1}^{s}\Bigl(1+\Phi\bigl(\Phi_+^{k}(0)\bigr)\,\tfrac{mt^2}{q}\Bigr). \tag{23}
\]
Once again, for each fixed s, the right-hand sides may be viewed as functions of mt²/q, and tradeoffs between P_PFH, T_PFH, and M_PFH are possible. The perfect fuzzy Hellman method at s = 1 differs slightly from the DP method in that the (initial) starting points are removed from its (terminal and only) ending points. However, since the m starting points were chosen at random, the probability for each of the ending points to be removed is m/q = Θ(1/(t²s²)), which is small enough to be ignored. Hence, it is not surprising that the above reduces to (18) and (19) when s = 1 is substituted.

BL method. The paper [3] that introduced the BL method did not provide complexity claims that are accurate enough for the purpose of comparing different algorithms, and it seems that obtaining such information will not be easy. So we do not have access to tradeoff curves for the BL method that are analogous to those we have made available for the DP and fuzzy Hellman methods. However, the series of experiments described in [3] show that it is reasonable to expect at least a T-M tradeoff of the form T²_BL M_BL ≈ const. The situation concerning the P-T tradeoff is less clear, but the way Table 4.1 of [3] was presented seems to suggest that the authors were expecting a tradeoff of the form P_BL T_BL ≈ const. Hence, we will assume that the BL method allows tradeoffs analogous to those of the other DLP methods. In any case, this assumption cannot work to the disadvantage of the BL method during our performance comparison.

6.2. Upper level tradeoffs

Let us briefly return to the tradeoff curves for the DP method. It is easy to check that the right-hand side of (18) is minimized at mt²/q = 1.5. Hence, to make optimal use of the online time and storage resources, one would wish to restrict oneself to m and t such that mt² = 1.5q, while working out an appropriate tradeoff between T and M. On the other hand, it is easy to see that a smaller mt²/q returns a smaller right-hand side value of (19), and this is also something one would wish for, as it allows the pre-computation effort to be transferred more efficiently to faster online DLP solving. Thus, we can only conclude that, due to the interdependence of the right-hand sides of (18) and (19), it is not possible to optimize both the T_DP-M_DP tradeoff and the P_DP-T_DP tradeoff simultaneously.
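To make the interplay between the two curves concrete, the small script below evaluates the right-hand sides of (18) and (19) over a range of mt²/q values. The value assigned to Φ(1) here is a hypothetical placeholder, since Φ is defined by (3) earlier in the paper; substitute the value given there.

import numpy as np

PHI1 = 2 / 3   # hypothetical placeholder for Phi(1)

def dp_curves(x, phi1=PHI1):
    # x stands for mt^2/q; returns the right-hand sides of (18) and (19).
    t2m = phi1 * x * (1 + 1 / (phi1 * x)) ** 2   # T^2 M / q, curve (18)
    pt = x + 1 / phi1                            # P T / q, curve (19)
    return t2m, pt

for x in np.linspace(0.5, 3.0, 6):
    t2m, pt = dp_curves(x)
    print(f"mt^2/q = {x:.2f}: T^2M/q = {t2m:.4f}, PT/q = {pt:.4f}")

With the placeholder value chosen here, the minimum of (18) falls at mt²/q = 1/Φ(1) = 1.5, in line with the observation made above, while (19) keeps decreasing as mt²/q shrinks.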
Fig. 1. The choice of tradeoff coefficient pairs made possible by the DP and fuzzy Hellman methods. x-axis: PT/q; y-axis: T²M/q; LHS box: non-perfect fuzzy Hellman at s = 1, 2, 3, 4, 5; RHS box: perfect fuzzy Hellman at s = 1, 2, 5, 10, 30. Within each box, a lower curve corresponds to a larger s value. The DP method is the fuzzy Hellman method at s = 1 (dashed curves).
The observation that the right-hand sides of the two tradeoff curves are not independent applies also to the two fuzzy Hellman methods, and we should expect the same of the BL method. This complicates the task of choosing parameters, or of defining the notion of an optimal set of parameters, since the relative importance of the T-M and P-T tradeoffs will be different for each application situation of a DLP method. For example, an implementer with access to only a very limited amount of pre-computation resources would choose to use an mt²/q value for which the right-hand side of PT/q is small, even if it makes the right-hand side of T²M/q large. Then, a smaller pre-computation cost P can be invested in achieving a reasonable online time T, and the implementer would accept the use of a larger storage size M.

More generally, one can decide on an appropriate upper level balance between the T-M tradeoff possibilities and the P-T tradeoff possibilities by choosing an mt²/q value, after which the tradeoffs between P, T, and M are governed by the two tradeoff curves, with fixed right-hand side values. We emphasize once more that, in many situations, optimizing just the T-M tradeoff curve will not be in the implementer's best interest. This issue is further illustrated in Appendix D, where we show examples of how the results of our accurate analysis can be used to choose implementation parameters that are deemed most appropriate for a given situation.

6.3. DP versus fuzzy Hellman

As discussed in the previous subsection, one would wish to decide on a suitable upper level balance between the efficiencies of the T-M and P-T tradeoffs. Fig. 1 provides the information required to make this balancing decision for the two fuzzy Hellman methods in a compact manner. It presents the tradeoff coefficient pairs that are made possible by the non-perfect and perfect fuzzy Hellman methods. Each curve was drawn by plotting the tradeoff coefficient pair coordinates (PT/q, T²M/q), as given by the formulas of Section 6.1, using mt²/q as a parameter, under a fixed s.
The horizontal axes give the PT/q values and the vertical axes give the T²M/q values. The left-hand side box is for the non-perfect fuzzy Hellman method, with the five curves corresponding to s values 1 through 5. The right-hand side box is for the perfect fuzzy Hellman method, with the five curves corresponding to s values 1, 2, 5, 10, and 30. In both boxes, the curve for a larger s value is the one that appears closer to the lower left corner. The two top dashed curves in the two boxes are identical to each other, with both corresponding to the choice of s = 1. In other words, the common curve presents the upper level tradeoff of the DP method.

The single dot on each curve marks the curve's lowest point, i.e., the point of optimal online efficiency. Parameters that correspond to the curve segments extending to the right past these dots should not be used, as they bring about lower online efficiency at higher pre-computation cost than the lowest point. However, parameters corresponding to curve points that appear to the left of each dot are meaningful, unless the cost of pre-computation can be ignored completely.

Within each box of Fig. 1, a curve appearing closer to the lower left corner exhibits better overall performance capabilities, since the points on the curve correspond to better online efficiency at more efficient use of pre-computation. It is clear that both the non-perfect and perfect fuzzy Hellman methods are better than the DP method, even when a small s value is used. It is also clear from the curves that, even though a larger s value is always better, the additional advantage gained by each successively larger s value saturates quite quickly,¹ and that there is no need to use a very large s value at the possible disadvantage of a higher table lookup frequency.

¹ Some readers may find Appendix E interesting in relation to this matter.
Fig. 2. The tradeoff coefficient pairs of the fuzzy Hellman method in comparison to those of the BL method. x-axis: PT/q; y-axis: T²M/q; DP (dashed line); non-perfect fuzzy Hellman at s = 5 (thin line); perfect fuzzy Hellman at s = 30 (thick line); BL with N = T (◦); BL with N = 2T (×); BL with N = 8T (⋆).
Comparing the non-perfect and perfect fuzzy Hellman methods against each other, one finds that the perfect method can achieve somewhat better online efficiencies at slightly lower pre-computation costs. However, a larger s value must be used to take advantage of this possibility, and since a larger s brings about a higher table lookup frequency, there could still be situations where the use of the non-perfect fuzzy Hellman method with a smaller s is more practical.

6.4. BL versus fuzzy Hellman

As we are lacking a theoretical treatment of the BL method, our comparison of the BL method against the fuzzy Hellman method will rely entirely on the data listed in Table 4.1 of [3]. We will treat each experimental data set consisting of a (P_BL, T_BL, M_BL) triple as providing an instance of the tradeoff coefficient pair (P_BL T_BL/q, T²_BL M_BL/q).

The performances of the fuzzy Hellman method and the BL method can be compared through Fig. 2. The curve for the DP method and the two lowest curves from the two boxes of Fig. 1 have been copied over. The curves here may look different from those of Fig. 1, but this is only due to the change in scale of the horizontal axis. The experimental data points for the BL method, interpreted as discussed above, have also been plotted. For example, the coordinates (1.785, 4.41) for the highest ◦-mark were obtained from the entries 0.84506 and 2.13856 appearing in Table 4.1 of [3] through the calculations 0.85 · 2.1 = 1.785 and 2.1² = 4.41.

The group of five data points indicated by the ◦-marks is for the BL method running under the N = T setting. The N = T setting reduces the BL method to the DP method, and, as expected, the ◦-marks are somewhat close to the dashed line, although the alignments are not perfect. One can verify through a closer examination of Table 4.1 of [3] that their pre-computation time data are smaller than the predictions made through a combination of (6) and (7), and that their online time data are larger than the predictions of (5). Since we know from experience that r-adding walks are less likely to collide into one another than true random walks, the non-random characteristics of r-adding walks may well be a meaningful contributor to the small discrepancy, although we cannot be sure.

The group of five ×-marks indicates the data for the BL method running under the N = 2T setting. This is when T pre-computation table entries that seem more useful are carefully selected from a collection of 2T distinct ending point DPs. The five data points represent better online efficiency than the dashed curve for the original DP method. However, they are slightly weaker in performance than the non-perfect fuzzy Hellman method at s = 5 and visibly worse than the perfect fuzzy Hellman method at s = 30. In fact, even the perfect fuzzy Hellman method at s = 10, which we have not plotted here, is preferable to the BL method at N = 2T.

The two ⋆-marks are for the BL method at N = 8T. Three more data sets for this case were given in Table 4.1 of [3], but we did not plot them here, since they would have been placed further to the right than the two ⋆-points plotted here, while being slightly higher. The two ⋆-marks are positioned slightly lower than the lowest point of the curve for the perfect fuzzy Hellman method at s = 30. Being lower indicates a smaller T-M tradeoff coefficient, but this should not be confused with being faster in DLP solving. As an example, let us consider the lowest perfect fuzzy Hellman point at coordinates (2.03, 2.08) and the best ⋆-coordinates of (21.7, 1.92). To achieve identical online DLP solving time, using parameters that correspond to these coordinates, the BL method must invest 10.7 = 21.7/2.03 times more pre-computation effort than the perfect fuzzy Hellman method. On the other hand, the table sizes required to store the results of these two sets of pre-computations will differ only by a factor of 1.08 = 2.08/1.92, with the BL method at an advantage. More explicitly, Table 4.1 of [3] reports the experimentally obtained values P_BL = 5.966 × 10⁹, T_BL = 1.026 × 10⁶, and M_BL = 512, on a group of size q = 2⁴⁸ − 313487. According to Theorem 9, (16), and (17), the perfect fuzzy Hellman tradeoff, executed with parameters m = 1086,
t = 20944, and s = 30, is expected to bring about P_PFH = 5.567 × 10⁸, T_PFH = 1.026 × 10⁶, and M_PFH = 557.2. The ratios 5.966 × 10⁹ / 5.567 × 10⁸ = 10.72 and 557.2/512 = 1.088 are as we have claimed.

The BL method in the N = 8T setting can achieve slightly better online efficiency than the fuzzy Hellman method, but the amount of pre-computation that must be invested in order to make use of this advantage will be much larger than that required by the fuzzy Hellman method. It seems reasonable to expect the large pre-computation advantage to be preferred over the small online resource advantage, when implementers are confronted with the two explicit figures we have briefly discussed. However, the opposite choice could still be made in some situations, depending on the intended application and the implementation environment.

When the three groups of data points plotted for the BL method are considered, and the locations of other BL method data points are imagined through interpolation and extrapolation of the three data groups, one may safely conclude that the fuzzy Hellman method is likely to be preferred to the BL method, when pre-computation cost, online time, and storage size are all taken into account. However, when the cost of pre-computation can be ignored completely, the BL method is at an advantage. To be fair, we acknowledge that the BL method remains to be developed further. The main idea of the BL method was to single out the more useful DP chains from a larger pool of chains in creating the pre-computation table, but the BL method did not completely specify the rules for choosing which chains to retain. Since the specific rules used in producing the experimental data of [3] were based on heuristics, there remains the possibility for the BL method to show better performance later, with the discovery of better rules for chain selection.

6.5. Number of bits allocated to each table entry

The comparisons of the previous subsection may seem reasonable, but we must clarify that we have hidden one detail that needs to be taken into account during comparisons. Up to this point, the storage requirement of each algorithm was represented by the number of pre-computation table entries, and we had implicitly assumed that each pre-computation table entry consisted of a full record of the group element and its discrete logarithm value. However, some of this information can be discarded with very little impact on the performances of the DLP algorithms, and different algorithms may respond differently to such storage reduction techniques.

The work [3] presented many techniques that can be used to reduce the number of bits required to store each table entry. Let us now explain that each of these techniques may also be applied to the fuzzy Hellman method, if at most log s more bits than are necessary for the DP or BL method are allocated to each of its pre-computation table entries. Since we have observed that there is very little need to use an s that is over 5 bits long, our previous ignoring of the number of bits allocated to each table entry would be justified to a large extent. During the discussion below, we will assume that the DP, BL, and fuzzy Hellman methods are all running under comparable parameters.
In other words, we assume that a common, or similar order, pre-computation table entry count M is used by the three algorithms, and that the distinguishing properties are chosen so that t_DP ≈ t_BL ≈ t_FH s_FH, which would ensure that the three algorithms exhibit online time complexities of similar order. We will recall two specific techniques from [3] and explain our claim through these examples. Once these are understood, the applicability of other techniques to the fuzzy Hellman method will be evident to anyone that knows these other techniques.

The first storage reduction technique we recall is to simply truncate each group element record to log M bits. Then each table lookup is expected to return a single match, which might actually only be a partial match and a false alarm. The crucial observation is that the verification of whether a match is a false alarm requires only a single exponentiation that checks for the correctness of the suggested DLP solution. Since the cost of an exponentiation is very small in comparison to the effort of creating an online DP chain, the extra computation is unlikely to affect the online time visibly. Now, compared to the DP and BL methods, the fuzzy Hellman method is expected to make s_FH times more frequent table lookups. Since s_FH will be small, the cost of these s_FH table lookups and s_FH exponentiations should still be negligible in comparison to T ≈ t_FH s_FH multiplications, the cost of an online chain creation. In the unlikely case where the s_FH times more frequent exponentiations do cause problems, one can choose to retain log M + log s_FH bits, rather than log M bits, of each group element. Then only one of the s_FH table lookups is expected to return a false alarm, and the frequency of exponentiations is reduced to the level expected of the DP and BL methods. The fuzzy Hellman method will still require s_FH times more frequent table lookups, but its false alarm verification frequency will be similar to those of the DP and BL methods.

The second storage reduction technique we consider is an extension of the first technique. Suppose one removes the group element record completely from each table entry. Then each table lookup will return M collisions, and one must resolve these alarms with M exponentiations. This computation is no longer small, but the extra work might still be negligible in comparison to the DP chain generation effort in some situations. For example, assuming that (3/2) log q multiplications are required per exponentiation, in the case of the DP or BL methods, satisfaction of (3/2) M log q ≪ t_DP ≈ t_BL allows the extra cost of exponentiations to be ignored. An analogous condition for the fuzzy Hellman method would be (3/2) M log q ≪ t_FH ≈ t_DP/s_FH, which is less likely to be met than the condition for the DP and BL methods. However, if log s_FH bits of group element information are restored to each table entry, the collisions returned by each table lookup reduce from M to M/s_FH, and the condition changes to (3/2)(M/s_FH) log q ≪ t_FH ≈ t_DP/s_FH. The fuzzy Hellman method is now similar to the DP and BL methods in terms of the frequency of extra exponentiations.
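A toy sketch of the first technique may clarify the bookkeeping involved; all names here are ours, and the 64-bit group element encoding is assumed purely for illustration.

# Keep only the most significant `keep` bits of each 64-bit ending point;
# a table hit then only suggests a match, which one exponentiation confirms.
def truncate(point: int, keep: int) -> int:
    return point >> (64 - keep)

def lookup(table, x, a, b, g, h, p, q, keep):
    # table maps truncated ending points to their recorded exponents c.
    c = table.get(truncate(x, keep))
    if c is None:
        return None
    cand = (c - a) * pow(b, -1, q) % q    # candidate solution of a + b*x = c (mod q)
    # Resolve the possible false alarm with a single exponentiation.
    return cand if pow(g, cand, p) == h else None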
Of course, in both of the techniques discussed, the addition of all log s_FH bits of group element information is not always necessary. It suffices to add only as many bits as are necessary to bring the cost of exponentiations down to a degree that is negligible in comparison to the main chain creation costs. Addition of the full log s_FH bits only ensures that the fuzzy Hellman method is at the same level as the comparably parameterized DP or BL method in terms of the false alarm resolving cost.

There is one storage optimization technique explained in [3] that is not covered by the above discussion. The technique is to discard a large number of less significant bits of the discrete logarithm information from each table entry and to recover the lost information during the online phase by treating each of them as another DLP instance defined on a smaller domain. This recursive DLP treatment requires the basic DLP solving algorithm to be able to work on intervals, as with Pollard's kangaroo (lambda) algorithm. The kangaroo style analogue of the fuzzy Hellman method is currently not available, and we cannot provide a similar recursion. However, this storage reduction technique can still be used in the form where the top level DLP is handled by the fuzzy Hellman method and the lower level recursions are carried out by the kangaroo versions of the DP or BL method. In fact, since the cost of pre-computation will be of lesser importance for these smaller DLPs, the BL method with an aggressive table shrinking setting could be more reasonable for the recursion steps.

In summary, we can claim that the fuzzy Hellman method is almost as competent as the DP or BL methods in view of the number of bits required to store each pre-computation table entry. In the worst case, the fuzzy Hellman method may require up to log s more bits per table entry, which is quite small, than the DP or BL method.

7. Conclusion

In this work, we presented the fuzzy Hellman method, which is a new algorithm for solving DLPs faster after a one-time pre-computation phase. Our algorithm increases the practicality of using cryptographic schemes that rely on the trapdoor discrete logarithm groups of the RSA setting. We presented a full complexity analysis of our algorithm and compared its performance with those of the existing pre-computation aided DLP solving algorithms.

As illustrated in Appendix D, the accurate formulas obtained through our complexity analysis allow an implementer of the fuzzy Hellman method to choose parameters that are most appropriate for the intended application and implementation environment. The tight control over the DLP method's online phase behavior given to the implementer at the parameter selection stage is of practical importance, since the pre-computation phase can often be too costly to be repeated.

Our new algorithm is clearly advantageous over the DP method, which is essentially the only widely known algorithm for DLP solving with the aid of pre-computation. We can further claim that, when both pre-computation cost and online efficiency are taken into account, the fuzzy Hellman method is preferable to the more recent BL method. In an attempt to make the comparison of algorithm performances reasonable and practically meaningful, we took all three complexity aspects of the algorithms, namely, pre-computation time, online time, and storage size, into account.
However, the condensed conclusions we have given above should not be taken as final, as there are certain limitations to our comparisons. For example, performances of the three compared algorithms on parallel processing platforms, such as GPUs, are clearly outside the scope of this work. In fact, except in the multi-target setting, none of the three methods allows for meaningful parallel processing of its online phase. One must at least break away from the normal guidelines for choosing their respective parameters, namely, the matrix stopping rules, and this modification changes the behaviors of the algorithms significantly, necessitating a separate analysis. The analysis given in this work assumed a single-core environment and largely ignored the fact that the table lookup frequency of our algorithm is higher than those of the previous algorithms, although by only a small factor.

Despite the limitations of our performance comparison, our findings may be taken as strong encouragement to consider the use of the fuzzy Hellman method in a wide range of real-world situations. Also, even though our explicit complexity formulas cannot be applied to parallel processing situations, the theoretical arguments that lead to our conclusions will be the starting point from which one can work to predict the behavior of our algorithm on various specific implementation platforms and calculate parameters that are optimal for the targeted application situation.

There is an interesting direction of study which is closely related to the current work. This is the study of pre-computation aided algorithms for solving DLPs in short intervals, i.e., the analogues of Pollard's kangaroo (lambda) algorithm. In fact, the article [3] that proposed the BL method was mainly focused on the short interval DLPs. One less visible difference between the DP method and the fuzzy Hellman method is in the choice of the starting point of a new online DP chain, after a previous DP chain segment has failed to produce the DLP solution. The fuzzy Hellman method requires the multiple online chain segments to be serially connected and not freshly started each time. This hidden difference prevents a straightforward kangarooization of the fuzzy Hellman method from inheriting the advantageous characteristics brought about by the structure of the fuzzy Hellman pre-computation matrix. Creating a variant of the fuzzy Hellman method that is suitable for the short interval DLPs seems to be an interesting problem.

Acknowledgments

JH was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2012R1A1B4003379). HL was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (NRF-2012R1A1A2008392).
Appendix A. Pollard's rho algorithm

This section presents a short introduction to Pollard's rho algorithm [33] for solving DLPs. The algorithm discussed in this section does not involve a pre-computation phase. This section need not be read by anyone familiar with the rho algorithm and related concepts, such as r-adding walks and distinguished points.

We will let G = ⟨g⟩ denote a finite cyclic group of prime order q, generated by an element g, and the DLP target will always be set to h ∈ G. More precisely, a specific encoding scheme for presenting elements of G is fixed and, given any h ∈ G expressed in this scheme, one aims to find the integer 0 ≤ x < q such that h = g^x.

A.1. Pollard's classical algorithm

Let us start by describing Pollard's iteration function F_P : G → G. One first fixes a partition of G into three roughly equal sized subsets. The partition G = G_1 ∪ G_2 ∪ G_3 should be such that, given a group element expressed in the encoding scheme for G, it is easy to identify which subset G_i the element belongs to. With the partition fixed, the iteration function is defined as follows.
\[
F_P(y)=
\begin{cases}
gy, & \text{if } y\in G_1,\\
y^2, & \text{if } y\in G_2,\\
hy, & \text{if } y\in G_3.
\end{cases}
\tag{A.1}
\]
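As an illustration only, the following toy realization of F_P works in a subgroup ⟨g⟩ ≤ F_p^× of prime order q, with the three-way partition chosen by the residue of y modulo 3 purely for convenience.

def pollard_step(y, a, b, g, h, p, q):
    # One application of F_P, maintaining exponents (a, b) with y = g^a h^b.
    part = y % 3
    if part == 0:                          # y in G_1: multiply by g
        return y * g % p, (a + 1) % q, b
    if part == 1:                          # y in G_2: square
        return y * y % p, 2 * a % q, 2 * b % q
    return y * h % p, a, (b + 1) % q       # y in G_3: multiply by h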
To solve a given DLP instance h, one prepares a starting point g_0 = g^{a_0} h^{b_0}, where 0 ≤ a_0 < q and 0 < b_0 < q are chosen at random, and uses this to iteratively compute the sequence (g_i)_{i≥0} through the rule

\[
g_{i+1} = F_P(g_i) \quad (i \geq 0). \tag{A.2}
\]
Since G is a finite set, the infinite sequence (g_i)_{i≥0} of elements from G cannot all be distinct. Thus, an element that appeared before must eventually be revisited, after which the sequence can only retrace the previously traversed steps. Now, note that F_P has been defined in such a way that one can easily keep track of integers a_i and b_i (modulo q) satisfying g_i = g^{a_i} h^{b_i}. Hence, as soon as a collision g_i = g_j is discovered, one can solve the linear congruence equation

\[
a_i + b_i x \equiv a_j + b_j x \pmod{q} \tag{A.3}
\]
and recover the solution x = log_g h to the given DLP instance h.

More generally, given any fixed operator F : G → G and some initial element g_0 ∈ G, one can consider the infinite sequence (g_i)_{i≥0} of elements from G defined iteratively through g_{i+1} = F(g_i). As before, since G is a finite set, a collision is inevitable. Let µ ≥ 0 and λ ≥ 1 be the smallest integers that satisfy g_{λ+µ} = g_µ for the sequence. It is known [9,26] that, when the operator F is chosen uniformly at random from the set of all functions operating on G, the rho length λ + µ is expected to be √(πq/2) ≈ 1.253√q. Even though Pollard's iteration function F_P is not a random function and tests indicate that it takes more than √(πq/2) iterated applications of F_P for the created sequence to reach a collision [41,42], the rho length of the associated sequence has been shown [19] to be of O(√q) order.

A.2. r-adding walks

The computational complexity of the rho method could be reduced if one could replace Pollard's iteration function with another iteration function for which the expected rho length of the associated sequence is closer to √(πq/2). Let r be a small positive integer and let us fix a partition G = G_1 ∪ · · · ∪ G_r of G into r-many subsets of similar sizes. The index function ι : G → {1, 2, . . . , r} for this partition is defined by setting ι(y) = i for y ∈ G_i. For each i = 1, . . . , r, integers 0 ≤ α_i, β_i < q that are not both zero are chosen at random and the multipliers m_i = g^{α_i} h^{β_i} are computed. Finally, the r-adding iteration function F_r : G → G is set to

\[
F_r(y) = y\, m_{\iota(y)}. \tag{A.4}
\]
That is, one of the r-many randomly chosen, but pre-determined, elements m_i ∈ G is multiplied to the input, depending on which subset G_i the input belongs to, so that the r-adding walk is given by g_{i+1} = F_r(g_i) = g_i m_{ι(g_i)}. As with Pollard's original iteration function F_P, one can easily keep track of the integers a_i and b_i satisfying g_i = g^{a_i} h^{b_i}. Hence, as before, any collision within the iterated sequence leads directly to the solution of the DLP instance.

This method was introduced by [37], and the work [36] provided some support to the claim that a collision is expected after O(√q) iterations when r ≥ 8. Testing [42] done on prime order elliptic curve groups showed that the average rho length of a sequence generated with a 20-adding iteration function is very close to that expected of a random function. Although it is known that the characteristics of the r-adding walks are different from those expected of walks created with a random function, the differences are known to be small on groups of prime order, as long as r is not too small. Some more information concerning the characteristics of the r-adding walks can be found in [5,8,41].
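The following compact sketch follows the description just given, with the index function realized as reduction modulo r; it is illustrative only, and the parameter choices are ours.

import random

def r_adding_walk(p, q, g, h, r=8, steps=100):
    alphas = [random.randrange(q) for _ in range(r)]
    betas = [random.randrange(q) for _ in range(r)]
    mults = [pow(g, al, p) * pow(h, be, p) % p for al, be in zip(alphas, betas)]
    a, b = random.randrange(q), random.randrange(1, q)
    y = pow(g, a, p) * pow(h, b, p) % p    # starting point g^a h^b
    for _ in range(steps):
        i = y % r                          # index function iota(y)
        y = y * mults[i] % p
        a = (a + alphas[i]) % q            # keep track of y = g^a h^b
        b = (b + betas[i]) % q
    return y, a, b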
A.3. Cycle detection

It remains to discuss how one might detect a collision g_i = g_j appearing within the cycle part of the iterated walk. Note that it is not necessary to detect the initial collision g_{λ+µ} = g_µ, but one would still wish to detect a collision within a small distance from the point of initial collision. The most straightforward approach would be to accumulate all points g_i in a sorted list until a collision is discovered. This method requires the storage space to be of Θ(√q) size, and there are other methods that require much smaller storage. A method that requires the minimum amount of storage space is to wait for a collision of the type g_i = g_{2i} to occur, while updating the two current states g_i and g_{2i} to g_{i+1} and g_{2(i+1)}, respectively, at each iteration. This method, which requires three applications of the iteration function per iteration, is attributed to R. W. Floyd in [20, p. 7]. Another method [6] searches for the smallest power of two that is larger than both the λ and µ values mentioned previously in this section, and there are other methods [28,38].

The most widely used method of cycle detection seems to be the one involving the concept of distinguished points. The use of distinguished points was originally proposed for use with the time memory tradeoff techniques, and its use in cycle detection was suggested in [34,35]. A distinguished point (DP) is any element of G that satisfies a distinguishing property, which is simply a preset condition of one's design. In practice, the encoding scheme for the group G almost always results in its elements being represented as bit strings, and there is usually no reason to expect any non-uniform characteristics of the encoding. The typical distinguishing property in such a situation requires a fixed number of most significant bits of an element to be zero. For example, if one wants a random element of G to be a DP with probability 1/2¹⁰, then one could define DPs to be those elements of G whose ten most significant bits are zero.

To detect collisions with the DP approach, one starts with an empty table of DPs, and iteratively computes the iterated walk as usual. Whenever the current element g_i is found to be a DP, the element is searched for in the table of DPs and is added to the table if it is not found. If 1/t is the probability for a random element of G to be a DP, then the DP collision detection method is expected to require t extra applications of the iteration function after the initial collision within the walk, before terminating with a collision from the table of DPs. The probability 1/t is chosen so that the table of DPs is kept at a manageable size, while the number of extra iterations does not become too large.

Appendix B. Distinguished point method

In this section, we describe the most natural approach to solving DLPs with the aid of pre-computation. The approach was referred to as the DP method in the main text of this paper. We assume the reader is familiar with how Pollard's rho algorithm [33] (without pre-computation) leads to a solution of a given DLP, the r-adding walks, and the concept of distinguished points (DPs). All of these were covered in Appendix A, and we will continue to work with the notation used there. In particular, we assume a cyclic group G = ⟨g⟩ of prime order q and write the DLP instance as h.

Let us start by describing the overall structure of the DP method, which is an adaptation of the rho algorithm.
We assume that a suitable iteration function F : G → G and a distinguishing property have been fixed. A finite sequence (g_i)_{i=0}^{n} of elements of G, defined iteratively through g_{i+1} = F(g_i) and often visualized as

\[
g_0 \xrightarrow{F} g_1 \xrightarrow{F} \cdots\cdots \xrightarrow{F} g_n, \tag{B.1}
\]
is said to be a chain of length n having starting point g_0 and ending point g_n. If the ending point g_n is a DP, but none of the previous elements g_i (0 < i < n) were, then the chain is said to be a DP chain. (Whether or not to apply the non-DP condition to the starting point g_0 would be a design choice.) In the pre-computation phase, a large number of DP chains are generated, and the ending points of these chains, together with their corresponding discrete logarithm values, are recorded as the pre-computation table. During the online phase, another DP chain that starts from the DLP target h (or its modification) is created and the ending point of the chain is compared with the ending points recorded in the table. As with Pollard's rho method, a match implies that a certain linear congruence relation holds, and this leads directly to the solution of the given DLP instance. Note that a match of ending points of two chains occurs if and only if the two chains merged together somewhere between their starting and ending points.

It should be obvious to anyone that understands the basic mechanism of Pollard's rho algorithm that, for the DP method to work, the iteration functions used in the pre-computation and online phases have to be the same. Since the pre-computation must be carried out without knowledge of the DLP instance h, Pollard's original iteration function, which depends on h, cannot be used. One solution to this problem is to use a special form of the r-adding walk iteration function. Specifically, instead of using multipliers of the form m_i = g^{α_i} h^{β_i}, one uses multipliers of the form m_i = g^{α_i}, so that the function is defined independently of the DLP instance h. Note that, as long as the exponents α_i are chosen at random, since both g^{α_i} and g^{α_i} h^{β_i} are random elements of the group G, the characteristics of the r-adding walks do not depend on which type of multipliers is in use. Other constructions that do not depend on h are possible, such as the mixing of the squaring operation into this special form of r-adding walks.

The r-adding walk iteration function made its first appearance in the public literature through [37], where multipliers of the form g^{α_i} were used in an algorithm designed to factorize integers. The article [36] analyzed the characteristics of the iterated walk and credited H. W. Lenstra, Jr. with proposing the r-adding walk. The paper also mentioned DLP solving as another application for the r-adding walks. Multipliers of the form m_i = g_1^{α_{i,1}} · · · g_k^{α_{i,k}} were introduced by [40], which studied the structure of abelian groups, and this took the form g^{α_i} h^{β_i} in the context of DLP solving in [41]. Most current
uses of the r-adding walk in relation to solving DLPs will follow this approach, but multipliers of the form g^{α_i}, which do not depend on h, have also been used [3,8,14,22,30] many times in relation to DLPs. In fact, Pollard's kangaroo method [33] for solving DLPs on short intervals, which predates [37], uses an r-adding walk with multipliers of the form g^{α_i}, although the exponents α_i were not chosen at random.

Let us now go into more details of the DP method. To prepare for the DP method, one first fixes an iteration function F : G → G that does not require knowledge of the DLP instance h for its computation. The structure of F should be such that the exponent forms of the iteratively produced elements are traceable, as was the case with the functions F_P and F_r discussed in the previous section. After choosing positive integer parameters m and t subject to the matrix stopping rule mt² ≈ q, a distinguishing property that is satisfied by a random element of G with probability 1/t is fixed to define DPs. This completes the overall setup for the DP method.

Algorithm 3: Pre-computation phase of the DP method
input : cyclic group G = ⟨g⟩ with generator g and fixed encoding scheme, parameters m and t satisfying mt² ≈ q, distinguishing property of probability 1/t, and an r-adding walk setting consisting of index function ι : G → {1, . . . , r}, multipliers m_i = g^{α_i}, and exponents α_i.
output: pre-computation table for the DP method

Randomly choose m distinct integers 0 ≤ c_1, c_2, . . . , c_m < q;
SP ← {(g^{c_i}, c_i)}_{i=1}^{m}; EP ← empty list;
for each (y, c) ∈ SP do
    repeat
        c ← c + α_{ι(y)} (mod q); y ← F_r(y) = y m_{ι(y)};
    until y is a DP;
    Add (y, c) to EP;
end
Sort EP with respect to the y-component;
Remove any duplicates from EP;
Return EP as the pre-computation table;
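A minimal Python rendering of Algorithm 3 might look as follows; the distinguishing property, realized here through a simple modular condition, and all other concrete choices are ours. (The leading-zero-bits property of Appendix A.3 would work equally well.)

import random

def is_dp(y: int, t: int = 2**10) -> bool:
    return y % t == 0                          # probability about 1/t

def precompute(p, q, g, m, r=8, max_len=10**6):
    alphas = [random.randrange(q) for _ in range(r)]
    mults = [pow(g, al, p) for al in alphas]   # multipliers of the form g^alpha_i
    EP = {}
    for _ in range(m):
        c = random.randrange(q)
        y = pow(g, c, p)                       # starting point g^c
        for _ in range(max_len):
            if is_dp(y):
                EP[y] = c                      # dict keys: duplicates collapse
                break
            i = y % r
            c = (c + alphas[i]) % q
            y = y * mults[i] % p
        # chains that never reach a DP within max_len steps are discarded
    return EP, alphas, mults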
To begin the pre-computation phase, one chooses m elements of G and designates these as the starting points of what would eventually become the pre-computation DP matrix. These elements are taken to be of the form g^c, with integers 0 ≤ c < q chosen at random. Chains are generated from each of these starting points through repeated applications of the iteration function F, while keeping track of the discrete logarithm values of the elements. Each of the m pre-computation chains is continued until the first DP of the chain is observed. These DPs are referred to as the ending points of the pre-computation chains. The ending points of all the DP chains are gathered, together with their corresponding discrete logarithm values, into one table. The table is sorted according to the ending point DPs to facilitate later lookups, and any duplicate information is removed before being recorded to long term storage as the pre-computation table. The sorting process requires much less effort than the generation of all the pre-computation chains, and the cost of sorting is usually ignored during algorithm complexity analyses. It is also possible to use a hash table structure instead of sorting the data. The collection of all chains that were generated while creating the pre-computation table, or the gathering of all group elements that appeared in these chains, together with imaginary arrows between them signifying the iterated applications of F : G → G, is the pre-computation DP matrix. The pre-computation phase that was just described is summarized in Algorithm 3. To make the presentation more concrete, we assumed the use of the r-adding walk iteration function F_r : G → G with multipliers m_i = g^{α_i}, as was given by (A.4).

The online phase starts when the DLP instance h is given. One randomizes the DLP instance by setting the starting point to x_0 = g^{a_0} h^{b_0}, for some randomly chosen integers 0 ≤ a_0 < q and 0 < b_0 < q, and then generates the online chain (x_i)_{i=0}^{n} through iterated computations of x_{i+1} = F(x_i), using the exact same iteration function that was used in the pre-computation phase. The online chain is iteratively computed until the first DP is observed, and this ending point is searched for in the pre-computation table. Since F was chosen so that one can keep track of the exponents a_i and b_i such that x_i = g^{a_i} h^{b_i}, if a match g^{a_n} h^{b_n} = g^c between an online chain element x_n and an entry g^c from the pre-computed table is found, then the discrete logarithm of h = g^x can easily be computed from the linear equation a_n + x b_n ≡ c (mod q). Note that not only the representation of g^c in some fixed encoding scheme for G, which allows for the discovery of the match, but also the exponent c itself can be retrieved from the pre-computation table.

If the table search did not return a match, one generates another online chain. Either the ending point of the previous online chain that did not produce a collision or a freshly randomized starting point of the form g^{a_0} h^{b_0} could be used as the starting point of the new online chain. The creation of more online DP chains is continued until the given DLP instance is solved. A basic birthday paradox argument that utilizes the matrix stopping rule mt² ≈ q shows that a collision and a solution to the DLP is likely to be produced by the DP method with a small number of online chains. The online phase is described formally by Algorithm 4, where we are assuming, as before, the use of the r-adding walk iteration function.
Algorithm 4: Online phase of the DP method
input : cyclic group G = ⟨g⟩ with generator g and fixed encoding scheme, DLP target h, pre-computation table EP = {(y, c)}_y, distinguishing property, and an r-adding walk setting consisting of index function ι : G → {1, . . . , r}, multipliers m_i = g^{α_i}, and exponents α_i.
output: discrete logarithm of h

Randomly choose integers 0 ≤ a < q and 0 < b < q;
x ← g^a h^b;
repeat
    a ← a + α_{ι(x)} (mod q); x ← F_r(x) = x m_{ι(x)};
until x is a DP;
if x = y for some (y, c) ∈ EP then
    Solve for x in a + bx ≡ c (mod q);
    Return x as the DLP solution;
else
    Either (a) start over with freshly generated a and b; or (b) go back to the repeat-until loop with the current x, a, and b;
end
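The matching online phase can be sketched as below, reusing is_dp and the outputs of precompute from the sketch given after Algorithm 3; option (b) of the algorithm, extending the current chain, is the branch chosen here.

import random

def online_phase(p, q, g, h, EP, alphas, mults, r=8, max_chains=50):
    a, b = random.randrange(q), random.randrange(1, q)
    x = pow(g, a, p) * pow(h, b, p) % p        # randomized start g^a h^b
    for _ in range(max_chains):
        while not is_dp(x):
            i = x % r
            a = (a + alphas[i]) % q            # b never changes: m_i = g^alpha_i
            x = x * mults[i] % p
        if x in EP:
            return (EP[x] - a) * pow(b, -1, q) % q   # solve a + b*x = c (mod q)
        i = x % r                               # no match: extend the chain
        a = (a + alphas[i]) % q
        x = x * mults[i] % p
    return None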
Technically, one must be aware that it is possible for a chain generated during either the pre-computation phase or the online phase to fall into an infinite loop and never reach a DP. Hence, any real-world implementation must employ a mechanism for dealing with this situation, and something as simple as a suitably large chain length bound could be used for this purpose. On the other hand, for parameters of interest, occurrences of these loops are infrequent and the possibility of encountering loops can be ignored during complexity analyses.

Appendix C. Proof of Lemma 1

A proof of Lemma 1 appeared in the manuscript [22], which is unlikely to appear in a more formal publication. Here, we provide another proof that is simpler than that of [22].

It is known [16, Prop. 10] that a non-perfect DP matrix, created from m̄ distinct starting points, with a distinguishing property of probability 1/t, is expected to contain Φ(m̄/m) × mt distinct entries, where Φ is as defined by (3). We clarify that this claim refers to the number of DP matrix entries and not to the number of distinct pre-computation table entries, which was given by Lemma 2. We also remark that, although the work [16] assumed m̄t² = Θ(q), the proof contained therein holds true even when m̄t² = O(q).

Based on the stated fact, we can claim that the probability for a chain that is created through iterations of the random function to reach a DP without merging into a randomly created DP matrix is
\[
\frac{1}{t}
+ \Bigl(1-\frac{\Phi(\bar m/m)\,mt}{q}-\frac{1}{t}\Bigr)\frac{1}{t}
+ \Bigl(1-\frac{\Phi(\bar m/m)\,mt}{q}-\frac{1}{t}\Bigr)^{2}\frac{1}{t}
+ \cdots, \tag{C.1}
\]
which is precisely the second probability stated by Lemma 1. The first probability is simply the complement of the second probability.

Appendix D. Choosing parameters

This section illustrates how the results of the theoretical analysis carried out in this paper can be used to choose parameters for the fuzzy Hellman method that are most appropriate for the implementation environment and application under consideration. It will become evident that our accurate complexity and tradeoff formulas are invaluable tools for parameter selection, especially when the pre-computation phase is too costly to be repeated multiple times.

Let us consider the situation where one wishes to implement a trapdoor discrete logarithm group of the RSA setting for use on a modern PC. We will make a few assumptions: (a) The design of the trapdoor discrete logarithm group is such that each given DLP requires the owner of the trapdoor information to solve a smaller DLP in a group of order very close to 2⁷⁰; (b) A single core of the modern PC is capable of performing 2²⁰ group operations or DLP solving iteration function walks per second; (c) All operations required to solve a given DLP, other than the DLP solving in the 2⁷⁰-order group, add negligible wall-clock time to the target DLP solving on the multi-core modern PC. The numbers 2⁷⁰ and 2²⁰ to be used throughout this section have been fixed for illustrative purposes, and we make no claims as to how realistic these are.
In fact, these two figures will be very different for every situation, and no single set of explicit numbers can be claimed to be typical.

Let us start with a discussion of the very rough complexity figures. A straightforward approach to the DLP in the group of order 2⁷⁰ requires 2³⁵ function iterations. This translates to 9.1 h of computation at 2²⁰ function iterations per second, which is quite impractical for everyday use. According to the approximate tradeoff curves (1), a pre-computation aided DLP solving algorithm can perform with the more acceptable figures of

P ≈ 2⁴³ ≈ 24.3 days × 4 cores, T ≈ 2²⁷ ≈ 2.1 mins, and M ≈ 2¹⁶ ≈ 1.3 MB. (D.1)
Here, the figure 24.3 days assumes that all four cores on the modern PC are utilized during the pre-computation phase, and the figure 1.3 MB assumes that 20 bytes are allocated to each table entry, which is generously large in view of the storage reduction techniques briefly mentioned in Section 6.5. A pre-computation period of 24.3 days would be painful for the intended user to endure, but let us assume that the implementer decides to accept these figures, mainly because convincing the user to accept an online DLP solving time that is greater than 2.1 min would be equally difficult.

If the contents of this work were not available, the implementer would recall that the complexities of the DP method satisfy T = Θ(t) and M = Θ(m) and guess that the DP method implemented with parameters

m = 2¹⁶ and t = 2²⁷ (D.2)
would achieve the complexities (D.1), at least approximately. However, according to (5), (6), and (7), the complexities corresponding to the parameters (D.2) are

P = 2⁴³ ≈ 24.3 days × 4 cores, T = 2.36603 × 2²⁷ ≈ 5.0 mins, and M = 0.732051 × 2¹⁶ ≈ 0.92 MB. (D.3)
Hence, the online time the user would experience is 2.37 times larger than what the implementer had anticipated, and the user would not be happy. Furthermore, any subsequent trial-and-error approach to rectifying this situation would surely be frustrating, as each trial requires a month-order pre-computation phase.

If the implementer had heard of the perfect fuzzy Hellman method and its advantage over the DP method, but did not have access to our accurate complexity formulas, the parameters

m = 2¹⁶, t = 2²⁷/30, and s = 30 (D.4)

may have been tried. According to Theorem 9, (16), and (17), these parameters lead to the complexities

P = 0.868438 × 2⁴³ ≈ 21.1 days × 4 cores, T = 1.86989 × 2²⁷ ≈ 4.0 mins, and M = 0.633444 × 2¹⁶ ≈ 0.79 MB, (D.5)
which are better than (D.3). However, it still remains that the implementer's control over the online time is very limited.

Let us next see what can be done with the accurate complexity and tradeoff formulas provided in this paper. Since the pre-computation table will be small enough to be loaded into the main memory of a modern PC, table lookups should not affect performance, and the use of the perfect fuzzy Hellman method with s = 30 (or larger) is reasonable. Also, since the above rough figures have shown that the user will be much more lacking in pre-computation time and online time than in storage space, it is advisable to work with parameters that lean toward a smaller PT/q value rather than an optimal T²M/q value. After trying out a few mt²/q values on (22) and (23) while referencing Fig. 1, the implementer can decide to use mt²/q = 0.0004 and s = 30, which corresponds to the tradeoff curves

PT = 1.20334 × 2⁷⁰ and T²M = 3.77029 × 2⁷⁰. (D.6)
The P-T relation makes it clear that it is not possible to obtain P = 2⁴³ and T = 2²⁷ simultaneously. By taking any one of P, T, and M as an independent variable and computing the other two through the above two relations, the implementer can tweak the parameters to make an appropriate compromise of the three complexities. Let us assume that the implementer decides to be satisfied with

P = 1.02685 × 2⁴³ ≈ 24.9 days × 4 cores, T = 1.17188 × 2²⁷ ≈ 2.5 mins, and M = 2.74544 × 2¹⁶ ≈ 3.4 MB. (D.7)
Then, referring to Theorem 9 and (16), the implementer can write

T = 105.903 × t and M = 0.840420 × m, (D.8)

and thus obtain the implementation parameters

t = 1.48519 × 10⁶, m = 2.14090 × 10⁵, and s = 30, (D.9)
that correspond accurately to the complexities (D.7). Please cite this article as: J. Hong and H. Lee, Solving discrete logarithm problems faster with the aid of pre-computation, Discrete Applied Mathematics (2019), https://doi.org/10.1016/j.dam.2019.03.023.
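The final conversion from (D.7) to the implementation parameters is likewise a pair of divisions; continuing the sketch above (again with illustrative names):

```python
# Minimal sketch: invert the linear relations (D.8) to recover the
# implementation parameters (D.9) from the target complexities (D.7).
T_COEF = 105.903   # T = 105.903 * t, from (D.8)
M_COEF = 0.840420  # M = 0.840420 * m, from (D.8)

T = 1.17188 * 2.0 ** 27
M = 2.74544 * 2.0 ** 16

t = T / T_COEF  # ~1.48519e6, matching (D.9)
m = M / M_COEF  # ~2.14090e5, matching (D.9)
print(t, m)
```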
Results of our theoretical analyses did not allow the implementer to achieve the original target complexities (D.1), at least not exactly, as they were outside the capabilities of the DLP solving methods. However, our theory can show that certain complexities are impossible to achieve simultaneously and help the implementer make an informed decision on the most acceptable compromise.
The discussion given so far has assumed that the pre-computation phase is to be carried out by the user on a modern PC. A very different set of parameters could be more appropriate if the user is willing to buy external CPU time for the pre-computation phase after weighing the possibility of leaking the trapdoor information through this process. For example, the approximate complexities

P ≈ 2^48 ≈ 3.1 days × 10^3 cores, T ≈ 2^22 = 4.0 s, and M ≈ 2^26 ≈ 1.3 GB (D.10)
seem to be a reasonable suggestion. Here, we are assuming that the cores to be rented for the pre-computation phase can each compute 2^20 iterations of the DLP solving random walks per second and that each table entry requires 20 bytes. With the aim of making the most efficient use of the online resources, the implementer fixes s = 30 and locates the mt^2/q value that minimizes the right-hand side of (22). The correct value is mt^2/q = 1.69286 × 10^-3, and the tradeoff curves (22) and (23) become

PT = 2.02869 × 2^70 and T^2M = 2.08260 × 2^70. (D.11)
The implementer can make the compromise

P = 1.62295 × 2^48 ≈ 5.0 days × 10^3 cores, T = 1.25000 × 2^22 = 5.0 s, and M = 1.33286 × 2^26 ≈ 1.7 GB. (D.12)
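As a quick consistency check (ours), the compromise (D.12) does lie on both curves of (D.11): the coefficients satisfy

1.62295 × 1.25000 = 2.02869 and (1.25000)^2 × 1.33286 = 2.08260,

so that PT = 2.02869 × 2^70 and T^2M = 2.08260 × 2^70, as required.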
The parameters m and t that are required to achieve these complexities can be computed as before.
Assuming that the user is comfortable with the monetary cost of carrying out the above pre-computation, one can attempt to reduce the online time further. As the online time is made smaller, at some point, the size of the pre-computation table will become too large to fit in the main memory of the intended online system. Coupled with the very short online time, even the small number of disk accesses will eventually become a bottleneck for the online phase. If one wishes to minimize the online wall-clock time, one should push for this limit, and then also consider using the non-perfect fuzzy Hellman method, which works well with a smaller s.

Appendix E. Fuzzy Hellman method at t = 1

Setting t = 1 with fuzzy Hellman is equivalent to defining every point to be a DP, and this setting implies a large s. Thus, despite the completely impractical table lookup frequency, some might vaguely relate the t = 1 case with a theoretically optimal version of fuzzy Hellman, even though the main body of this paper explicitly excluded the treatment of this case through the small-s requirement. However, through an initial rudimentary theoretical treatment of the t = 1 case, we could confirm that the arguments involved with it would not be the extreme limiting-case adaptations of those for the small-s case, but rather a detached parallel discussion. In other words, one cannot interpret the t = 1 case as some large-s limiting case, at least not in any strictly logical sense. Furthermore, our previous statement that a larger s gives better performance was only an observation made of the few explicit upper level tradeoff curves and not a logical consequence of our theoretically obtained small-s formulas. Nevertheless, to explicitly verify that we are not discarding a good idea, in this section, we give a brief treatment of the t = 1 case performance through an experimental approach. A full theoretical treatment would require efforts and pages comparable to those that were required by most of Sections 4 and 5.
With the confidence gained through the experiments of Section 5, all experiments in this section were carried out with the SHA-1 hash function as our iteration function. Simple bit masking was used to restrict the output space to a manageable size of q = 2^32. Note that, unlike the r-adding walks, properties of the random function or SHA-1 should not be sensitive to whether or not q is a prime number. Since no DLPs can be solved with the SHA-1 iteration function, the online time complexity was taken to be the length of the online chain at the point it collided with an entry of the pre-computation table. Any online chain that did not produce a collision within 20s iterations was classified as failing to solve a DLP. Every test complexity figure we give below is an average obtained over 200 pre-computation tables and 1000 online chain creations per table. The online chain iterations spent on the failing DLP instances have been included in the average online time complexities.
Non-perfect case. The online phase of the non-perfect fuzzy Hellman method at t = 1 requires uncomfortably frequent table lookups, but it may still be practical to use when the pre-computation table is small enough to fit within the main memory of the online system. Before gathering the data required to draw the analogue of Fig. 1 for the non-perfect fuzzy Hellman method at t = 1, we first experimented with parameter sets that would give equal ms^2/q values, and the results are summarized in Table E.4. We clarify that the rightmost two columns of the table were computed from the data of the middle columns (via the algebraic identity noted after Table E.4 below), and are not averages obtained over the 200 pre-computation tables and 1000 online chains per table in their own right. The test results show that the tradeoff coefficients PT/q and T^2M/q may almost be seen as functions of ms^2/q, rather than separately of m and s. We can also observe from the M/m column of Table E.4 that merges between the pre-computation chains within the same column of the pre-computation matrix are rare, so that the matrix for the non-perfect fuzzy Hellman method at t = 1 is very close to a classical Hellman matrix, which was partially treated in [15].
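To make the experimental procedure concrete, the following is a minimal sketch (ours, in Python) of the non-perfect t = 1 experiment as we read it from the description above: every point is a DP, so the pre-computation matrix is essentially a classical Hellman matrix with m chains of length s, and the online chain is looked up against the table after every iteration. The SHA-1 iteration with bit masking and the failure cap of 20s iterations follow the text; the exact table layout (one end point stored per chain) and all names are illustrative assumptions, not the code used for the experiments.

```python
# Minimal sketch of the Appendix E experiment (non-perfect fuzzy Hellman,
# t = 1), assuming a classical-Hellman-style table with one stored end
# point per length-s chain; names and table layout are illustrative.
import hashlib
import random

Q = 2 ** 32  # size of the bit-masked output space

def f(x: int) -> int:
    # One random-walk step: SHA-1 of the point, masked down to 32 bits.
    digest = hashlib.sha1(x.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:4], "big") % Q

def build_table(m: int, s: int) -> dict[int, int]:
    # m chains of length s; store each chain's end point (non-perfect:
    # chains that merge or repeat end points are not filtered out).
    table = {}
    for _ in range(m):
        start = random.randrange(Q)
        x = start
        for _ in range(s):
            x = f(x)
        table[x] = start
    return table

def online_chain_length(table: dict[int, int], s: int) -> int | None:
    # Walk from a random point, looking the current point up after every
    # iteration (t = 1); give up after 20s iterations, as in the text.
    x = random.randrange(Q)
    for i in range(1, 20 * s + 1):
        x = f(x)
        if x in table:
            return i
    return None  # classified as a failure to solve the DLP
```

Averaging online_chain_length over many tables and online chains, for instance with m = 1067 and s = 1795 as in the first row of Table E.4, should roughly reproduce the T/s ≈ 1.9 figure, provided this reading of the setup matches the one used for the experiments.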
Table E.4
Experimental complexities for the non-perfect fuzzy Hellman method at t = 1 and q = 2^32 (P: pre-computation time; T: online time; M: storage size).

m      s      ms^2/q   P/(ms)   T/s     M/m      Failures   PT/q    T^2M/q
1067   1795   0.8004   0.9999   1.905   0.9998   0.116%     1.525   2.904
1344   1599   0.8001   0.9999   1.897   0.9997   0.095%     1.518   2.880
1695   1424   0.8003   0.9999   1.887   0.9997   0.069%     1.510   2.849
2134   1269   0.8001   0.9998   1.888   0.9996   0.062%     1.510   2.851

1448   2436   2.0006   0.9998   1.115   0.9996   0.074%     2.230   2.486
1826   2169   2.0001   0.9998   1.104   0.9996   0.049%     2.207   2.435
2299   1933   2.0001   0.9998   1.102   0.9995   0.032%     2.203   2.427
2897   1722   2.0001   0.9997   1.100   0.9994   0.024%     2.199   2.419
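We note that the rightmost two columns are consistent with the exact algebraic identities

PT/q = (P/(ms)) × (T/s) × (ms^2/q) and T^2M/q = (T/s)^2 × (M/m) × (ms^2/q),

which follow by direct cancellation. For example, the first row of Table E.4 gives 0.9999 × 1.905 × 0.8004 ≈ 1.525 and (1.905)^2 × 0.9998 × 0.8004 ≈ 2.904, in agreement with the listed PT/q and T^2M/q values.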
Table E.5
Experimental complexities for the non-perfect fuzzy Hellman method at t = 1 and q = 2^32 (P: pre-computation time; T: online time; M: storage size).

m      s      ms^2/q   P/(ms)   T/s     M/m      Failures   PT/q    T^2M/q
1197   1198   0.4000   0.9999   3.170   0.9999   0.266%     1.268   4.018
1371   1371   0.6000   0.9999   2.327   0.9998   0.107%     1.396   3.248
1625   1626   1.0003   0.9998   1.639   0.9997   0.068%     1.639   2.686
1818   1819   1.4006   0.9998   1.337   0.9996   0.040%     1.872   2.501
1978   1977   1.8000   0.9998   1.168   0.9996   0.039%     2.103   2.457
2115   2114   2.2007   0.9997   1.055   0.9994   0.037%     2.321   2.448
2236   2235   2.6006   0.9997   0.974   0.9994   0.032%     2.532   2.466
Fig. E.3. Fuzzy Hellman method at t = 1 versus at a reasonable value of s. x-axis: PT/q; y-axis: T^2M/q; LHS box: non-perfect fuzzy Hellman at t = 1 (dots) and s = 5 (line); RHS box: perfect fuzzy Hellman at t = 1 (dots) and s = 30 (line).
Based on our first set of tests, we opted to restrict ourselves to parameters such that m ≈ s during our second set of tests, which gathered the data corresponding to more values of ms^2/q, and the results are given in Table E.5. The data contained in the rightmost two columns of the table have been plotted as dots in the left-hand side box of Fig. E.3. The box also displays the lowest curve from Fig. 1, corresponding to s = 5, for easy reference and comparison.
Perfect case. The pre-computation phase of the perfect fuzzy Hellman method at t = 1 requires temporary storage of size similar to the pre-computation time complexity, and the computational effort of removing duplicates during the pre-computation is too large to be ignored. Thus, the perfect fuzzy Hellman method at t = 1 is completely impractical to use, regardless of whether the frequent table lookups during the online phase are acceptable.
That the tradeoff coefficients PT/q and T^2M/q for the perfect fuzzy Hellman method at t = 1 may almost be seen as functions of ms^2/q can be checked from Table E.6. The upper level tradeoff performance of the algorithm is summarized in Table E.7, and the right-hand side box of Fig. E.3 displays this information as a graph. For easy reference, we have also placed the upper level tradeoff curve for the s = 30 case of the perfect fuzzy Hellman method in the same box.
Table E.6
Experimental complexities for the perfect fuzzy Hellman method at t = 1 and q = 2^32 (P: pre-computation time; T: online time; M: storage size).

m      s      ms^2/q   P/(ms)   T/s     M/m      Failures   PT/q    T^2M/q
1046   1923   0.9006   0.8726   1.978   0.6569   0.201%     1.555   2.315
1569   1570   0.9005   0.8724   1.965   0.6559   0.136%     1.544   2.281
2353   1282   0.9004   0.8720   1.945   0.6548   0.066%     1.527   2.230

1294   2376   1.7009   0.7877   1.618   0.4715   0.185%     2.168   2.100
1941   1940   1.7009   0.7882   1.613   0.4711   0.143%     2.163   2.085
2912   1584   1.7011   0.7874   1.602   0.4697   0.102%     2.146   2.051
Table E.7
Experimental complexities for the perfect fuzzy Hellman method at t = 1 and q = 2^32 (P: pre-computation time; T: online time; M: storage size).

m      s      ms^2/q   P/(ms)   T/s     M/m      Failures   PT/q    T^2M/q
1198   1198   0.4003   0.9378   3.242   0.8228   0.251%     1.217   3.462
1291   1290   0.5002   0.9242   2.761   0.7853   0.137%     1.276   2.994
1444   1443   0.7001   0.8970   2.241   0.7160   0.114%     1.407   2.517
1625   1626   1.0003   0.8605   1.873   0.6278   0.117%     1.612   2.203
1818   1819   1.4006   0.8175   1.666   0.5325   0.137%     1.907   2.070
1978   1977   1.8000   0.7783   1.601   0.4531   0.147%     2.243   2.091
2115   2114   2.2007   0.7446   1.603   0.3893   0.186%     2.627   2.202
References

[1] E.P. Barkan, Cryptanalysis of ciphers and protocols (Ph.D. thesis), Technion—Israel Institute of Technology, 2006.
[2] E. Barkan, E. Biham, A. Shamir, Rigorous bounds on cryptanalytic time/memory tradeoffs, in: Advances in Cryptology—CRYPTO 2006, LNCS, vol. 4117, Springer-Verlag, Berlin, Heidelberg, 2006, pp. 1–21.
[3] D.J. Bernstein, T. Lange, Computing small discrete logarithms faster, in: Progress in Cryptology—INDOCRYPT 2012, LNCS, vol. 7668, Springer-Verlag, Berlin, Heidelberg, 2012, pp. 317–338.
[4] D.J. Bernstein, T. Lange, Non-uniform cracks in the concrete: the power of free precomputation, in: Advances in Cryptology—ASIACRYPT 2013, LNCS, vol. 8270, Springer-Verlag, 2013, pp. 321–340.
[5] D.J. Bernstein, T. Lange, Two grumpy giants and a baby, in: ANTS X, Proceedings of the Tenth Algorithmic Number Theory Symposium, The Open Book Series, vol. 1, Mathematical Sciences Publishers, 2013, pp. 87–111.
[6] R. Brent, An improved Monte Carlo factorization algorithm, BIT Numer. Math. 20 (2) (1980) 176–184.
[7] A.W. Dent, S.D. Galbraith, Hidden pairings and trapdoor DDH groups, in: Algorithmic Number Theory, 7th International Symposium, ANTS-VII, LNCS, vol. 4076, Springer-Verlag, Berlin, Heidelberg, 2006, pp. 436–451.
[8] A.E. Escott, J.C. Sager, A.P.L. Selkirk, D. Tsapakidis, Attacking elliptic curve cryptosystems using the parallel Pollard rho method, CryptoBytes 4 (1999) 15–19.
[9] P. Flajolet, A.M. Odlyzko, Random mapping statistics, in: Advances in Cryptology—EUROCRYPT ’89, LNCS, vol. 434, Springer-Verlag, 1990, pp. 329–354.
[10] S. Galbraith, F. Hess, N.P. Smart, Extending the GHS Weil descent attack, in: Advances in Cryptology—EUROCRYPT 2002, LNCS, vol. 2332, Springer-Verlag, Berlin, Heidelberg, 2002, pp. 29–44.
[11] D.M. Gordon, Designing and detecting trapdoors for discrete log cryptosystems, in: Advances in Cryptology—CRYPTO ’92, LNCS, vol. 740, Springer-Verlag, Berlin, Heidelberg, 1993, pp. 66–75.
[12] M.E. Hellman, A cryptanalytic time-memory trade-off, IEEE Trans. Inform. Theory 26 (1980) 401–406.
[13] R. Henry, K. Henry, I. Goldberg, Making a nymbler Nymble using VERBS, in: Privacy Enhancing Technologies, 10th International Symposium, PETS 2010, LNCS, vol. 6205, Springer-Verlag, Berlin, Heidelberg, 2010, pp. 111–129.
[14] Y. Hitchcock, P. Montague, G. Carter, E. Dawson, The efficiency of solving multiple discrete logarithm problems and the implications for the security of fixed elliptic curves, Int. J. Inf. Secur. 3 (2004) 86–98.
[15] J. Hong, H. Lee, Analysis of possible pre-computation aided DLP solving algorithms, J. Korean Math. Soc. 52 (4) (2015) 797–819.
[16] J. Hong, S. Moon, A comparison of cryptanalytic tradeoff algorithms, J. Cryptol. 26 (4) (2013) 559–637.
[17] D. Hühnlein, M.J. Jacobson Jr., D. Weber, Towards practical non-interactive public-key cryptosystems using non-maximal imaginary quadratic orders, Des. Codes Cryptogr. 39 (2003) 281–299.
[18] B.-I. Kim, J. Hong, Analysis of the non-perfect table fuzzy rainbow tradeoff, in: ACISP 2013, LNCS, vol. 7959, Springer-Verlag, Berlin, Heidelberg, 2013, pp. 347–362.
[19] J.H. Kim, R. Montenegro, Y. Peres, P. Tetali, A birthday paradox for Markov chains, with an optimal bound for collision in the Pollard rho algorithm for discrete logarithm, in: Algorithmic Number Theory, 8th International Symposium, ANTS-VIII, LNCS, vol. 5011, Springer-Verlag, 2008, pp. 402–415.
[20] D. Knuth, The Art of Computer Programming, vol. II: Seminumerical Algorithms, Addison-Wesley, 1969.
[21] F. Kuhn, R. Struik, Random walks revisited: extensions of Pollard’s rho algorithm for computing multiple discrete logarithms, in: Selected Areas in Cryptography, 8th Annual International Workshop, SAC 2001, LNCS, vol. 2259, Springer-Verlag, Berlin, Heidelberg, 2001, pp. 212–229.
[22] H.T. Lee, J.H. Cheon, J. Hong, Accelerating ID-based encryption based on trapdoor DL using pre-computation, IACR Cryptology ePrint Archive: Report 2011/187, version 20120112:021951, http://eprint.iacr.org/2011/187.
[23] G.W. Lee, J. Hong, Comparison of perfect table cryptanalytic tradeoff algorithms, Des. Codes Cryptogr. 80 (3) (2016) 473–523.
[24] U.M. Maurer, Y. Yacobi, Non-interactive public-key cryptography, in: Advances in Cryptology—EUROCRYPT ’91, LNCS, vol. 547, Springer-Verlag, Berlin, Heidelberg, 1991, pp. 498–507.
[25] U.M. Maurer, Y. Yacobi, A non-interactive public-key distribution system, Des. Codes Cryptogr. 9 (1996) 305–316.
[26] A.J. Menezes, P.C. van Oorschot, S.A. Vanstone, Handbook of Applied Cryptography, CRC Press, 1997.
[27] Y. Murakami, M. Kasahara, A discrete logarithm problem over composite modulus, Electron. Commun. Japan (Part III) 76 (1993) 37–46.
[28] G. Nivasch, Cycle detection using a stack, Inform. Process. Lett. 90 (2004) 135–140.
[29] P. Oechslin, Making a faster cryptanalytic time-memory trade-off, in: Advances in Cryptology—CRYPTO 2003, LNCS, vol. 2729, Springer-Verlag, 2003, pp. 617–630.
[30] P.C. van Oorschot, M.J. Wiener, Parallel collision search with cryptanalytic applications, J. Cryptol. 12 (1999) 1–28.
[31] P. Paillier, Public-key cryptosystems based on composite degree residuosity classes, in: Advances in Cryptology—EUROCRYPT ’99, LNCS, vol. 1592, Springer-Verlag, Berlin, Heidelberg, 1999, pp. 223–238.
[32] K.G. Paterson, S. Srinivasan, On the relations between non-interactive key distribution, identity-based encryption and trapdoor discrete log groups, Des. Codes Cryptogr. 52 (2009) 219–241.
[33] J.M. Pollard, Monte Carlo methods for index computation (mod p), Math. Comp. 32 (1978) 918–924.
[34] J.-J. Quisquater, J.-P. Delescaille, Other cycling tests for DES, in: Advances in Cryptology—CRYPTO ’87, LNCS, vol. 293, Springer-Verlag, 1988, pp. 255–256.
[35] J.-J. Quisquater, J.-P. Delescaille, How easy is collision search? Application to DES, in: Advances in Cryptology—EUROCRYPT ’89, LNCS, vol. 434, Springer-Verlag, 1989, pp. 429–434.
[36] J. Sattler, C.P. Schnorr, Generating random walks in groups, Ann. Univ. Sci. Budapest. Sect. Comput. 6 (1985) 65–79.
[37] C.P. Schnorr, H.W. Lenstra Jr., A Monte Carlo factoring algorithm with linear storage, Math. Comp. 43 (1984) 289–311.
[38] R. Sedgewick, T. Szymanski, A. Yao, The complexity of finding cycles in periodic functions, SIAM J. Comput. 11 (2) (1982) 376–390.
[39] B. Silverman, J. Stapleton, Contribution to ANSI X9F1 working group, 1997, unpublished.
[40] E. Teske, A space efficient algorithm for group structure computation, Math. Comp. 67 (1998) 1637–1663.
[41] E. Teske, Speeding up Pollard’s rho method for computing discrete logarithms, in: Algorithmic Number Theory, Third International Symposium, ANTS-III, LNCS, vol. 1423, Springer-Verlag, Berlin, Heidelberg, 1998, pp. 541–554.
[42] E. Teske, On random walks for Pollard’s rho method, Math. Comp. 70 (2001) 809–825.
[43] E. Teske, An elliptic curve trapdoor system, J. Cryptol. 19 (2006) 115–133.
[44] S. Vanstone, R.J. Zuccherato, Elliptic curve cryptosystem using curves of smooth order over the ring Zn, IEEE Trans. Inform. Theory 43 (4) (1997) 1231–1237.