A new design paradigm for provably secure keyless hash function with subsets and two variables polynomial function


Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx


Journal of King Saud University – Computer and Information Sciences journal homepage: www.sciencedirect.com

P. Karthik 1, P. Shanthi Bala 2

Department of Computer Science, School of Engineering and Technology, Pondicherry University, Puducherry, India

Article info

Article history: Received 26 March 2019; Revised 9 September 2019; Accepted 8 October 2019; Available online xxxx

Keywords: Provably secure hash function; Polynomial hash function; Subsets in hash design; One-way collision resistant hash function; Secured keyless hash function; Subsets and polynomial function for cryptographic hash function; One-way hash function using higher-order polynomials

Abstract

Provably secure keyless hash functions use Random Oracle (RO) or sponge principles for the design and construction of security-centric hash algorithms. These principles have produced algorithms such as MD2, MD5, SHA-160, SHA-224/256, SHA-256, SHA-224/512, SHA-256/512, SHA-384/512, SHA-512, and SHA-3. These functions use bitwise AND, OR, XOR, and MOD operators to introduce randomness in their hash outputs. However, the partial breaking of the SHA-2 and SHA-3 families and the breaking of the MD5 and SHA-160 algorithms raise concerns about the use of bitwise operators at the block level. The proposed design tries to address this structural flaw through a polynomial function. A polynomial function of degree 128 demands arduous effort to invert. The application of a polynomial to the blocks produces an unpredictable random response. The new design exploits the merits of the polynomial function on subsets to achieve a significant avalanche response. The output from experiments with more than 24 million hash searches shows that the proposed system is a provably secure hash function. The experiments on avalanche response and confusion and diffusion analysis show that it is an apt choice for security-centric cryptographic applications.

© 2019 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

A hash or message digest function takes an arbitrary-length input string of the form {0, 1}* and produces a fixed-length output {0, 1}^n (Preneel et al., 1993; Buchmann, 2004). The size n of the hash output is considerably smaller than the size m of the input string. Therefore, a hash function performs compression on the input string to produce a fixed-size output. This property makes the hash function vulnerable to hash collisions by the pigeonhole principle (Trybulec, 1990). Fig. 1 illustrates the general working principle of the hash function. A provably secure hash function resists hash collisions, and this can be achieved through strict observation of the guidelines on cryptographic hash functions (Thomsen and Knudsen, 2005; Bartkewitz, 2009; Al-Kuwari et al., 2010). The guidelines from the perspective of security are summarized as follows:

• Collision Resistance - For any two messages x and y such that x ≠ y, H(x) ≠ H(y).
• Pre-image Resistance - For a given hash value H(x), it is impracticable to find a different message y such that H(x) = H(y).
• Second Pre-image Resistance - For any two randomly chosen messages x and y such that x ≠ y, H(x) ≠ H(y).
• Non-correlation - The hash function exhibits erratic behavior in its output: every single-bit change in the input affects a substantial number of output bits.
• Compression - If M and N are the sizes of the input and output of the hash function respectively, then M > N.
• Ease of Computation - For a given message x, it is easy to compute H(x) and very hard to retrieve x from H(x).
• Near-collision Resistance - For any two different messages x and y, the hash values H(x) and H(y) should not differ in only a small number of bits.

1 Author is a research scholar at Pondicherry University, Puducherry.
2 Author is presently working as an Assistant Professor at Pondicherry University, Department of CSE, School of Engineering and Technology, Puducherry.

Peer review under responsibility of King Saud University.

https://doi.org/10.1016/j.jksuci.2019.10.003
1319-1578/© 2019 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Please cite this article as: P. Karthik and P. Shanthi Bala, A new design paradigm for provably secure keyless hash function with subsets and two variables polynomial function, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2019.10.003


P. Karthik, P. Shanthi Bala / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx

Fig. 1. Working Principle of Hash Function.

The proposed design, hereafter called the Provably Secure Subset Polynomial Function (PSSPF), realizes the aforesaid properties using subsets and a polynomial function. It processes the input in blocks of 1024 bits and produces a fixed-size output of 256 bits for each subset. Each subset is assigned an initialization vector (IV) comprising 8 elements of 128 bits each. The elements of the IV are constructed from the square roots of the prime numbers between 5000 and 14000; this range is used to obtain the required number of bits for the IV elements. The linear change of powers in the polynomial function naturally offers resistance against position-dependent data modification. Finally, the proposed system produces a hash output of 512 bits by manipulating the hash outputs of the subsets. The combination of index-based subsets and a higher-order polynomial function therefore virtually prevents an adversary from launching block-level attacks. The experimental analysis shows that the algorithm performs extremely well on the avalanche property, consistently changing more than 90% of the nibbles in the hash output. The statistical analysis of confusion and diffusion shows that the algorithm meets the strict avalanche criterion (Webster and Tavares, 1985). Performing differential analysis on PSSPF would therefore be difficult.

The paper is organized as follows: Section 2 deals with the related work and Section 3 specifies the design principles of PSSPF. Section 4 presents the experimental analysis of the key attributes of the provably secure hash function. Section 5 deals with the discussion and Section 6 presents the concluding remarks and future enhancements.

2. Related work

A provably secure keyless hash function was intended to provide a solution for integrity violations on remote data. Therefore, it was termed a 'Modification Detection Code' (MDC) (Alfred et al., 1996). Hash codes were generated in the form of hexadecimal numbers and were regarded as a compact form of larger binary strings. These codes would change drastically for a minute change in the input. This property is known as the Avalanche Effect (Kam and Davida, 1979; Feistel, 1973). Hash functions that did not satisfy this property were not considered for cryptographic applications. The modern cryptographic keyless hash functions SHA-224/256, SHA-256, SHA-224/512, SHA-256/512, SHA-384/512, SHA-512, and SHA-3 were designed to perform compression. Compression exposes a hash function to hash collisions. However, the problem was mitigated by introducing erratic behavior in the hash output. Webster and Tavares (1985) proposed the Strict Avalanche Criterion as a desirable property of a cryptographic hash function. Under this criterion, a change in a single input bit changes each output bit with a probability of one half, so that about 50% of the output bits change on average.
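The avalanche behaviour described above can be measured directly. The following is a minimal sketch and not part of the original work: it uses SHA-256 from Python's `hashlib` as a stand-in digest (PSSPF itself is not reproduced here), and `avalanche_percent` is a hypothetical helper name.

```python
import hashlib


def avalanche_percent(message: bytes, bit_index: int, hash_name: str = "sha256") -> float:
    """Percentage of digest bits that change when one input bit is flipped."""
    flipped = bytearray(message)
    flipped[bit_index // 8] ^= 1 << (bit_index % 8)  # flip exactly one input bit
    h1 = hashlib.new(hash_name, message).digest()
    h2 = hashlib.new(hash_name, bytes(flipped)).digest()
    # XOR the two digests and count the differing bit positions.
    diff = int.from_bytes(h1, "big") ^ int.from_bytes(h2, "big")
    return 100.0 * bin(diff).count("1") / (8 * len(h1))


if __name__ == "__main__":
    msg = b"sixteen byte msg"
    rates = [avalanche_percent(msg, i) for i in range(8 * len(msg))]
    print(sum(rates) / len(rates))  # mean share of changed output bits
```

Averaging over every input bit position, as in the usage lines, gives a figure comparable in spirit to the avalanche percentages reported for digest functions.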

Before 1989, there were no standard principles for the design of provably secure hash functions. Merkle (1989) and Damgard (1989) were the first to prove the collision resistance property mathematically for such hash functions. Their construction is known as the Merkle-Damgard (MD) construction. It remained the standard for hash functions until 2012. In 2012, the American National Institute of Standards and Technology (NIST) selected a new standard, SHA-3, through an open competition (NIST, 2012a). SHA-3 remains the present standard for provably secure hash functions. The conventional algorithms employed the RO model for generating the hash output. This model was a block-iterated round function operated by the Davies-Meyer (DM) principle (Davies and Price, 1984; Matyas et al., 1985). The round function was executed on a block to produce an intermediate hash output, which was then used to modify the next block to be processed. The process continued until all the blocks in the input message were processed. Any message whose size was not a multiple of the block size was padded with 0s, so that the message size became a multiple of the block size. The first bit of the padding was set to 1 to distinguish the message from the padding bits. The length of the message was finally appended at the tail end. Lai and Massey (1992) coined the term MD Strengthening for this message padding. They argued that MD Strengthening would prevent the long message attack, the trivial free-start attack, and the trivial semi-free-start collision attack.

Rivest (1992a) devised the MD4 algorithm implementing MD principles. The algorithm used 3 auxiliary functions and an IV with four elements. It operated in three rounds of 16 operations each to construct a 128-bit hash output. In the same year, Rivest (1992b) devised another algorithm, MD5, with 4 auxiliary functions. This was also a 128-bit algorithm, and it was widely used in cryptographic applications until 2004. In 2001, Eastlake and Jones (2001) described the 160-bit SHA-1 algorithm, and it was believed that SHA-1 would provide stronger security than its predecessor MD5. However, the research community acknowledged structural weaknesses in MD4, MD5, and SHA-1 after hash collisions were found for these algorithms (Wang et al., 2004; Xie and Feng, 2010; Wang et al., 2005). Therefore, MD4, MD5, and SHA-1 were no longer considered suitable for cryptographic applications. The hash collision on SHA-1 prompted NIST to look for a new standard through an open competition. Keccak was selected as the winner of the competition and, consequently, SHA-3 was announced as the current standard for cryptographic hash functions (NIST, 2012b). Bertoni et al. (2013) replaced RO with the sponge construction for the design of the SHA-3 hash function. The sponge principles enable SHA-3 to produce variable-length outputs, which can be used for various cryptographic applications. Dworkin (2015) explained the permutation-based construction principles of a family of six SHA-3 functions.

Applebaum et al. (2017) reduced the sequential and parallel time complexity of the hash function by minimizing complexity measures such as degree, locality, and circuit size. Bitansky et al. (2018) introduced the multi-collision resistance property for keyless hash functions and addressed the long-standing question of the round complexity of zero-knowledge protocols. Yang et al. (2019) proposed a feedback-based iterative structure for the design of the hash function. The authors employed chaotic hashing schemes to resist various hash attacks. Teh et al. (2019) introduced a chaos-based keyed hash function using fixed-point representation. The design was based on the chaos-based cryptosystem proposed by Baptista (1998). The authors reported a 30% performance improvement over its chaos-based peers. However, their design suffered from performance issues due to the excessive use of multiplication operations.


Rivest and Schuldt (2016) designed a stream cipher named Spritz. The design was intended to eliminate the design flaws of the RC4 algorithm and to provide an alternative to it from a security perspective. Spritz can be considered a sponge function: it absorbs the input message, and the required sequence of output bits is squeezed out later as needed. Brown and Vanstone (2019) proposed a random number generator using elliptic curves. Their design was based on public-key cryptography, and it employed the Weierstrass cubic equation to produce encryption keys. The design uses escrow keys to recover the encryption keys if they are lost. Burns et al. (2017) proposed a secure two-party protocol using elliptic curves. Gilpin (2018) designed a non-invertible compressive chaotic function using the characteristic signature of a stirred viscous fluid. The design exploited the chaotic hydrodynamic properties of the viscous fluid to encode the information. The work demonstrated collision resistance properties using the time-scale-dependent kinetics of the viscous fluid. Kanso and Ghebleh (2015) proposed a chaotic scheme based on the structure of the message to be encoded. Their design forms a 2D map from the occurrences of the bytes in the input string, which is later used to generate an arbitrary number.

3. Design principles

The PSSPF uses subsets and a polynomial function of degree 128. The polynomial function is chosen because it is a natural one-way function, and inverting the polynomial is typically impracticable. Finding the roots of a higher-order polynomial function remains a research problem, and there is no standard method available for solving higher-order polynomial equations (McNamee, 1993; Pan, 1997). The task is significantly hardened by the use of two polynomial variables. In PSSPF, the polynomial function is applied to the subsets to produce intermediate hash values of 256 bits. The final hash value of 512 bits is produced by performing bit manipulation on the intermediate hash outputs. The PSSPF function works on the following levels:

1. Subset generation.
2. Generation of the intermediate hash output for the individual subsets.
3. Deriving the final hash through the intermediate hash outputs.

3.1. Subset generation

The first level of the hash generation process is subset generation. The PSSPF uses subsets to enhance the security of the hash function, because an adversary is forced to perform differential analysis on multiple subsets. The differential analysis of the PSSPF output is significantly hardened by the near-random responses of the subsets. The proposed design uses 3 subsets Ă, Õ, and Ê, formed on the basis of the index positions of the input string S. The set Ă contains all the elements of S. The sets Õ and Ê are formed by collecting the elements at odd and even indices respectively from S. The number of elements in Õ and Ê is equal if the input string S has an even number of elements. If S has an odd number of elements, the set Ê receives one element more than the set Õ. Let S be the input string with n elements. The subsets Ă, Õ and Ê are formed by using Eqs. (1)–(3) as given below:

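$$\breve{A} = \{X \mid \breve{A} = S\} \tag{1}$$

$$\hat{E} = \{X \mid \hat{E} \subset S,\ |\hat{E}| = n/2,\ \hat{E} \cap \tilde{O} = \emptyset\} \tag{2}$$

$$\tilde{O} = \{X \mid \tilde{O} \subset S,\ |\tilde{O}| = n/2,\ \hat{E} \cap \tilde{O} = \emptyset\} \tag{3}$$

The indices of the subsets Ă, Õ and Ê are formed by employing the formulae provided in Eqs. (4) to (6) as follows:

$$\breve{A} = \{a_1, a_2, a_3, a_4, \ldots, a_n\} \text{ then } \forall a_i \in \breve{A},\ a_i = a_1 + (i - 1) \text{ where } a_1 = 0 \tag{4}$$

$$\hat{E} = \{a_1, a_2, a_3, a_4, \ldots, a_n\} \text{ then } \forall a_i \in \hat{E},\ a_i = a_1 + (i - 1) \times 2 \text{ where } a_1 = 0 \tag{5}$$

$$\tilde{O} = \{a_1, a_2, a_3, a_4, \ldots, a_n\} \text{ then } \forall a_i \in \tilde{O},\ a_i = a_1 + (i - 1) \times 2 \text{ where } a_1 = 1 \tag{6}$$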
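The index rules of Eqs. (4) to (6) amount to taking every element, every even-indexed element, and every odd-indexed element of S, with indices starting at 0. A minimal sketch, assuming the input is a byte string; the function name `make_subsets` is hypothetical and not taken from the paper:

```python
def make_subsets(s: bytes):
    """Split the input into the three index-based subsets.

    Returns (full, odd, even), corresponding to the sets Ă, Õ and Ê.
    """
    full = bytes(s)   # Ă: indices 0, 1, 2, ...
    even = s[0::2]    # Ê: indices 0, 2, 4, ...
    odd = s[1::2]     # Õ: indices 1, 3, 5, ...
    return full, odd, even


if __name__ == "__main__":
    full, odd, even = make_subsets(b"abcde")
    print(full, odd, even)  # b'abcde' b'bd' b'ace'
```

With an odd number of input elements, the even-index subset is one element longer than the odd-index subset, matching the description in Section 3.1.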

3.1.1. Message padding

Message padding is the first step of level 1 in the hash generation process. At this point, the individual subsets are processed as blocks of 1024 bits. If the size of a subset is not a multiple of the block size, message padding is applied. The PSSPF differs from conventional MD strengthening by not reserving the last 64/128 bits for storing the length of the input string. Instead, it calculates and allocates the required number of bytes at runtime. This dynamic allocation for the length attribute benefits the algorithm in two ways. First, it enables PSSPF to handle an input message of any length. The conventional MD2, MD5, SHA-160, SHA-224/256, SHA-256, SHA-224/512, SHA-256/512, SHA-384/512, and SHA-512 algorithms fail to handle an input message whose size exceeds 2^64/2^128 bits. Second, it avoids processing an additional block when the size of the last block exceeds 896 bits or 448 bits for block sizes of 1024 bits and 512 bits respectively. Fig. 2 illustrates the message padding of PSSPF. The pseudocode that applies the message padding to the last block of S is given as follows:

Algorithm 1: Function MD-Strengthening (X: ARRAY[1..n]) returns byte array
    // len is the number of bytes in the binary representation of n
    // Calculate the padding length
    padlen = 128 - ((n + len + 1) MOD 128);
    // Calculate the new array length to store the elements of X and the padding information
    newlen = n + 1 + len + padlen;
    // Define an array with the required length
    db: ARRAY [1..newlen];
    IF (newlen MOD 128 == 0) THEN
        // Copy the elements of X into db
        FOR i IN 1 TO n DO
            db[i] = X[i];
        END FOR;
        // Mark the end of the data with the value 0x80
        db[n + 1] = 0x80;
        FOR i IN n + 2 TO n + 1 + padlen DO
            db[i] = 0;
        END FOR;
        // Copy the length of the data (n, stored in len bytes) to the end of db
        db[n + 2 + padlen .. newlen] = n;
    END IF;
    RETURN db;
END;
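Algorithm 1 can be sketched in a few lines of Python. This is a hedged illustration and not the authors' implementation: it assumes that the length field holds the byte count n, encoded big-endian in the fewest whole bytes, and the names `md_strengthen` and `BLOCK_BYTES` are hypothetical.

```python
BLOCK_BYTES = 128  # one 1024-bit block


def md_strengthen(x: bytes) -> bytes:
    """Pad x to a multiple of 128 bytes: data, 0x80, zero bytes, length field."""
    n = len(x)
    # Assumption: the length field is n itself, big-endian, minimal whole bytes.
    length_field = n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")
    ln = len(length_field)
    # Same formula as Algorithm 1; it yields a full block of zeros (padlen = 128)
    # when n + ln + 1 is already a multiple of the block size.
    padlen = BLOCK_BYTES - (n + ln + 1) % BLOCK_BYTES
    padded = x + b"\x80" + bytes(padlen) + length_field
    assert len(padded) % BLOCK_BYTES == 0
    return padded


if __name__ == "__main__":
    out = md_strengthen(b"abc")
    print(len(out), out[3], out[-1])  # 128 128 3
```

Because the length field grows with the message, no fixed 64-bit or 128-bit slot is reserved, which is the property the section attributes to the PSSPF padding.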


Fig. 2. MD-Strengthening of the proposed design.

3.1.2. Modifying the input elements through IVs (MIE)

The MIE is the second step of level 1 in the hash generation process. In this step, the IVs are applied to the padded arrays of the subsets. Applying the IVs generates an opaque byte array for each subset. The PSSPF employs an array of 8 elements as the IV for every subset. Each element of the vector is 128 bits in size, so the concatenation of all the vector elements gives 1024 bits. The IV is modified for every block by circular rotations at both the array level and the element level: the array elements of the IV are circularly rotated one position to the right, and each individual element is circularly rotated 5 bits to the left. This process continues until all the blocks of the individual subset are processed. Fig. 3 presents the initial values of the IVs applied to the individual subsets. The opaque byte arrays produced by the subsets after the application of the IVs are given in Fig. 4 for analysis. The output shows that the message padding and the IVs introduce unpredictable, erratic behavior in the subsets. This demands more effort from an adversary who attempts differential analysis of PSSPF.
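The per-block IV update described above (rotate the 8-element array one position to the right, rotate each 128-bit element 5 bits to the left) can be sketched as follows. The IV values used in the usage lines are placeholders rather than the values of Fig. 3, the names `rotl128` and `next_iv` are hypothetical, and the way the IV is combined with a block is not specified in the text, so it is not modelled here.

```python
MASK128 = (1 << 128) - 1


def rotl128(value: int, r: int) -> int:
    """Circular left rotation of a 128-bit word by r bits (0 < r < 128)."""
    return ((value << r) | (value >> (128 - r))) & MASK128


def next_iv(iv: list) -> list:
    """Derive the IV for the next block from the current 8-element IV."""
    rotated = [iv[-1]] + iv[:-1]             # array level: one position right
    return [rotl128(e, 5) for e in rotated]  # element level: 5 bits left


if __name__ == "__main__":
    iv = [1, 2, 3, 4, 5, 6, 7, 8]  # placeholder values, not the real IV
    print(next_iv(iv))
```

The two rotations act on different levels (positions versus bits), so applying them in either order gives the same result.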

3.2. Generation of the hash output for the individual subsets

This is the second level of the hash generation process. At this level, the subsets are passed to a block-iterated polynomial function to produce an intermediate hash output of 256 bits. A polynomial function of the form given in Eq. (7) could be used for the design of a block-iterated round function. However, the powers of the polynomial decrease towards the tail of the function. Because of this property, the first one-third of the bytes in each block affect the Most Significant Bits (MSB) of the hash output, whereas any modification in the trailing bits of the block affects only the Least Significant Bits (LSB). Truncating the hash output would therefore make the hash function insensitive to changes in the trailing bits.

$$P(x) = P_n x^n + P_{n-1} x^{n-1} + P_{n-2} x^{n-2} + \ldots + P_0 \tag{7}$$

To address the above issue, Eq. (7) is modified as follows:

$$P(x, y) = P_n x^n + P_{n-1} x^{n-1} y + P_{n-2} x^{n-2} y^2 + \ldots + P_0 y^n \tag{8}$$

The application of Eq. (8) in the design of the RO function has two benefits. First, it helps the PSSPF to spread the avalanche effect uniformly across the entire length of the digest. Under this circumstance, performing differential analysis on the digest would be exceedingly hard. Second, it forces an adversary to find 128 coefficients to perform a block-level attack for every block to be processed. Therefore, the task of setting mathematical conditions to unveil the polynomial output of multiple subsets would be extremely hard. The values of x and y are chosen from the two-digit prime numbers so that x ≠ y. The difference between x and y should be as small as possible to achieve the desired output.

The modified Eq. (8) responds constructively to data modification at the start and end positions of the input string. However, a modification at the center of the input string would affect only half of the bits in the LSB portion of the polynomial output. This problem is rectified by using the wide-pipe principle suggested by Lucks (2004). The PSSPF produces an output of between 150 and 170 nibbles for an individual subset. The required output size of 256 bits is obtained by taking 64 nibbles on either side of the mid position of the polynomial output. The intermediate hash output of each block is the XOR of the two truncated halves. Fig. 5 shows the operating principle of the block-iterated polynomial round function in PSSPF. If the input string contains multiple blocks, the intermediate hash outputs of the blocks are XORed to produce a final hash value. The sample output of PSSPF is presented in Table 1 for analysis. The entries of Table 1 show that the proposed design rectifies the problem of a single-variable polynomial function. The output values show that the proposed design responds constructively to a single bit flip at various positions of the input string.

3.3. Deriving the final hash output through the subsets

This is the last level of the hash generation process. At this point, the hash outputs of the subsets Ă, Õ, and Ê are collected and their output bits are manipulated to produce a final hash output of 512 bits. Let β, U_z, and Ɖ be the hash outputs of the subsets Ă, Õ, and Ê respectively. The final hash F of the PSSPF is obtained by concatenating the two halves H_1 and H_2 as given below.

$$H_1 = \beta \oplus U_z \tag{9}$$

$$H_2 = \beta \oplus Ɖ \tag{10}$$

$$F = H_1 \,\|\, H_2 \tag{11}$$

4. Experimental analysis

The PSSPF is exhaustively tested for collision resistance, pre-image resistance, second pre-image resistance, non-correlation, and avalanche properties. More than 24 million hashes have been used to inspect the aforesaid properties. The comparative analysis of the PSSPF on the provably secure properties is performed against the standard algorithms SHA-224, SHA-256, SHA-384, SHA-512, SHA3-224, SHA3-256, SHA3-384, SHA3-512, and a chaotic hash function (CHF-512). The algorithm is also subjected to runtime analysis and confusion and diffusion analysis alongside the aforesaid algorithms. These experiments were conducted on an Intel(R) Core(TM) i3 6006U CPU @ 2.00 GHz processor with the 64-bit Windows 10 operating system. The algorithm was implemented using JDK 10.0.1.
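To make the object under test concrete, the two-variable polynomial of Eq. (8), the wide-pipe truncation of Section 3.2, and the combination of Eqs. (9) to (11) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the bytes of a block are taken as the coefficients, x = 29 and y = 31 are hypothetical choices of close two-digit primes, the operator in Eqs. (9) and (10) is assumed to be XOR, short polynomial outputs are left-padded with zeros, and the IV step is omitted. All function names are hypothetical.

```python
X, Y = 29, 31          # assumed pair of close two-digit primes
HALF_NIBBLES = 64      # 64 nibbles = 256 bits
BLOCK_BYTES = 128


def poly2(coeffs, x: int = X, y: int = Y) -> int:
    """Evaluate c0*x^n + c1*x^(n-1)*y + ... + cn*y^n with exact integers."""
    n = len(coeffs) - 1
    return sum(c * x ** (n - i) * y ** i for i, c in enumerate(coeffs))


def block_digest(block: bytes) -> int:
    """Wide-pipe step: XOR the 64 nibbles on either side of the midpoint."""
    hexstr = format(poly2(block), "x").zfill(2 * HALF_NIBBLES)
    mid = len(hexstr) // 2
    left = hexstr[mid - HALF_NIBBLES:mid]
    right = hexstr[mid:mid + HALF_NIBBLES]
    return int(left, 16) ^ int(right, 16)


def subset_digest(padded: bytes) -> int:
    """XOR the 256-bit digests of all 128-byte blocks of a padded subset."""
    digest = 0
    for off in range(0, len(padded), BLOCK_BYTES):
        digest ^= block_digest(padded[off:off + BLOCK_BYTES])
    return digest


def final_hash(beta: int, u: int, d: int) -> str:
    """F = (beta XOR u) || (beta XOR d), returned as 128 hex digits."""
    h1, h2 = beta ^ u, beta ^ d
    return format((h1 << 256) | h2, "0128x")
```

Python's arbitrary-precision integers make the exact evaluation straightforward; an implementation in Java, the language used for the experiments, would need a big-integer type for the same computation.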

4.1. Analysis of collision and Pre-image resistance

4.1.1. Modifying the individual bytes for all possible values

The PSSPF was exhaustively tested for collision resistance and pre-image resistance up to the limits of the machine configuration. Every individual byte in the input string was changed to all possible values. The resulting hash outputs were recorded and compared with a reference hash value of the original message. More than 4.5 million hashes were generated and tested for collision resistance and pre-image resistance. The response of the PSSPF on the collision and pre-image collision test is presented in Fig. 6. The result shows that the PSSPF does not produce collisions. For comparison, the standard cryptographic digest functions were also subjected to the collision and pre-image resistance test by modifying the individual bytes for all possible values. The result shows that the responses of the standard cryptographic digest functions and the PSSPF are similar on the collision and pre-image resistance test.

Fig. 3. The IVs used on Individual Subset in PSSPF.

Fig. 4. The sample opaque arrays generated by the subsets after the application of IV.

Fig. 5. Application of wide pipe principle for subset hash.

Fig. 6. Graphical Response of Collision and Pre-image resistance for all possible values of the individual bytes.

4.1.2. Test on collision and Pre-image resistance through all possible two bytes interchanges

The PSSPF was again tested for collision and pre-image collision by exchanging every individual byte of the input string with other bytes. The resulting hash outputs were compared with a reference value for a possible hash collision. More than 19.5 million hash values were generated and tested against the reference value. The response of the PSSPF on the collision and pre-image collision test is presented in Fig. 7. The result shows that the PSSPF provides formidable collision resistance and pre-image resistance properties. For comparison, the standard cryptographic digest functions were also subjected to the collision and pre-image resistance test by interchanging the individual input bytes with other bytes. The result shows that the standard cryptographic digest functions and the PSSPF exhibit similar behavior on the collision and pre-image resistance test.

Fig. 7. Graphical response of PSSPF on Collision and Pre-image resistance for all possible byte interchanges.

Table 1
Response of the Polynomial round function for a single bit flip at different locations.
Input String: Received: from SLUAVA.slu.edu (sluvca.slu.edu) by Sierra.Stanford.EDU with SMTP (5.67b8/25-eef) id AA11430; Tue, 30 Aug 1994
Length is: 128 Bytes

S.No | State on Data Modification | Output of the polynomial function (256 bits)
1 | No modification | 2e0b53dae341224324ed0812021f3fb0857383c68b28ec1edd9c25e4e58b71f4
2 | Single bit flip at 0th byte | 67caeaf8830d2d758531f62ad6b27bfd6708ec7124617b6388fc7c9fa6a8d329
3 | Single bit flip at 64th byte | fe3f7320e7bcb31d81824b8fa938a696eab1fe2a03140b84385d8f9b39640d1f
4 | Single bit flip at 127th byte | 97f5e859d54e09c7a82c9d98a26e26e77c7296c26099ba429636049ef99d5cdf

4.2. Analysis of second Pre-image resistance

The Second Pre-image Resistance (SPIR) test checks for a hash collision between any two randomly selected messages x and y such that x ≠ y. To conduct this experiment, 100,000 files were generated with file sizes varying between 1 and 100,000 bytes. Two different arbitrary numbers, A and B, were generated to select two files x and y from the 100,000 sample files. The hash values H(x) and H(y) were compared for a hash collision. The experiment was conducted with more than 50,000 hash comparisons using different input strings of varying sizes. The graphical response of PSSPF on this test is presented in Fig. 8. The result shows that the PSSPF exhibits a strong second pre-image resistance property.

Fig. 8. Graphical response of Second Pre-image resistance Test.

4.3. Analysis of Non-correlation property

The non-correlation property aims at erratic behavior in the hash output. Accordingly, a trivial change in the input string should affect a substantial number of output bits (Kam and Davida, 1979; Feistel, 1973). This property is known as the Avalanche Effect (AE). To investigate this property, the PSSPF was subjected to tests of confusion and diffusion, avalanche effect, and near-collision effect.

4.3.1. Statistical analysis of confusion and diffusion

The statistical analysis of confusion and diffusion was performed to analyze the erratic behavior of the hash output at the binary level. Webster and Tavares (1985) stated that a provably secure hash function should meet the strict avalanche criterion to be considered for cryptographic use. Accordingly, a single bit-flip at the input should affect about 50% of the output bits. To inspect this property, an input string of 128-bits was chosen and modified by a single bit-flip at arbitrarily selected locations. The responses were recorded for analysis. Table 2 shows the sample response of the PSSPF for a single bit-flip at distinct locations. Figs. 9 and 10 show the location history of the PSSPF on its avalanche response. Taking a step further, the PSSPF was compared with a chaotic digest function to analyze the confusion and diffusion properties. To perform this test, a keyless chaotic digest function (CHF-512) was designed using the one-dimensional polynomial-based logistic map function presented in Eq. (12).

$$X_{n+1} = b X_n (1 - X_n) \tag{12}$$
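The logistic map of Eq. (12) is simple to iterate, and a short sketch shows the sensitivity to initial conditions that chaos-based digest designs rely on. This is only an illustration of the map itself, not of CHF-512, whose construction is not detailed here; the parameter value b = 3.9 and the seeds are assumptions chosen for the example, and `logistic_orbit` is a hypothetical name.

```python
def logistic_orbit(x0: float, b: float, steps: int) -> list:
    """Return [X_0, X_1, ..., X_steps] for X_{n+1} = b * X_n * (1 - X_n)."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(b * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


if __name__ == "__main__":
    first = logistic_orbit(0.3, 3.9, 200)
    second = logistic_orbit(0.3 + 1e-9, 3.9, 200)
    # Largest gap between two orbits whose seeds differ by one part in a billion.
    print(max(abs(p - q) for p, q in zip(first, second)))
```

Two seeds that differ only in the ninth decimal place separate to a macroscopic distance within a few dozen iterations, which is the behaviour a chaotic digest function uses to spread small input changes.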

Baptista (1998) proposed an encryption scheme that encrypts each character of the message using the logistic map function. The same model was later adopted by Teh et al. (2019) for the design of a chaos-based keyed digest function. Table 3 shows the comparative analysis of PSSPF on confusion and diffusion against the standard algorithms and CHF-512 for 2 K data. Each entry of Table 3 is the mean of 500 samples. The comparative analyses of this test are presented in Figs. 11 to 17. The result shows that the PSSPF meets the strict avalanche criterion. In addition, the proposed algorithm stands next to the SHA2-224 algorithm and outperforms the other standard algorithms on the avalanche property. The results presented in Figs. 9 and 10 show that the avalanche response of PSSPF is random and spreads over the entire length of the hash output. Fig. 11 shows that the average avalanche response of PSSPF is 50.21, the second-best among the standard digest functions. Fig. 12 shows that the PSSPF outperforms the other 512-bit digest functions on the avalanche property. Fig. 13 shows that the average near-collision of PSSPF is 49.79, the lowest among the 512-bit digest functions. Fig. 14 presents the comparative analysis of PSSPF with the standard algorithms through the confusion and diffusion analysis. Fig. 15 displays the comparative analysis of PSSPF with the other 512-bit algorithms. Fig. 16 shows that the avalanche response of PSSPF for a single input bit-flip is 50.39, the second-highest among the standard algorithms. Fig. 17 shows that the avalanche response for a single input byte-flip is 50.20, the third-highest among the standard digest functions. The results gathered from the above experiment reinforce the fact that the PSSPF exhibits a better avalanche response than the conven-

Please cite this article as: P. Karthik and P. Shanthi Bala, A new design paradigm for provably secure keyless hash function with subsets and two variables polynomial function, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2019.10.003

S. No

Modifying Location

Input String

ive you the brewer all the information you need to strike off into unknown territory, and to help you become the ultimate br

Reference Hash (H)

cabe285327a1f336fa56275d259208600e22a872fbbd5c89a26766589819b8653f982b69d2a0a84433f26647b400e07500185b1424 b07e80459448a1e2c1b39f

Reference Hash(B)

1100101010111110001010000101001100100111101000011111001100110110111110100101011000100111010111010010 01011001001000001000011000000000111000100010101010000111001011111011101111010101110010001001101000100110011101100110010110001001 10000001100110111000011001010011111110011000001010110110100111010010101000001010100001000100001100111111001001100110010001111011 01000000000011100000011101010000000000011000010110110001010000100100101100000111111010000000010001011001010001001000101000011110 0010110000011011001110011111 b0cbc969a94829a17f28e5e883c1a670b5822218f331877bdf7d8935efe5361d5c49280690d c8f67a30754ea807a85e3844f774ec54de633084157d6c1dd440c 1011000011001011110010010110100110101001010010000010100110100001011111110010100011100101 11101000100000111100000110100110011100001011010110000010001000100001100011110011001100011000011101111011110111110111110110001001001101011110 1111111001010011011000011101010111000 100100100101000000001101001000011011100100011110110011110100011000001110101010011101010100 000000111101010000101111000111000010001001111011101110100111011000101010011011110011000110011000010000100000101010111110101101 1000001110111010100010000001100 d8699f7d08dc38f331f5aa55d7fc9e74ebc0f55bda364b538b97f9341c7991234cef0dac3d098431ee205b1c1cd30209cf6d 3b7395eed355645ebd165fd0db5f 11011000011010011001111101111101000010001101110000111000111100110011000111110101101010100 101010111010111111111001001111001110100111010111100000011110101010110111101101000110110 010010110101001110001011100101111111100100110100000111000111100110010001001000110100110 011101111000011011010110000111101000010011000010000110001111011100010000001011011000111 0000011100110100110000001000001001110011110110110100111011011100111001010111101110110100110101010101100100010111101011 11010001011001011111110100001101101101011111 e715424ca0d0df139785c7a35f7f4521974f651abd212b261f6513891f 898553c23d9e0a4b1b6ac8b6ead0b00b6497acdd02e4b4b17713ac5d88fbc271041f00 1110011100010101010000100100110010 
10000011010000110111110001001110010111100001011100011110100011010111110111111101000 1010010000110010111010011110110010100011010101111010010000100101011001001100001111101100101000100111000100100011111 1000100110000101010100111100001000111101100111100000101001001011000110110110101011001000101101101110101011010000101 1000000001011011001001001011110101100110111010000001011100100101101001011000101110111000100111010110001011101100010 00111110111100001001110001000001000001111100000000

1

3

Modified Hash (H) Modified Hash(B)

2

65

Modified Hash(H) Modified Hash(B)

3

114

Modified Hash(H) Modified Hash(B)

Avalanche Response in %

51.04

52.08

54.68

P. Karthik, P. Shanthi Bala / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx 7

Please cite this article as: P. Karthik and P. Shanthi Bala, A new design paradigm for provably secure keyless hash function with subsets and two variables polynomial function, Journal of King Saud University – Computer and Information Sciences, https://doi.org/10.1016/j.jksuci.2019.10.003

Table 2 Sample response of PSSPF on Confusion and Diffusion analysis for a single bit flip at different locations.

8

P. Karthik, P. Shanthi Bala / Journal of King Saud University – Computer and Information Sciences xxx (xxxx) xxx

Fig. 9. Location history of PSSPF on avalanche response. Fig. 11. Comparative Analysis of cryptographic digest functions on average avalanche response.

Fig. 10. Magnified view of the location history of PSSPF on avalanche response.

tional algorithms. Therefore, performing a differential attack on the PSSPF would continue to stay an extremely arduous task.
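The avalanche percentages discussed above are bit-level differences between a reference digest and the digest of a minimally modified input. The following is a minimal sketch of such a measurement, not the authors' test harness. Since no PSSPF implementation is available in this section, SHA-512 from Python's hashlib stands in for the hash under test, and the helper names (bit_difference_percent, flip_bit, avalanche_percent) are hypothetical.

```python
import hashlib


def bit_difference_percent(digest_a: bytes, digest_b: bytes) -> float:
    """Percentage of bit positions in which two equal-length digests differ."""
    if len(digest_a) != len(digest_b):
        raise ValueError("digests must have the same length")
    # XOR exposes the differing bits; count the 1-bits byte by byte.
    differing = sum(bin(a ^ b).count("1") for a, b in zip(digest_a, digest_b))
    return 100.0 * differing / (8 * len(digest_a))


def flip_bit(message: bytes, bit_index: int) -> bytes:
    """Return a copy of message with one bit flipped (bit 0 = LSB of byte 0)."""
    data = bytearray(message)
    data[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(data)


def avalanche_percent(message: bytes, bit_index: int, hash_name: str = "sha512") -> float:
    """Avalanche response for a single-bit flip, using a stand-in hash (assumption)."""
    reference = hashlib.new(hash_name, message).digest()
    modified = hashlib.new(hash_name, flip_bit(message, bit_index)).digest()
    return bit_difference_percent(reference, modified)


if __name__ == "__main__":
    sample = b"sample input string for the avalanche test"
    for location in (3, 65, 114):  # the modifying locations listed in Table 2
        print(location, round(avalanche_percent(sample, location), 2))
```

Under this definition the near-collision percentage used later in the paper is the complement, 100 minus the avalanche percentage, which agrees with the reported pair 50.21% and 49.79%.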

Fig. 12. Comparative Analysis of 512-bits digest functions on average avalanche response.

4.3.2. Analysis of the effect of avalanche on hash output
To conduct this test, an input string was randomly chosen from a pool of comparably sized messages. The input bits were randomly varied and the respective output nibbles were compared with a reference value. Table 4 demonstrates the effect of the avalanche on the PSSPF output nibbles. Each entry of Table 4 is the mean of 25 samples. The result indicates that the average effect of the avalanche on the PSSPF output nibbles is 93.81%. It shows that the average bit-level avalanche response of 50.21% affects the output of the PSSPF over its entire length. The graphical response of this test is presented in Fig. 18. The comparative analysis of digest functions for the change of

Table 3 The comparative analysis of PSSPF on Confusion and Diffusion Analysis for 2 K data (avalanche response in %).

S.No | Bits/Bytes | PSSPF-512 | SHA2 224/256 | SHA2-256 | SHA2 384/512 | SHA2-512 | SHA3 224/512 | SHA3 256/512 | SHA3 384/512 | SHA3-512 | CHF-512
1 | 1 bit | 50.39 | 50.44 | 50.39 | 50 | 50 | 49.553 | 50 | 49.73 | 50 | 49.76
2 | 2 bits | 50.20 | 50.44 | 50.39 | 49.73 | 50.19 | 50 | 49.6 | 50 | 49.8 | 49.13
3 | 3 bits | 50.20 | 50 | 50.39 | 50.26 | 50 | 50 | 50 | 50.26 | 49.8 | 49.49
4 | 4 bits | 50.20 | 50 | 50 | 50 | 50.19 | 50 | 50 | 50 | 49.8 | 49.64
5 | 5 bits | 50.20 | 50.44 | 50.39 | 50 | 50 | 50 | 50 | 50 | 50 | 49.71
6 | 6 bits | 50.20 | 50.44 | 50 | 50 | 50 | 50 | 49.6 | 50 | 50 | 49.59
7 | 7 bits | 50.39 | 50.44 | 50.39 | 50 | 49.8 | 50 | 50 | 49.73 | 50 | 50.17
8 | 1 byte | 50.20 | 50.44 | 50 | 50.26 | 50 | 50 | 50 | 50 | 50 | 49.69
9 | 2 bytes | 50.20 | 50.44 | 50 | 50 | 50.19 | 50 | 49.6 | 50 | 50 | 50.06
10 | 4 bytes | 50.00 | 50.44 | 50.39 | 50 | 50 | 50 | 50 | 50.26 | 50.19 | 49.94
11 | 8 bytes | 50.00 | 50 | 50 | 50.26 | 50 | 50 | 50 | 50.26 | 50 | 50.01
12 | 16 bytes | 50.39 | 50 | 50.39 | 50.26 | 50 | 50 | 50 | 50 | 50.19 | 50.00
13 | 32 bytes | 50.20 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 49.93
Average Avalanche Response (%) | | 50.21 | 50.27 | 50.21 | 50.06 | 50.03 | 49.97 | 49.91 | 50.02 | 49.98 | 49.78


Fig. 13. Comparative Analysis of 512-bits digest functions on average near-collision response.


Fig. 14. The comparative analysis of PSSPF on Avalanche Property through Confusion and Diffusion.

Fig. 15. The comparative analysis of Confusion and Diffusion on 512-bits algorithms.

Fig. 16. The comparative analysis of Confusion and Diffusion for 1 bit-flip.

Fig. 17. The comparative analysis of Confusion and Diffusion for 1 byte-flip.

output nibbles due to the avalanche effect is presented in Table 5. The entries of Table 5 are the mean of 500 samples using 2 K data. Fig. 19 shows a sample location history of matched nibbles for a single bit flip in the input message. It shows that only 8 of the 128 nibbles matched, at distinct locations. Fig. 20 shows the comparative analysis of PSSPF on the average change of output nibbles due to the avalanche effect. The average change of output nibbles of PSSPF is 93.74%. This value matches the performance of SHA-2 and exceeds that of the other standard digest functions. However, it is 0.07 percentage points lower than that of the chaotic hash function. Fig. 21 presents the comparative analysis of the digest functions on the avalanche effect for diversified input. Fig. 22 shows the comparative analysis of 512-bit digest functions on this test for diversified input at randomly selected locations. Fig. 23 presents the comparative analysis of the average change of output nibbles for the 512-bit digest functions. Fig. 24 presents the comparative analysis of this test for a single input bit-flip. The collected results show that the PSSPF consistently performs well in producing arbitrary output nibbles through the avalanche property. The entries of Table 5 show that the average change of output nibbles for the PSSPF due to the avalanche effect is 93.74%. This value stands second among the cryptographic digest functions. Surprisingly, the CHF-512 outperforms all the other cryptographic hash functions in this test. The avalanche response of CHF-512 at the binary level is the lowest among the cryptographic digest functions, yet it performs well in grouping the binary bits to produce distinct output nibbles. However, the results presented in Fig. 24 show the reduced response of CHF-512 for a single bit flip in the input.
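The nibble-level comparison described above can be sketched as follows: two hexadecimal digests are compared position by position, and a nibble counts as matched only when the hexadecimal digit at that position is identical. The function name nibble_match_stats and the NibbleStats record are hypothetical; this is an illustrative sketch rather than the authors' tooling.

```python
from typing import List, NamedTuple


class NibbleStats(NamedTuple):
    matched: int            # nibbles identical in both digests
    unmatched: int          # nibbles that differ
    change_percent: float   # share of differing nibbles, in percent
    positions: List[int]    # index positions of the matched nibbles


def nibble_match_stats(reference_hex: str, modified_hex: str) -> NibbleStats:
    """Compare two equal-length hexadecimal digests nibble by nibble."""
    if len(reference_hex) != len(modified_hex):
        raise ValueError("digests must have the same number of nibbles")
    pairs = zip(reference_hex.lower(), modified_hex.lower())
    positions = [i for i, (a, b) in enumerate(pairs) if a == b]
    total = len(reference_hex)
    matched = len(positions)
    return NibbleStats(matched, total - matched,
                       100.0 * (total - matched) / total, positions)


if __name__ == "__main__":
    # Toy 4-nibble digests; a 512-bit digest would have 128 nibbles.
    print(nibble_match_stats("abcd", "abff"))
```

For a 512-bit digest, 8 matched nibbles out of 128 corresponds to a change of 93.75%, which is the order of magnitude reported in Table 5.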


Table 4 Effect of Avalanche on PSSPF hash output (effect of avalanche in the PSSPF hash output, in %, by number of bits/bytes changed).

S.No | Data Size in Bytes | 1b | 1B | 2B | 3B | 4B | 5B | 6B | 7B | 8B | 9B | 10 B | >10 B
1 | 64 B | 93.75 | 89.84 | 93.75 | 96.87 | 95.31 | 95.31 | 93.75 | 91.4 | 94.53 | 92.97 | 95.31 | 96.09
2 | 128 B | 92.98 | 96.09 | 93.75 | 92.18 | 92.18 | 94.53 | 97.65 | 94.53 | 93.75 | 93.75 | 92.18 | 93.75
3 | 256 B | 93.75 | 90.63 | 96.09 | 93.75 | 94.53 | 93.75 | 92.18 | 93.75 | 95.31 | 92.97 | 92.18 | 93.75
4 | 512 B | 89.84 | 96.09 | 92.19 | 95.31 | 90.625 | 92.19 | 93.75 | 92.18 | 93.75 | 96.09 | 92.18 | 94.53
5 | 1 k | 89.84 | 94.53 | 90.63 | 92.97 | 92.18 | 94.53 | 93.75 | 95.31 | 97.65 | 95.31 | 94.53 | 93.75
6 | 2 k | 95.31 | 92.97 | 96.09 | 91.41 | 94.531 | 97.66 | 91.4 | 89.06 | 94.53 | 89.84 | 95.31 | 94.53
7 | 4 k | 91.41 | 92.13 | 92.19 | 96.86 | 93.75 | 93.75 | 91.4 | 94.53 | 94.53 | 94.53 | 98.43 | 94.53
8 | 8 k | 94.53 | 95.31 | 96.09 | 95.31 | 90.63 | 93.75 | 93.75 | 97.66 | 92.97 | 92.96 | 92.18 | 92.97
9 | 16 k | 91.4 | 95.31 | 93.75 | 91.4 | 95.31 | 97.88 | 95.31 | 92.97 | 94.53 | 95.31 | 96.86 | 96.86
10 | 20 k | 88.28 | 94.53 | 91.41 | 95.31 | 93.75 | 92.18 | 93.75 | 93.75 | 96.09 | 92.18 | 92.97 | 92.18
11 | 25 k | 91.4 | 98.44 | 91.4 | 95.31 | 95.31 | 95.31 | 92.97 | 95.31 | 92.97 | 96.09 | 95.31 | 92.97
12 | >25 k | 91.4 | 96.09 | 92.18 | 94.53 | 94.53 | 89.84 | 95.31 | 92.97 | 94.53 | 92.97 | 92.97 | 94.53
Average % | | 91.99 | 94.33 | 93.29 | 94.27 | 93.55 | 94.22 | 93.75 | 93.62 | 94.60 | 93.75 | 94.20 | 94.20

Fig. 18. Graphical response of Avalanche effect.

4.4. Analysis of near-collision resistance
Near-collision resistance checks for partial collisions among the hash values. Accordingly, for any two input strings x and y such that x ≠ y, their hash values H(x) and H(y) should hardly ever differ in only a small number of bits. To inspect this property, three tests were conducted, as described below.

4.4.1. Statistical analysis of confusion and diffusion for near-collision resistance
This test was performed to investigate the behavior of near-collision on hash outputs at the binary level. To conduct this test, an input string was randomly chosen and modified several times to produce different hash outputs. The outputs were compared with a reference value for possible matches at the binary level. Table 6 presents the comparative analysis of confusion and diffusion on near-collision. Table 7 presents the effect of near-collision on the output nibbles for the digest functions. The entries of Tables 6 and 7 are the mean of 500 samples for 2 K data. Fig. 25 shows the comparative analysis of PSSPF with standard digest functions on near-collision resistance. Fig. 26 presents the comparative analysis of this test for 512-bit digest functions. Fig. 27 displays the comparative analysis of the average near-collision response of the PSSPF. The results show that the near-collision response of the PSSPF is the second-lowest among the standard digest functions and the lowest among the 512-bit digest functions. Fig. 28 presents the comparative analysis of PSSPF Vs SHA2-512 on near-collision response. Figs. 29 and 30 show the comparative analyses of this test for PSSPF Vs SHA3-512 and PSSPF Vs CHF-512 respectively. The results show that the PSSPF outperforms the SHA2-512, SHA3-512, and CHF-512 digest functions on near-collision resistance.

4.4.2. The effect of near-collision on the output nibbles of PSSPF for randomly selected messages
The effect of near-collision on the output nibbles of PSSPF was inspected with randomly selected input messages. Table 8 presents

Table 5 A Comparative analysis of digest functions on change of output nibbles due to avalanche effect using 2 K data (change in %).

S. No | Bits/Bytes | PSSPF-512 | SHA2 224/256 | SHA2-256 | SHA2 384/512 | SHA2-512 | SHA3 224/512 | SHA3 256/512 | SHA3 384/512 | SHA3-512 | CHF-512
1 | 1 bit | 93.70 | 93.08 | 93.88 | 93.71 | 93.89 | 93.91 | 93.66 | 93.75 | 93.69 | 93.26
2 | 2 bits | 93.82 | 93.19 | 93.52 | 93.78 | 93.88 | 93.89 | 93.67 | 93.72 | 93.75 | 93.16
3 | 3 bits | 93.76 | 93.22 | 93.57 | 93.69 | 93.66 | 93.52 | 93.75 | 93.56 | 93.69 | 94.07
4 | 4 bits | 93.77 | 93.11 | 93.79 | 93.79 | 93.7 | 93.77 | 93.61 | 93.69 | 93.71 | 93.76
5 | 5 bits | 93.81 | 92.99 | 93.49 | 93.89 | 93.77 | 93.45 | 93.5 | 93.84 | 93.63 | 94.3
6 | 6 bits | 93.89 | 92.98 | 93.62 | 93.67 | 93.78 | 93.88 | 93.46 | 93.47 | 93.69 | 93.96
7 | 7 bits | 93.68 | 92.91 | 93.58 | 93.56 | 93.63 | 93.37 | 93.71 | 93.61 | 93.45 | 93.9
8 | 1 byte | 93.65 | 93.34 | 93.63 | 93.63 | 93.77 | 93.7 | 93.85 | 93.62 | 93.83 | 94.16
9 | 2 bytes | 93.85 | 92.75 | 93.89 | 93.73 | 93.6 | 93.67 | 93.76 | 93.69 | 93.75 | 93.84
10 | 4 bytes | 93.69 | 93.22 | 93.87 | 93.52 | 93.85 | 93.71 | 93.81 | 93.54 | 93.73 | 93.93
11 | 8 bytes | 93.83 | 92.96 | 93.5 | 93.87 | 93.71 | 93.86 | 93.76 | 93.59 | 93.8 | 93.69
12 | 16 bytes | 93.50 | 92.79 | 93.77 | 93.57 | 93.82 | 93.58 | 93.63 | 93.85 | 93.75 | 93.79
13 | 32 bytes | 93.66 | 93.39 | 93.73 | 93.71 | 93.53 | 93.66 | 93.52 | 93.76 | 93.56 | 93.67
Average Change in the output nibbles (%) | | 93.74 | 93.07 | 93.68 | 93.70 | 93.74 | 93.69 | 93.67 | 93.67 | 93.69 | 93.81


Fig. 19. Location history of PSSPF for the matched nibbles with index positions.

Fig. 20. A comparative analysis of digest functions on average change of output nibbles due to avalanche effect.

Fig. 21. Comparative analysis of digest functions on the effect of Avalanche at the output nibbles.

Fig. 22. Comparative analysis of 512-bits digest functions on the effect of avalanche at the output nibbles.

Fig. 23. Comparative analysis of 512-bits digest functions on the average change of output nibbles.

Fig. 24. Comparative analysis of digest functions on the change of output nibbles for a single input bit flip.

the near-collision effect on the output nibbles of the PSSPF. Each entry of Table 8 is the mean of 25 comparisons. Fig. 31 shows the graphical response of PSSPF for this test. The average number of matched nibbles is 6.25, whereas the average number of unmatched nibbles is 121.75. The result shows that the approximate change of 50% of the output bits did not produce a comparable degree of similarity in the nibbles of the modified hash output: the table entries show that the nibble-level similarity is only 4.88%. The analysis of the result establishes two facts. First, the changes in the output bits are spread uniformly over the entire length of the digest. Second, the intense avalanche in the modified output nibbles arises because a nibble matches only when all four of its consecutive bits are identical to the reference output bits.
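The second fact can be made concrete with a short calculation. The block below is a sketch under an idealized assumption that is not stated in the paper: after an input change, each output bit flips independently with probability 1/2.

```latex
% Idealized model (assumption): every output bit flips independently
% with probability 1/2; a nibble is 4 consecutive bits; a 512-bit
% digest contains 128 nibbles.
\begin{align*}
P(\text{nibble unchanged}) &= \left(\tfrac{1}{2}\right)^{4} = \tfrac{1}{16} = 6.25\% \\
P(\text{nibble changed})   &= 1 - \tfrac{1}{16} = 93.75\% \\
E[\text{matched nibbles}]  &= 128 \cdot \tfrac{1}{16} = 8
\end{align*}
```

Under this model a bit-level avalanche near 50% corresponds to a nibble-level change near 93.75%, which is consistent with the averages of about 93.7% reported in Table 5 and with the small number of matched nibbles in Table 8.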


Table 6 A comparative analysis of Confusion and Diffusion on near-collision resistance for 2 K data (near collision in %).

S.No | Bits/Bytes | PSSPF | SHA2 224/256 | SHA2-256 | SHA2 384/512 | SHA2-512 | SHA3 224/512 | SHA3 256/512 | SHA3 384/512 | SHA3-512 | CHF-512
1 | 1 bit | 49.60 | 49.56 | 49.61 | 50 | 50 | 50.447 | 50 | 50.27 | 50 | 50.24
2 | 2 bits | 49.80 | 49.56 | 49.61 | 50.27 | 49.81 | 50 | 50.4 | 50 | 50.2 | 50.87
3 | 3 bits | 49.80 | 50 | 49.61 | 49.74 | 50 | 50 | 50 | 49.74 | 50.2 | 50.51
4 | 4 bits | 49.80 | 50 | 50 | 50 | 49.81 | 50 | 50 | 50 | 50.2 | 50.36
5 | 5 bits | 49.80 | 49.56 | 49.61 | 50 | 50 | 50 | 50 | 50 | 50 | 50.29
6 | 6 bits | 49.80 | 49.56 | 50 | 50 | 50 | 50 | 50.4 | 50 | 50 | 50.41
7 | 7 bits | 49.60 | 49.56 | 49.61 | 50 | 50.2 | 50 | 50 | 50.27 | 50 | 49.83
8 | 1 byte | 49.80 | 49.56 | 50 | 49.74 | 50 | 50 | 50 | 50 | 50 | 50.31
9 | 2 bytes | 49.80 | 49.56 | 50 | 50 | 49.81 | 50 | 50.4 | 50 | 50 | 49.94
10 | 4 bytes | 50 | 49.56 | 49.61 | 50 | 50 | 50 | 50 | 49.74 | 49.81 | 50.06
11 | 8 bytes | 50 | 50 | 50 | 49.74 | 50 | 50 | 50 | 49.74 | 50 | 49.99
12 | 16 bytes | 49.60 | 50 | 49.61 | 49.74 | 50 | 50 | 50 | 50 | 49.81 | 50.00
13 | 32 bytes | 49.80 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50.07
Average Near Collision % | | 49.79 | 49.73 | 49.79 | 49.94 | 49.97 | 50.03 | 50.09 | 49.98 | 50.02 | 50.22

Table 7 The comparative analysis of near-collision effect with standard algorithms for 2 K data (number of nibbles matched in the output, in %).

S.No | Bits/Bytes | PSSPF-512 | SHA2 224/256 | SHA2-256 | SHA2 384/512 | SHA2-512 | SHA3 224/512 | SHA3 256/512 | SHA3 384/512 | SHA3-512 | CHF-512
1 | 1 bit | 6.30 | 6.92 | 6.12 | 6.29 | 6.11 | 6.09 | 6.34 | 6.25 | 6.31 | 6.74
2 | 2 bits | 6.18 | 6.81 | 6.48 | 6.22 | 6.12 | 6.11 | 6.33 | 6.28 | 6.25 | 6.84
3 | 3 bits | 6.24 | 6.78 | 6.43 | 6.31 | 6.34 | 6.48 | 6.25 | 6.44 | 6.31 | 5.93
4 | 4 bits | 6.23 | 6.89 | 6.21 | 6.21 | 6.30 | 6.23 | 6.39 | 6.31 | 6.29 | 6.24
5 | 5 bits | 6.19 | 7.01 | 6.51 | 6.11 | 6.23 | 6.55 | 6.50 | 6.16 | 6.37 | 5.70
6 | 6 bits | 6.11 | 7.02 | 6.38 | 6.33 | 6.22 | 6.12 | 6.54 | 6.53 | 6.31 | 6.04
7 | 7 bits | 6.32 | 7.09 | 6.42 | 6.44 | 6.37 | 6.63 | 6.29 | 6.39 | 6.55 | 6.10
8 | 1 byte | 6.35 | 6.66 | 6.37 | 6.37 | 6.23 | 6.30 | 6.15 | 6.38 | 6.17 | 5.84
9 | 2 bytes | 6.15 | 7.25 | 6.11 | 6.27 | 6.40 | 6.33 | 6.24 | 6.31 | 6.25 | 6.16
10 | 4 bytes | 6.31 | 6.78 | 6.13 | 6.48 | 6.15 | 6.29 | 6.19 | 6.46 | 6.27 | 6.07
11 | 8 bytes | 6.17 | 7.04 | 6.50 | 6.13 | 6.29 | 6.14 | 6.24 | 6.41 | 6.20 | 6.31
12 | 16 bytes | 6.50 | 7.21 | 6.23 | 6.43 | 6.18 | 6.42 | 6.37 | 6.15 | 6.25 | 6.21
13 | 32 bytes | 6.34 | 6.61 | 6.27 | 6.29 | 6.47 | 6.34 | 6.48 | 6.24 | 6.44 | 6.33
Average match in the Output Nibbles (%) | | 6.26 | 6.93 | 6.32 | 6.30 | 6.26 | 6.31 | 6.33 | 6.33 | 6.31 | 6.19

Fig. 25. A comparative analysis of Confusion and Diffusion on near-collision response for 2 K data.

Fig. 26. A comparative analysis of 512-bits Digest functions on near-Collision response for 2 K data.

4.4.3. The near-collision effect on PSSPF output nibbles for randomly varied input bits/bytes
To conduct this test, a sample input string was taken and varied randomly by a minimal change, as presented in Table 9. The entries of the table are the mean of 25 samples. The behavior of the output nibbles of PSSPF was analyzed for the near-collision effect. Fig. 32 presents the graphical response of the output nibbles for the near-collision effect. The average number of matched nibbles is 6.42. The average change of output nibbles due to the avalanche effect is 94.99%. This value stands ahead of the conventional digest functions. Therefore, the PSSPF exhibits stiff resistance to near-collisions.
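The averages quoted above follow directly from the matched-nibble counts of Table 9. The short check below reproduces them; the variable and function names are hypothetical, and the only inputs are the twelve "No of Nibbles Matched" values from Table 9 and the 128 nibbles of a 512-bit digest.

```python
from statistics import mean

TOTAL_NIBBLES = 128  # a 512-bit digest holds 512 / 4 hexadecimal nibbles

# "No of Nibbles Matched" column of Table 9 (12 test cases).
MATCHED_NIBBLES = [7, 3, 8, 6, 9, 3, 7, 7, 5, 8, 7, 7]


def similarity_percent(matched: int, total: int = TOTAL_NIBBLES) -> float:
    """Share of nibbles that still match the reference digest, in percent."""
    return 100.0 * matched / total


avg_matched = mean(MATCHED_NIBBLES)
avg_similarity = mean(similarity_percent(m) for m in MATCHED_NIBBLES)
avg_avalanche = 100.0 - avg_similarity  # nibble-level avalanche is the complement

print(round(avg_matched, 2), round(avg_similarity, 2), round(avg_avalanche, 2))
# prints: 6.42 5.01 94.99
```

The three values agree with the "Average Value" row of Table 9 (6.42, 5.01 and 94.99).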


Fig. 27. A comparative analysis of the Digest functions on average near-collision response.

Fig. 28. Near-collision response: PSSPF-512 Vs SHA2-512.

Fig. 29. Near-collision response: PSSPF-512 Vs SHA3-512.

Fig. 30. Near-collision response: PSSPF-512 Vs CHF-512.

4.5. Analysis of runtime response
The PSSPF-512 was compared with the standard digest functions for runtime performance. Table 10 presents a comparative analysis of the runtime performance of various hash functions. Each entry of Table 10 is the mean of 25 samples. Fig. 33 shows a comparative analysis of the runtime performance of standard digest functions. Fig. 34 presents a comparative analysis of 512-bit digest functions. The result shows that the runtime response of PSSPF varies linearly with the message size. This happens because an increase in the message size increases the number of blocks to be processed, which in turn demands more clock cycles from the CPU to process each block using the polynomial function. The table entries show that the chaotic hash function is very slow in producing message digests. On average, the PSSPF operates 6.598 times faster than CHF-512 for data sizes less than or equal to 20 K, and for the 20 K data it operates 9.58 times faster than CHF-512. This happens because the linear logistic map function, when iterated n times, produces a polynomial function of degree n. This is the reason for the reduced performance of the chaotic function. In the proposed design, by contrast, the two variables polynomial function is applied to the message blocks rather than the cells. Therefore, the PSSPF operates faster than the CHF-512 digest function. However, the result shows that both PSSPF-512 and CHF-512 operate much slower than the standard digest functions.
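The speed-up factors quoted above can be checked against the PSSPF-512 and CHF-512 columns of Table 10. The sketch below does that arithmetic and also shows the runtime-to-clock-cycle conversion used in the cost-benefit analysis of Section 4.5.1, which assumes a 4 GHz processor. The names speedup and clock_cycles are hypothetical helpers introduced for illustration.

```python
from statistics import mean

# Runtimes in milliseconds copied from Table 10 (64 B ... 20 K, 16 data sizes).
PSSPF_MS = [10.6, 13.6, 19.4, 27.72, 30.72, 44.32, 57.4, 70.92,
            79.2, 92.08, 101.68, 112.88, 120.52, 118.6, 158.24, 169.96]
CHF_MS = [27.12, 49.16, 78.72, 102.4, 150.6, 237.12, 310.68, 404.04,
          472.48, 513.72, 581.4, 675.36, 734.16, 810.36, 1326.84, 1628.32]

CLOCK_HZ = 4_000_000_000  # 4 GHz processor, as stated in Section 4.5.1


def speedup(slow_ms: float, fast_ms: float) -> float:
    """How many times faster the second runtime is than the first."""
    return slow_ms / fast_ms


def clock_cycles(runtime_ms: float, clock_hz: int = CLOCK_HZ) -> int:
    """Clock cycles consumed during runtime_ms at the given clock rate."""
    return round(runtime_ms * clock_hz / 1000)


avg_ratio = speedup(mean(CHF_MS), mean(PSSPF_MS))   # average over all sizes
ratio_20k = speedup(CHF_MS[-1], PSSPF_MS[-1])       # 20 K input only
psspf_cycles_20k = clock_cycles(PSSPF_MS[-1])

print(f"{avg_ratio:.3f} {ratio_20k:.2f} {psspf_cycles_20k:,}")
```

With the table values, the average ratio comes out at about 6.6 and the 20 K ratio at about 9.58, and the 20 K PSSPF runtime converts to 679,840,000 cycles, in line with the figures reported in the text and in Table 11.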

4.5.1. Cost-benefit analysis
The cost of the system is typically derived from the memory requirement and the clock cycles. The PSSPF requires an additional 36, 32 and 16 bytes of memory when it is considered as an alternative to 224, 256 and 384-bit digest functions respectively. Similarly, it neither costs nor saves memory when considered against the 512-bit digest functions. This memory overhead is very small and can therefore be neglected in the cost-benefit analysis. The clock cycles needed for producing a message digest on a 4 GHz processor are presented in Table 11. At 4 GHz, the processor delivers 4,000,000,000 clock cycles per second, that is, 4,000,000 clock cycles per millisecond. The entries of Table 11 are derived from the runtime response of the digest functions for input data of size 20 K. The PSSPF needs 679,840,000 clock cycles to produce a message digest. This value is the second-highest among the cryptographic digest functions. However, it is 9.58 times lower than that of CHF-512 and 55.18 times higher than that of SHA3-512. SHA2-512 consumes the fewest clock cycles among the 512-bit algorithms; its value is 137.06 times smaller than that of the PSSPF-512. The cost analysis shows that both PSSPF-512 and CHF-512 demand substantial computational power for producing a message digest. However, decoding the PSSPF-512 and the CHF-512 remains a computationally infeasible task and would demand an arduous effort from a cryptanalyst. The SHA2 and SHA3 families are efficient in terms of response time. But the partial breaking of the SHA2 and SHA3 families shows that these algorithms are vulnerable to cryptographic attacks. Therefore, they fall short from the perspective of security.

Table 8 PSSPF response on near-collision resistance for randomly selected messages.

S. No | Data Size in Bytes | No of Nibbles Matched | Effect of Near-Collision on Hash nibbles (%) | No of Nibbles not Matched | Avalanche % on Nibbles
1 | 64 | 8 | 6.25 | 120 | 93.75
2 | 128 | 7 | 5.47 | 121 | 94.53
3 | 256 | 7 | 5.47 | 121 | 94.53
4 | 512 | 9 | 7.03 | 119 | 92.97
5 | 1 k | 8 | 6.25 | 120 | 93.75
6 | 2 k | 6 | 4.69 | 122 | 95.31
7 | 4 k | 8 | 6.25 | 120 | 93.75
8 | 8 k | 3 | 2.34 | 125 | 97.66
9 | 16 k | 4 | 3.13 | 124 | 96.88
10 | 20 k | 5 | 3.91 | 123 | 96.09
11 | 25 k | 6 | 4.69 | 122 | 95.31
12 | >25 k | 4 | 3.13 | 124 | 96.88
Average Value | | 6.25 | 4.88 | 121.75 | 95.12
Findings: PSSPF meets the Strict Avalanche Criterion and responds well to Near-Collisions.

Fig. 31. Graphical representation of near-collision resistance for randomly selected messages.

Table 9 Response of PSSPF on the effect of near-collision.

S. No | Data Size in Bytes | No of Bits/Bytes Changed | No of Nibbles Matched | Similarity % Observed | No of Nibbles not Matched | Avalanche % on Nibbles
1 | 64 | 1b | 7 | 5.47 | 121 | 94.53
2 | 128 | 2b | 3 | 2.34 | 125 | 97.66
3 | 256 | 4b | 8 | 6.25 | 120 | 93.75
4 | 512 | 1B | 6 | 4.69 | 122 | 95.31
5 | 1 k | 2B | 9 | 7.03 | 119 | 92.97
6 | 2 k | 4B | 3 | 2.34 | 125 | 97.66
7 | 4 k | 8B | 7 | 5.47 | 121 | 94.53
8 | 8 k | 16 B | 7 | 5.47 | 121 | 94.53
9 | 16 k | 32 B | 5 | 3.91 | 123 | 96.09
10 | 20 k | 64 B | 8 | 6.25 | 120 | 93.75
11 | 40 k | 128 B | 7 | 5.47 | 121 | 94.53
12 | >40 k | 256 B | 7 | 5.47 | 121 | 94.53
Average Value | | | 6.42 | 5.01 | 121.58 | 94.99
Findings: The PSSPF exhibits stiff resistance to Near Collisions.

Fig. 32. Graphical response on the effect of near-collision.

5. Discussion
The PSSPF is a 512-bit algorithm. Therefore, an adversary would need to perform 2^512 worst-case comparisons to detect a collision using brute force. Similarly, a birthday attack would need 2^256 worst-case comparisons to detect a collision against the PSSPF. Both tasks are computationally infeasible in practice. The probability of success for a random attack would be 1/2^512. This value is negligibly small. Therefore, performing generic attacks against the PSSPF is exceedingly hard.

A hash function takes an input string of arbitrary length and produces a fixed-length output. However, this holds only when the input size is limited to 2^64 or 2^128 bits, because the standard algorithms employ a static reservation policy for storing the length attribute of the input string. The conventional keyless hash functions like MD2, MD5, SHA-160, SHA-224/256, SHA-256, SHA-224/512, SHA-256/512, SHA-384/512, and SHA-512 reserve the last 64/128 bits for storing the length attribute of the input string. This design paradigm introduces the following concerns.

• Processing an input string of length greater than 2^64/2^128 bits is hard.
• These hash functions have to process an additional block when the size of the data in the final block exceeds 448 bits and 896 bits for the block sizes of 512 bits and 1024 bits respectively.

The block-iterated hash functions MD2, MD5, SHA-160, and the SHA2 family use bitwise AND, OR, XOR and MOD operators for processing the inputs at the block level. The SHA3 family uses XOR, MOD, CEIL(x), Log2(x), and Truncs(x) operators/operations to process the inputs at the block level. However, the breaking or partial breaking of MD5, SHA1, the SHA2 family and the SHA3 family raises concerns about the use of these operators/operations at the block level (Yu and Wang, 2009; Dinur et al., 2013). In contrast, the PSSPF addresses these concerns through its newly designed paradigm. The padding rule is subtly modified to eliminate the shortfall of the conventional design: it allocates the required space for the length attribute at runtime. This enables the PSSPF to process any message without length restrictions. Likewise, the PSSPF uses the XOR operator minimally and forbids the use of other bitwise operators. It employs a higher-order polynomial function at the block level to produce the intermediate hash output. Inverting a polynomial function of degree 128 remains a computationally infeasible task. Therefore, the new design would improve the security of the PSSPF.

The experimental result on confusion and diffusion shows that the PSSPF meets the strict avalanche criterion. The average avalanche achieved is 50.21%, which is the second-highest among the standard digest functions. Moreover, the PSSPF outperforms the standard 512-bit algorithms on the avalanche property. The experimental result on near-collision shows that the average near-collision of PSSPF is 49.79%, which is the second-lowest among the standard algorithms. In the 512-bit category, the PSSPF performs exceedingly well on near-collision resistance. The experimental result of the near-collision effect on hash output shows that the average share of matched nibbles for PSSPF is 6.26%. This value is the second-lowest among the hash functions. These results show that the PSSPF performs remarkably well in its erratic behavior, and that the effect of the avalanche is uniform over the entire length of the PSSPF digests. Fig. 35 presents the distribution of ASCII values of the sample input data given to the PSSPF. The sample distribution of hexadecimal digits on the PSSPF output is presented in

Fig. 36. The result witnesses the selection, and distribution of hexadecimal digits are random. Therefore, performing a differential attack on the PSSPF would be unusually hard. The PSSPF focuses on security rather than performance. The runtime analysis shows that the response time of the algorithm linearly varies with the size of the input string. This happens because of the application of higher-order polynomial function and the deployment of subsets. However, the result of the runtime analysis shows reasonable response time for limited data. Considering the growth of hardware design as suggested by Schaller (1997), the delay is acceptable for the present and future computing systems. Therefore, the proposed system can be considered for security-centric cryptographic applications. The Elliptic curves and chaotic functions could be considered as the alternatives for the design of the hash function in the perspective of security (Icart and Coron, 2018; Goldberg et al., 2018; Guesmi et al., 2016.). The elliptic curve cryptography operates on the principle of public cryptography, and they are typically employed for applications like the generation of MAC and digital signatures. The experimental results prove the Chaotic hash function exhibits poor avalanche response on avalanche and nearcollision resistance at the binary level and also it operates very slow than the other digest functions. The application of chaotic hash function demands more clock cycles even for the limited data. Therefore, the PSSPF-512 could be considered as the more effective alternative in the perspective of security. The strength of the proposed system could be summarized as follows. 1. The PSSPF could be able to process any input message without length restrictions. The size of the digest naturally prevents brute-force and birthday attacks. 2. The application of polynomial function would prevent the cryptanalyst to perform block-level attacks. 3. 
The experimental results prove the PSSPF outperforms the other standard digest functions on confusion and diffusion, avalanche and near-collision properties. Therefore, performing a differential attack on the PSSPF would continue to stay an arduous task. 4. The partial breaking of families of SHA-2 and SHA-3 function and the breaking of MD5 and SHA-160 algorithms raise concerns about the use of bitwise operators in the design of hash function. But in the PSSPF the hash output is produced through the polynomial powers. Therefore, the proposed design natu-

Table 10. Runtime response of the digest functions (runtime comparison of keyless hash functions, in milliseconds). SHA-224 to SHA-512 belong to the SHA-2 family; SHA3-224 to SHA3-512 belong to the SHA-3 family.

| S.No | Data size in bytes | SHA-224 | SHA-256 | SHA-384 | SHA-512 | SHA3-224 | SHA3-256 | SHA3-384 | SHA3-512 | PSSPF-512 | CHF-512 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 64 B | 1.2 | 0 | 0 | 0.6 | 11.88 | 0.64 | 0 | 0.64 | 10.6 | 27.12 |
| 2 | 128 B | 1.8 | 0 | 0.64 | 0 | 11.84 | 0 | 0 | 0.64 | 13.6 | 49.16 |
| 3 | 256 B | 1.12 | 0.2 | 0.44 | 0.24 | 12.32 | 0.6 | 0.6 | 0 | 19.4 | 78.72 |
| 4 | 512 B | 1.08 | 0.2 | 0.32 | 0.16 | 12.08 | 0 | 0 | 0 | 27.72 | 102.4 |
| 5 | 1 K | 0.64 | 0 | 0.64 | 1.24 | 11.24 | 0.6 | 0 | 0.6 | 30.72 | 150.6 |
| 6 | 2 K | 1.04 | 0.56 | 0.6 | 0.36 | 11.76 | 0.56 | 0 | 0 | 44.32 | 237.12 |
| 7 | 3 K | 1.48 | 0.76 | 0.44 | 0.56 | 12.32 | 0.24 | 0.28 | 0.48 | 57.4 | 310.68 |
| 8 | 4 K | 1.32 | 0.8 | 0.48 | 0.48 | 12.2 | 0.44 | 0.44 | 0.68 | 70.92 | 404.04 |
| 9 | 5 K | 2.48 | 0.6 | 0.6 | 0 | 12.64 | 0.44 | 0.36 | 0.28 | 79.2 | 472.48 |
| 10 | 6 K | 1.6 | 0.64 | 0.6 | 0.44 | 11.64 | 0.44 | 0.36 | 0 | 92.08 | 513.72 |
| 11 | 7 K | 2.52 | 0.12 | 0.64 | 0.48 | 10.6 | 0 | 1.56 | 0.8 | 101.68 | 581.4 |
| 12 | 8 K | 1.88 | 0.8 | 0.48 | 0.48 | 12.52 | 0.32 | 0.48 | 0.64 | 112.88 | 675.36 |
| 13 | 9 K | 1.24 | 1.2 | 0.64 | 0.6 | 12.6 | 0.68 | 0.84 | 0.08 | 120.52 | 734.16 |
| 14 | 10 K | 1.84 | 0.8 | 0.52 | 0.92 | 9.48 | 0.68 | 0.72 | 0 | 118.6 | 810.36 |
| 15 | 16 K | 1.12 | 1.4 | 0.72 | 0.44 | 10 | 1.88 | 2.52 | 3.8 | 158.24 | 1326.84 |
| 16 | 20 K | 1.24 | 0.08 | 0.68 | 1.24 | 13.4 | 1.64 | 1.08 | 3.08 | 169.96 | 1628.32 |
| Average runtime (data <20 K) | | 1.475 | 0.51 | 0.5275 | 0.515 | 11.7825 | 0.5725 | 0.5775 | 0.7325 | 76.74 | 506.405 |
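The linear trend reported for the PSSPF can be checked directly against the PSSPF-512 column of Table 10. The following is a minimal Python sketch, not part of the original experiments: it fits an ordinary least-squares line to the tabulated runtimes and reports the coefficient of determination. It assumes that "K" in the table means 1024 bytes, which the table does not state, and the function name `least_squares_line` is hypothetical.

```python
# PSSPF-512 runtimes in milliseconds, copied from Table 10.
# Assumption: "K" means 1024 bytes; the table does not say so explicitly.
SIZES_BYTES = [64, 128, 256, 512] + [
    k * 1024 for k in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 16, 20)
]
PSSPF_MS = [10.6, 13.6, 19.4, 27.72, 30.72, 44.32, 57.4, 70.92,
            79.2, 92.08, 101.68, 112.88, 120.52, 118.6, 158.24, 169.96]


def least_squares_line(xs, ys):
    """Return (slope, intercept, r_squared) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = least_squares_line(SIZES_BYTES, PSSPF_MS)
average_ms = sum(PSSPF_MS) / len(PSSPF_MS)

print(f"slope     : {slope * 1024:.2f} ms per K of input")
print(f"intercept : {intercept:.2f} ms")
print(f"R^2       : {r_squared:.4f}")
print(f"average   : {average_ms:.2f} ms")
```

A value of R^2 close to 1 would support the linear reading of the runtime curve; the fit is only a descriptive check on sixteen data points, not a complexity proof.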


Fig. 33. Comparative analysis of the runtime response of the digest functions.

Fig. 34. Comparative analysis of the runtime response of the 512-bit digest functions.

Fig. 35. PSSPF input distribution.

Table 11. Clock cycle requirements for the digest functions.

| S. No | Name of the algorithm | Runtime in milliseconds | No. of clock cycles consumed |
|---|---|---|---|
| 1 | SHA-224 | 1.24 | 4960000 |
| 2 | SHA-256 | 0.08 | 320000 |
| 3 | SHA-384 | 0.68 | 2720000 |
| 4 | SHA-512 | 1.24 | 4960000 |
| 5 | SHA3-224 | 13.4 | 53600000 |
| 6 | SHA3-256 | 1.64 | 6560000 |
| 7 | SHA3-384 | 1.08 | 4320000 |
| 8 | SHA3-512 | 3.08 | 12320000 |
| 9 | PSSPF-512 | 169.96 | 679840000 |
| 10 | CHF-512 | 1628.32 | 6513280000 |
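The clock-cycle column of Table 11 is consistent with multiplying each runtime by a fixed clock rate, and the runtimes themselves coincide with the 20 K row of Table 10. The sketch below reproduces the conversion in Python. It assumes a 4 GHz clock, a value inferred here from the ratio of the two columns and not stated in this section; the function name `cycles_from_runtime` is hypothetical.

```python
# Assumption: a 4 GHz clock, inferred from the ratio cycles / runtime in Table 11.
ASSUMED_CLOCK_HZ = 4_000_000_000

# (algorithm, runtime in milliseconds, clock cycles) as listed in Table 11.
TABLE_11 = [
    ("SHA-224", 1.24, 4_960_000),
    ("SHA-256", 0.08, 320_000),
    ("SHA-384", 0.68, 2_720_000),
    ("SHA-512", 1.24, 4_960_000),
    ("SHA3-224", 13.4, 53_600_000),
    ("SHA3-256", 1.64, 6_560_000),
    ("SHA3-384", 1.08, 4_320_000),
    ("SHA3-512", 3.08, 12_320_000),
    ("PSSPF-512", 169.96, 679_840_000),
    ("CHF-512", 1628.32, 6_513_280_000),
]


def cycles_from_runtime(runtime_ms, clock_hz=ASSUMED_CLOCK_HZ):
    """Convert a runtime in milliseconds to a whole number of clock cycles."""
    return round(runtime_ms * clock_hz / 1000)


for name, runtime_ms, listed_cycles in TABLE_11:
    computed = cycles_from_runtime(runtime_ms)
    status = "matches" if computed == listed_cycles else "differs from"
    print(f"{name:10s} {runtime_ms:9.2f} ms -> {computed:>13,d} cycles ({status} Table 11)")
```

Because the conversion is a single multiplication, the cycle counts carry the same information as the runtimes; a different clock rate would rescale every row by the same factor and leave the ranking of the algorithms unchanged.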

rally demands more clock cycles than the standard algorithms. At the same time, decoding or breaking the PSSPF would demand a most grueling effort from the cryptanalyst.
5. Chaotic functions could be considered as an alternative for the design of a hash function from the perspective of security. However, the results show that the chaotic hash function performs poorly on the avalanche, confusion and diffusion, and near-collision properties. Its runtime response is also unsatisfactory. Therefore, the PSSPF could be considered a suitable candidate for cryptographic applications from the perspective of security.

6. Conclusion and future enhancements

The proposed design, PSSPF, addresses the structural design flaw of the conventional hash algorithms from the perspective of security. The

Fig. 36. Distribution of Hex Values in the PSSPF output.
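The randomness claim behind Fig. 36 can be examined with a digit histogram and a chi-square statistic against a uniform spread. The PSSPF implementation is not part of this text, so the Python sketch below uses SHA-512 from the standard `hashlib` module as a stand-in digest; the function names and the test messages are hypothetical and serve only to illustrate the measurement.

```python
import hashlib
from collections import Counter

HEX_DIGITS = "0123456789abcdef"


def hex_digit_histogram(messages, digest=hashlib.sha512):
    """Count how often each hexadecimal digit occurs across the digests of `messages`."""
    counts = Counter({d: 0 for d in HEX_DIGITS})
    for message in messages:
        # hexdigest() returns a lowercase hex string; update() counts its characters.
        counts.update(digest(message).hexdigest())
    return counts


def chi_square_uniform(counts):
    """Pearson chi-square statistic against a uniform spread over the 16 hex digits."""
    total = sum(counts.values())
    expected = total / len(HEX_DIGITS)
    return sum((observed - expected) ** 2 / expected for observed in counts.values())


# Hypothetical test inputs: 200 short, distinct byte strings.
messages = [f"message-{i}".encode() for i in range(200)]
histogram = hex_digit_histogram(messages)
statistic = chi_square_uniform(histogram)

print({digit: histogram[digit] for digit in HEX_DIGITS})
print(f"chi-square statistic (15 degrees of freedom): {statistic:.2f}")
```

A statistic far above the usual critical values for 15 degrees of freedom would indicate that some digits are favored; a small statistic is consistent with, but does not by itself establish, a uniform distribution.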

padding rule is modified to process an input string of arbitrary length. The proposed design uses a higher-order polynomial function and subsets to generate the hash output. The provably secure properties of the proposed design were inspected experimentally with more than 24 million hash searches. The result shows that the PSSPF is a provably secure hash function. The use of two polynomial variables and higher powers of the order of 128 produces an unpredictable sequence of stray bits in the hash output. The use of subsets further increases the erratic behavior. The statistical analysis of confusion and diffusion shows that the PSSPF meets the strict avalanche criterion. The results on non-correlation and near-collision show that the PSSPF outperforms the other standard algorithms in its class. The analysis of the avalanche property shows that breaking the PSSPF is extremely hard. The proposed system allows an adversary neither to perform differential analysis nor to set any mathematical conditions. This enables the system to stand strongly against differential and block-level attacks.

In addition, the experimental result shows that the avalanche response of CHF-512 is 49.78, the lowest among the digest functions, and that its near-collision response is 50.22, the highest among the digest functions. The chaotic hash function demands more clock cycles even for limited data. Therefore, the PSSPF-512 could be considered a more effective alternative than the standard digest functions and the CHF-512 from the perspective of security.

References

Menezes, Alfred J., van Oorschot, Paul C., Vanstone, Scott A., 1996. Handbook of Applied Cryptography. Massachusetts Institute of Technology, p. 51.


Al-Kuwari, Saif, Davenport, James H., Bradford, Russell J., 2010. Cryptographic hash functions: recent design trends and security notions. 133–150.
Applebaum, Benny et al., 2017. Low-complexity cryptographic hash functions. 8th Innovations in Theoretical Computer Science Conference (ITCS 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik.
Baptista, M., 1998. Cryptography with chaos. Phys. Lett. A 240 (1), 50–54.
Bartkewitz, Timo, 2009. Building Hash Functions from Block Ciphers, their Security and Implementation Properties. Ruhr-University Bochum.
Bertoni, Guido et al., 2013. Keccak. Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, Berlin, Heidelberg.
Bitansky, Nir, Kalai, Yael Tauman, Paneth, Omer, 2018. Multi-collision resistance: a paradigm for keyless hash functions. Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing.
Brown, Daniel Richard L., Vanstone, Scott Alexander, 2019. Elliptic curve random number generation. U.S. Patent Application No. 10/243,734.
Buchmann, Johannes A., 2004. Cryptographic hash functions. In: Introduction to Cryptography. Springer, New York, NY, pp. 235–248.
Burns, Jonathan et al., 2017. EC-OPRF: oblivious pseudorandom functions using elliptic curves. IACR Cryptol., 111.
Damgard, Ivan Bjerre, 1989. A design principle for hash functions. Conference on the Theory and Application of Cryptology. Springer, New York, NY.
Davies, R.W., Price, W.L., 1984. Digital signature - an update. In: Proc. International Conference on Computer Communications, Sydney. Elsevier, North-Holland, pp. 843–847. iss5.
Dinur, Itai, Dunkelman, Orr, Shamir, Adi, 2013. Collision attacks on up to 5 rounds of SHA-3 using generalized internal differentials. International Workshop on Fast Software Encryption. Springer, Berlin, Heidelberg.
Dworkin, Morris J., 2015. SHA-3 standard: Permutation-based hash and extendable-output functions. No. Federal Inf. Process. Stds. (NIST FIPS)-202.
Eastlake, D. 3rd, Jones, Paul, 2001. US secure hash algorithm 1 (SHA1). No. RFC 3174.
Feistel, Horst, 1973. Cryptography and computer privacy. Sci. Am. 228 (5), 15–23.
Gilpin, William, 2018. Cryptographic hashing using chaotic hydrodynamics. Proc. Natl. Acad. Sci. 115 (19), 4869–4874.
Goldberg, Sharon et al., 2018. Verifiable random functions (VRFs).
Guesmi, Ramzi et al., 2016. A novel chaos-based image encryption using DNA sequence operation and Secure Hash Algorithm SHA-2. Nonlinear Dyn. 83 (3), 1123–1136.
Icart, Thomas, Coron, Jean-Sebastien, 2018. Cryptography on an elliptical curve. U.S. Patent Application No. 10/027,483.
Kam, John B., Davida, George I., 1979. Structured design of substitution-permutation encryption networks. IEEE Trans. Comput. 10, 747–753.
Kanso, A., Ghebleh, M., 2015. A structure-based chaotic hashing scheme. Nonlinear Dyn. 81 (1-2), 27–40.
Lai, Xuejia, Massey, James L., 1992. Hash functions based on block ciphers. Workshop on the Theory and Application of Cryptographic Techniques. Springer, Berlin, Heidelberg.
Lucks, Stefan, 2004. Design principles for iterated hash functions. IACR Cryptol. 2004, 253.
Matyas, S.M., Meyer, C.H., Oseas, J., 1985. Generating strong one-way functions with cryptographic algorithm. IBM Techn. Disclosure Bull. 27 (10A), 5658–5659.
McNamee, John Michael, 1993. A bibliography on roots of polynomials. J. Comput. Appl. Math. 47 (3), 391–394.
Merkle, Ralph C., 1989. One way hash functions and DES. Conference on the Theory and Application of Cryptology. Springer, New York, NY.
NIST, 2012. Selects winner of secure hash algorithm (SHA-3) competition, http://www.nist.gov/itl/csd/sha-100212.cfm.
NIST2, 2012. SHA-3 the new standard for cryptographic function, https://csrc.nist.gov/Projects/Hash-Functions/SHA-3-Project/SHA-3-Standardization.
Pan, Victor Y., 1997. Solving a polynomial equation: some history and recent progress. SIAM Rev. 39 (2), 187–220.
Preneel, Bart, Govaerts, René, Vandewalle, Joos, 1993. Cryptographic hash functions: an overview. Proceedings of the 6th International Computer Security and Virus Conference (ICSVC 1993).
Rivest, Ronald L., Schuldt, Jacob C.N. Spritz - a spongy RC4-like stream cipher and hash function. IACR Cryptol., 856.
Rivest, Ronald, 19921. The MD4 message-digest algorithm. No. RFC 1320.
Rivest, Ronald, 19922. The MD5 message-digest algorithm. No. RFC 1321.
Schaller, Robert R., 1997. Moore’s law: past, present and future. IEEE Spectr. 34 (6), 52–59.
Teh, Je Sen, Tan, Kaijun, Alawida, Moatsum, 2019. A chaos-based keyed hash function based on fixed point representation. Cluster Comput. 22 (2), 649–660.
Thomsen, Søren Steffen, Knudsen, Lars Ramkilde, 2005. Cryptographic hash functions. PhD thesis. Technical University of Denmark.
Trybulec, Wojciech A., 1990. Pigeon hole principle. J. Formalized Mathematics 2 (199).
Wang, Xiaoyun, Feng, Dengguo, Lai, Xuejia, Yu, Hongbo, 2004. Collisions for hash functions MD4, MD5, HAVAL-128 and RIPEMD. IACR Cryptol. 2004, 199.
Wang, Xiaoyun, Yin, Yiqun Lisa, Yu, Hongbo, 2005. Finding collisions in the full SHA1. Annual International Cryptology Conference. Springer, Berlin, Heidelberg.
Webster, A.F., Tavares, Stafford E., 1985. On the design of S-boxes. Conference on the Theory and Application of Cryptographic Techniques. Springer, Berlin, Heidelberg.
Xie, Tao, Feng, Dengguo, 2010. Construct MD5 collisions using just a single block of message. IACR Cryptol. 2010, 643.
Yang, Yijun et al., 2019. A secure hash function based on feedback iterative structure. Enterprise Inf. Syst., 1–22.
Yu, Hongbo, Wang, Xiaoyun, 2009. Near-collision attack on the compression function of dynamic SHA2. IACR Cryptol. 2009, 179.
