Whitepaper: Pseudo-Random Number Generators
ABSTRACT
Random numbers are used for many cryptographic purposes, yet people often do not pay enough attention to generating them securely. Random numbers that can be guessed can break the whole cryptosystem, so their quality is a critical factor. The paper first discusses pseudorandom numbers and generators and where they are used; this section also covers algorithms and seeding. Cryptographically secure pseudorandom number generators and attacks against them are then presented in general terms. The analysis part introduces random number generator tests such as Diehard, NIST's statistical test suite, ENT and Michal Zalewski's attractor analysis tool. The data that came with the attractor analysis tool was also tested with ENT, and the results are presented at the end of this paper.
Introduction
 Random numbers are used for many cryptographic purposes, such as generating session keys, initialization vectors and public keys. The security of cryptographic systems depends on the generation of unpredictable numbers. Usually people do not pay enough attention to creating secure random numbers. Random numbers that can be guessed become the system's weakest point and can break the whole cryptosystem. Algorithms and protocols cannot compensate for bad random numbers.
 A pseudorandom number generator (PRNG) is a mechanism that produces random-looking numbers. The numbers are called pseudorandom because truly random numbers come from nondeterministic sources such as radioactive decay and atmospheric noise. Computers are designed to execute strictly defined sets of commands in a repeatable and accurate way. Thus, any fixed algorithm will produce exactly the same results on another computer, which can be used to predict output values, assuming the attacker can reconstruct the internal state of the remote system. Even if the target pseudorandom number generator function is not known, sooner or later the algorithm will start generating the exact same sequence over again, because a given algorithm has only a limited number of possible internal states. PRNGs are typically constructed from other cryptographic primitives, such as block ciphers, hash functions and stream ciphers.
 A good pseudorandom number generator produces a sequence of numbers that cannot be easily guessed or determined by an adversary. A cryptographically secure generator should produce numbers that cannot be predicted even though the attacker knows which algorithm the generator uses. A good generator should also have good statistical properties, a long period and an efficient implementation, and it should be reproducible and portable.
 A PRNG takes a seed, or a key, as input and then produces a stream of bits or numbers determined completely by the seed. This is why seeding is very important when a PRNG is used for cryptographic purposes. Internally, a PRNG needs a mechanism for processing unpredictable samples, a mechanism for using those samples to update its internal state or its seed, and a mechanism for using some part of its internal state or its seed to generate pseudorandom outputs. In some PRNG designs the same mechanism handles all three tasks; in others the mechanisms are clearly separated.
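As a minimal illustration of seed-determined output, the sketch below seeds Python's built-in (non-cryptographic) generator twice with the same, arbitrarily chosen value:

```python
import os
import random

# Two generators seeded with the same value produce identical
# streams: a PRNG's output is a deterministic function of its seed.
# (Python's Mersenne Twister is used for illustration; it is NOT
# cryptographically secure.)
a = random.Random(12345)
b = random.Random(12345)
stream_a = [a.randrange(256) for _ in range(8)]
stream_b = [b.randrange(256) for _ in range(8)]
assert stream_a == stream_b

# For cryptographic use the seed must come from an unpredictable
# source, such as the operating system's entropy pool:
seed = os.urandom(16)  # 128 bits of OS-provided entropy
```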
Algorithms
 Many algorithms exist for generating pseudorandom numbers. The linear congruential, linear feedback shift register and related generators are not good enough for cryptographic purposes, but with care they can be used for simulation. Some related generators, called lagged Fibonacci generators, have extremely long periods but are nevertheless insecure. By contrast, the Blum-Blum-Shub generator is believed to be secure: breaking it is provably as hard as deciding quadratic residuosity modulo a composite number whose factorization is not known, which is believed to be a hard problem.
 Linear congruential generators achieve the maximal period if the increment, multiplier and modulus are properly chosen. They are also fast, but unfortunately totally predictable.
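A toy sketch of a linear congruential generator shows why every output reveals the full internal state. The multiplier and increment below are the familiar Numerical Recipes constants, used purely for illustration; no choice of LCG parameters makes the scheme cryptographically secure.

```python
# Minimal linear congruential generator (illustrative only).
class LCG:
    def __init__(self, seed: int):
        self.m = 2**32           # modulus
        self.a = 1664525         # multiplier
        self.c = 1013904223      # increment
        self.state = seed % self.m

    def next(self) -> int:
        self.state = (self.a * self.state + self.c) % self.m
        return self.state        # the output IS the whole state

gen = LCG(1)
outputs = [gen.next() for _ in range(3)]

# Anyone who sees one output and knows (a, c, m) can predict
# every subsequent output:
predicted = (1664525 * outputs[0] + 1013904223) % 2**32
assert predicted == outputs[1]
```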
 Linear feedback shift registers are easily implemented in digital hardware and are popular for that reason, but their output bits are a linear function of the state, which makes them useless for encryption.
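A sketch of a 16-bit Fibonacci LFSR illustrates the long period, which nevertheless gives no cryptographic strength because the output is linear in the state. The tap positions below are a standard maximal-length choice and the starting state is arbitrary.

```python
def lfsr16_step(state: int) -> int:
    """One step of a 16-bit Fibonacci LFSR with taps at bits
    16, 14, 13 and 11 (a maximal-length feedback polynomial)."""
    bit = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)

# A maximal-length 16-bit LFSR cycles through every nonzero
# state exactly once, so its period is 2**16 - 1 = 65535 --
# long, but still trivially breakable, since observing about
# twice the register length in output bits lets an attacker
# solve the linear system for the state.
state, period = 0xACE1, 0
while True:
    state = lfsr16_step(state)
    period += 1
    if state == 0xACE1:
        break
assert period == 2**16 - 1
```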
Seeding
 The information collected for the seed does not need to be truly random, but it must be unguessable and unpredictable. Some software-based PRNGs use hardware RNGs to generate their seeds. Typical seeds such as the current time, the number of clock ticks since reboot, the process ID or user input produce only a limited number of random bits and are therefore insecure.
 Some protocols can "leak" details of the PRNG state. For example, linear congruential generators essentially give away their entire internal state with every output. Given the deterministic nature of a PRNG, this can compromise security unless reseeding is carried out appropriately. Even cryptographically secure algorithms like Blum-Blum-Shub fail as pseudorandom number generators if the seed is known.
Cryptographic PRNGs
 A good PRNG is designed in such a way that, given enough entropy, it keeps producing numbers that are hard to guess even if an attacker knows the full details of the algorithm.
 One cryptographic PRNG is Blum-Blum-Shub. Its security is based on a simple mathematical problem: factoring a composite number N=pq, where p and q are large unknown primes. Other cryptographic PRNGs tend to base their security on block ciphers or hash functions. They are much faster, but it is harder to analyze their security accurately.
 Another cryptographic PRNG is Tiny, a simple evolution of the Yarrow PRNG. It works by encrypting a counter using AES to produce successive blocks of output. It is believed to be secure against all known cryptographic attacks [1].
Yarrow and Tiny
Yarrow [2] relies on the one-way hash function SHA-1 and the block cipher three-key triple-DES. There are four major components: an entropy accumulator, which collects samples from entropy sources into two SHA-1 pools; a reseed mechanism, which periodically reseeds the 160-bit key with new entropy from the pools; a generation mechanism, which generates PRNG outputs from the key using triple-DES; and a reseed control, which determines when a reseed is to be performed.
 Generation mechanism: maintain an n-bit counter value C. To generate the next n-bit output block R, increment C and encrypt it with the block cipher E_K() under the key K:

C <- (C + 1) mod 2^n
R <- E_K(C)
 Tiny [1] is a simple evolution of the Yarrow PRNG and also provides facilities for dealing with entropy. Tiny works by encrypting a counter using AES to produce successive blocks of output. The key and the initial state are derived from the seed data. Tiny uses a few blocks of output, left over from the user's request, to reseed itself.
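The counter-mode idea above can be sketched as follows. Note two loud assumptions: SHA-256 of key-plus-counter merely stands in for the AES encryption E_K(C) that a real design like Tiny uses, and the key-derivation step is a simplification, not Tiny's actual schedule.

```python
import hashlib

class CounterPRNG:
    """Sketch of counter-mode generation: increment a counter C
    and encrypt it under a key K, outputting R = E_K(C). Here
    SHA-256(K || C) stands in for the block cipher; a real design
    such as Tiny uses AES, and derives its key differently."""

    def __init__(self, seed: bytes):
        material = hashlib.sha256(seed).digest()
        self.key = material[:16]                      # K from the seed data
        self.counter = int.from_bytes(material[16:], "big")

    def next_block(self) -> bytes:
        self.counter = (self.counter + 1) % 2**128    # C <- (C+1) mod 2^n
        c = self.counter.to_bytes(16, "big")
        return hashlib.sha256(self.key + c).digest()  # R <- E_K(C)

prng = CounterPRNG(b"seed material from an entropy pool")
out1, out2 = prng.next_block(), prng.next_block()
assert out1 != out2 and len(out1) == 32
```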
BlumBlumShub
 The Blum-Blum-Shub generator is provably secure under the assumption that factoring large numbers into primes is hard.
 Choose two large secret primes p and q, both congruent to 3 mod 4, and compute the product n=pq. For a given seed s_0, compute the sequence s_0, s_1, ... by the recursive formula s_{i+1} = s_i^2 mod n. From this sequence of numbers in the range 0, 1, ..., n-1, create a sequence of pseudorandom bits b_i = s_i mod 2. [3]
 Picking primes of similar size to RSA keys means that you will almost certainly be doing modular arithmetic with a software library, because the internal state will be too big for a primitive type. This makes the Blum-Blum-Shub algorithm impractical for many uses, as it is excruciatingly slow in practice. [1]
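A toy sketch of the recurrence follows. The primes below are far too small for any real use, which is exactly the point made above: realistic, RSA-sized primes force big-integer arithmetic and make the generator slow.

```python
# Toy Blum-Blum-Shub. Both primes are congruent to 3 mod 4, as
# the construction requires; they only illustrate the recurrence
# and give NO security at this size.
p, q = 11003, 11027
n = p * q

def bbs_bits(seed: int, count: int) -> list:
    assert p % 4 == 3 and q % 4 == 3
    s = seed % n
    bits = []
    for _ in range(count):
        s = (s * s) % n       # s_{i+1} = s_i^2 mod n
        bits.append(s & 1)    # b_i = s_i mod 2
    return bits

bits = bbs_bits(seed=1234567, count=16)
assert len(bits) == 16 and all(b in (0, 1) for b in bits)
```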
Attacks against PRNGs
 At a higher level, there are two kinds of attacks against a PRNG that has a high-quality seed. The first is the cryptanalytic attack and the second is the attack against the internal state. Attackers prefer to attack the PRNG rather than mount computationally intense attacks such as computing discrete logarithms or factoring large numbers.
 A direct cryptanalytic attack is any attack that can distinguish the generator's output from truly random data. This kind of attack is applicable to most uses of PRNGs. If a PRNG passes statistical tests, it is likely that it will not be vulnerable to simple cryptanalytic attacks.
 An input-based attack occurs when an attacker is able to control, or use knowledge of, the PRNG's inputs to cryptanalyze the PRNG. This attack type can further be divided into known-input, replayed-input and chosen-input attacks. Using many sources of entropy in the seed values can mitigate this attack.
 A state compromise extension attack succeeds when the attacker is able to recover unknown PRNG outputs from before the internal state was compromised, or to recover outputs from after the PRNG collected a sequence of inputs the attacker cannot guess. A backtracking attack uses the compromise of the generator's internal state at some point to learn previous PRNG outputs. A permanent compromise attack occurs if, once an attacker compromises the internal state at some point, all future and past internal state values are vulnerable. An iterative guessing attack uses knowledge of the internal state at some point, together with the intervening PRNG outputs, to learn the internal state at a later point when the inputs collected during that span of time are guessable by the attacker. A meet-in-the-middle attack is essentially a combination of an iterative guessing attack and a backtracking attack. These are the reasons why the internal state must be well guarded.
 If an attacker knows the seeds the generator uses to create random outputs, and obtains all outputs, he can test different generators to see which one produces the same output. An attacker can also directly guess the starting point, use a precomputed list of likely starting points, or initialize the PRNG with closely related inputs.
The different cryptanalytic attack types are described in detail in [4].
 One way to predict upcoming random numbers might be Michal Zalewski's spoofing set construction based on phase space analysis. It was not tested here, but it appeared to work.
Methodology
Collecting (sampling)
Analysing
 The quality of random numbers and random number generators can be measured in a variety of ways. One common method is to compute the information density, or entropy, of a series of numbers. The higher the entropy of a series, the more difficult it is to predict a given number from the preceding numbers in the series. A sequence of good random numbers will have a high level of entropy, although a high level of entropy does not guarantee randomness.
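A simple byte-level entropy estimate of this kind can be sketched as follows (Shannon entropy in bits per byte; 8.0 is the maximum):

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Information density in bits per byte. 8.0 is the maximum
    for byte-oriented data; lower values mean the stream is more
    predictable."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Repetitive data has zero entropy...
assert shannon_entropy(b"A" * 1000) == 0.0
# ...while data using all 256 byte values equally reaches 8 bits/byte.
assert abs(shannon_entropy(bytes(range(256)) * 4) - 8.0) < 1e-9
```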
 There are many statistical test suites for checking PRNGs. The tests will not always give similar results, but they give a direction; comparing their results may hint at whether the PRNG in use is good or not. Here are a few examples of statistical tests:
Diehard
DIEHARD [5] is a battery of tests for randomness developed by George Marsaglia. The tests are:
Birthday spacing test
Choose m birthdays in a year of n days. List the spacings between the birthdays. If j is the number of values that occur more than once in that list, then j is asymptotically Poisson distributed with mean m^3/(4n). Experience shows n must be quite large, say n >= 2^18, for comparing the results to the Poisson distribution with that mean. This test uses n = 2^24 and m = 2^9, so that the underlying distribution for j is taken to be Poisson with lambda = 2^27/2^26 = 2. A sample of 500 j's is taken, and a chi-square goodness-of-fit test provides a p-value. The first test uses bits 1-24 (counting from the left) from integers in the specified file. Then the file is closed and reopened. Next, bits 2-25 are used to provide birthdays, then 3-26 and so on to bits 9-32. Each set of bits provides a p-value, and the nine p-values provide a sample for a KSTEST.
Overlapping 5permutation test
 This is the OPERM5 test. It looks at a sequence of one million 32-bit random integers. Each set of five consecutive integers can be in one of 120 states, for the 5! possible orderings of five numbers. Thus the 5th, 6th, 7th, ... numbers each provide a state. As many thousands of state transitions are observed, cumulative counts are made of the number of occurrences of each state. Then the quadratic form in the weak inverse of the 120x120 covariance matrix yields a test equivalent to the likelihood ratio test that the 120 cell counts came from the specified (asymptotically) normal distribution with the specified 120x120 covariance matrix (with rank 99). This version uses 1,000,000 integers, twice.
Binary rank test for 31x31 matrices
The leftmost 31 bits of 31 random integers from the test sequence are used to form a 31x31 binary matrix over the field {0,1}. The rank is determined. That rank can be from 0 to 31, but ranks < 28 are rare, and their counts are pooled with those for rank 28. Ranks are found for 40,000 such random matrices and a chi-square test is performed on counts for ranks 31, 30, 29 and <= 28.
Binary rank test for 32x32 matrices
A random 32x32 binary matrix is formed, each row a 32-bit random integer. The rank is determined. That rank can be from 0 to 32; ranks less than 29 are rare, and their counts are pooled with those for rank 29. Ranks are found for 40,000 such random matrices and a chi-square test is performed on counts for ranks 32, 31, 30 and <= 29.
Binary rank test for 6x8 matrices
From each of six random 32-bit integers from the generator under test, a specified byte is chosen, and the resulting six bytes form a 6x8 binary matrix whose rank is determined. That rank can be from 0 to 6, but ranks 0, 1, 2, 3 are rare; their counts are pooled with those for rank 4. Ranks are found for 100,000 random matrices, and a chi-square test is performed on counts for ranks 6, 5 and <= 4.
Bitstream test
The file under test is viewed as a stream of bits. Call them b1, b2, ... . Consider an alphabet with two "letters", 0 and 1, and think of the stream of bits as a succession of overlapping 20-letter "words". Thus the first word is b1b2...b20, the second is b2b3...b21, and so on. The bitstream test counts the number of missing 20-letter (20-bit) words in a string of 2^21 overlapping 20-letter words. There are 2^20 possible 20-letter words. For a truly random string of 2^21 + 19 bits, the number of missing words j should be (very close to) normally distributed with mean 141,909 and sigma 428. Thus (j - 141909)/428 should be a standard normal variate (z score) that leads to a uniform [0,1) p-value. The test is repeated twenty times.
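At toy scale, the missing-words statistic behind this test can be sketched as follows (the real test uses 20-bit words over 2^21 samples):

```python
def missing_kbit_words(bits: str, k: int) -> int:
    """Number of k-bit overlapping words that never occur in the
    bit string: the statistic behind the bitstream test, here at
    toy scale."""
    seen = set()
    for i in range(len(bits) - k + 1):
        seen.add(bits[i:i + k])
    return 2**k - len(seen)

# The string below contains all eight 3-bit words, so none are
# missing; a short or highly regular string misses many.
assert missing_kbit_words("0001011100", 3) == 0
assert missing_kbit_words("0000", 2) == 3   # only "00" occurs
```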
Overlapping pairs sparse occupancy test
The OPSO test considers 2-letter words from an alphabet of 1024 letters. Each letter is determined by a specified ten bits from a 32-bit integer in the sequence to be tested. OPSO generates 2^21 (overlapping) 2-letter words (from 2^21 + 1 "keystrokes") and counts the number of missing words, that is, 2-letter words which do not appear in the entire sequence. That count should be very close to normally distributed with mean 141,909 and sigma 290. Thus (missingwords - 141909)/290 should be a standard normal variable. The OPSO test takes 32 bits at a time from the test file and uses a designated set of ten consecutive bits. It then restarts the file for the next designated 10 bits, and so on.
Overlapping quadruples sparse occupancy test
The OQSO test is similar, except that it considers 4-letter words from an alphabet of 32 letters, each letter determined by a designated string of 5 consecutive bits from the test file, elements of which are assumed to be 32-bit random integers. The mean number of missing words in a sequence of 2^21 four-letter words (2^21 + 3 "keystrokes") is again 141909, with sigma = 295. The mean is based on theory; sigma comes from extensive simulation.
DNA test
The DNA test considers an alphabet of 4 letters C, G, A, T, determined by two designated bits in the sequence of random integers being tested. It considers 10-letter words, so that as in OPSO and OQSO there are 2^20 possible words, and the mean number of missing words from a string of 2^21 (overlapping) 10-letter words (2^21 + 9 "keystrokes") is 141909. The standard deviation sigma = 339 was determined as for OQSO by simulation. (Sigma for OPSO, 290, is the true value (to three places), not determined by simulation.)
Count the 1's test on a stream of bytes
Consider the file under test as a stream of bytes (four per 32-bit integer). Each byte can contain from 0 to 8 1's, with probabilities 1, 8, 28, 56, 70, 56, 28, 8, 1 over 256. Now let the stream of bytes provide a string of overlapping 5-letter words, each "letter" taking values A, B, C, D, E. The letters are determined by the number of 1's in a byte: 0, 1 or 2 yield A, 3 yields B, 4 yields C, 5 yields D, and 6, 7 or 8 yield E. Thus we have a monkey at a typewriter hitting five keys with various probabilities (37, 56, 70, 56, 37 over 256). There are 5^5 possible 5-letter words, and from a string of 256,000 (overlapping) 5-letter words, counts are made of the frequencies of each word. The quadratic form in the weak inverse of the covariance matrix of the cell counts provides a chi-square test: Q5 - Q4, the difference of the naive Pearson sums of (OBS - EXP)^2/EXP on counts for 5- and 4-letter cell counts.
Count the 1's test for specific bytes
Consider the file under test as a stream of 32-bit integers. From each integer, a specific byte is chosen, say the leftmost: bits 1 to 8. Each byte can contain from 0 to 8 1's, with probabilities 1, 8, 28, 56, 70, 56, 28, 8, 1 over 256. Now let the specified bytes from successive integers provide a string of (overlapping) 5-letter words, each "letter" taking values A, B, C, D, E. The letters are determined by the number of 1's in that byte: 0, 1 or 2 -> A, 3 -> B, 4 -> C, 5 -> D, and 6, 7 or 8 -> E. Thus we have a monkey at a typewriter hitting five keys with various probabilities: 37, 56, 70, 56, 37 over 256. There are 5^5 possible 5-letter words, and from a string of 256,000 (overlapping) 5-letter words, counts are made of the frequencies of each word. The quadratic form in the weak inverse of the covariance matrix of the cell counts provides a chi-square test: Q5 - Q4, the difference of the naive Pearson sums of (OBS - EXP)^2/EXP on counts for 5- and 4-letter cell counts.
Parking lot test
 In a square of side 100, randomly "park" a car: a circle of radius 1. Then try to park a 2nd, a 3rd, and so on, each time parking "by ear". That is, if an attempt to park a car causes a crash with one already parked, try again at a new random location. (To avoid path problems, consider parking helicopters rather than cars.) Each attempt leads either to a crash or to a success, the latter followed by an increment to the list of cars already parked. If we plot n, the number of attempts, versus k, the number successfully parked, we get a curve that should be similar to those provided by a perfect random number generator. Theory for the behavior of such a random curve seems beyond reach, and as graphics displays are not available for this battery of tests, a simple characterization of the random experiment is used: k, the number of cars successfully parked after n = 12,000 attempts. Simulation shows that k should average 3523 with sigma 21.9 and is very close to normally distributed. Thus (k - 3523)/21.9 should be a standard normal variable, which, converted to a uniform variable, provides input to a KSTEST based on a sample of 10.
Minimum distance test
It does this 100 times: choose n = 8000 random points in a square of side 10000. Find d, the minimum distance between the (n^2 - n)/2 pairs of points. If the points are truly independent uniform, then d^2, the square of the minimum distance, should be (very close to) exponentially distributed with mean 0.995. Thus 1 - exp(-d^2/0.995) should be uniform on [0,1) and a KSTEST on the resulting 100 values serves as a test of uniformity for random points in the square. Test numbers = 0 mod 5 are printed, but the KSTEST is based on the full set of 100 random choices of 8000 points in the 10000x10000 square.
3Dspheres test
 Choose 4000 random points in a cube of edge 1000. At each point, center a sphere large enough to reach the next closest point. Then the volume of the smallest such sphere is (very close to) exponentially distributed with mean 120pi/3. Thus the radius cubed is exponential with mean 30. (The mean is obtained by extensive simulation.) The 3DSPHERES test generates 4000 such spheres 20 times. Each minimum radius cubed leads to a uniform variable by means of 1 - exp(-r^3/30), then a KSTEST is done on the 20 p-values.
Squeeze test
Random integers are floated to get uniforms on [0,1). Starting with k = 2^31 - 1 = 2147483647, the test finds j, the number of iterations necessary to reduce k to 1, using the reduction k = ceiling(k*U), with U provided by floating integers from the file being tested. Such j's are found 100,000 times, then counts of the number of times j was <= 6, 7, ..., 47, >= 48 are used to provide a chi-square test for cell frequencies.
Overlapping sums test
 Integers are floated to get a sequence U(1), U(2), ... of uniform [0,1) variables. Then overlapping sums S(1) = U(1)+...+U(100), S(2) = U(2)+...+U(101), ... are formed. The S's are virtually normal with a certain covariance matrix. A linear transformation of the S's converts them to a sequence of independent standard normals, which are converted to uniform variables for a KSTEST. The p-values from ten KSTESTs are given still another KSTEST.
Runs test
 It counts runs up and runs down in a sequence of uniform [0,1) variables, obtained by floating the 32-bit integers in the specified file. This example shows how runs are counted: .123, .357, .789, .425, .224, .416, .95 contains an up-run of length 3, a down-run of length 2 and an up-run of (at least) 2, depending on the next values. The covariance matrices for the runs-up and runs-down are well known, leading to chi-square tests for quadratic forms in the weak inverses of the covariance matrices. Runs are counted for sequences of length 10,000. This is done ten times. Then repeated.
Craps test
It plays 200,000 games of craps and finds the number of wins and the number of throws necessary to end each game. The number of wins should be (very close to) normal with mean 200000p and variance 200000p(1-p), with p = 244/495. Throws necessary to complete a game can vary from 1 to infinity, but counts for all > 21 are lumped with 21. A chi-square test is made on the number-of-throws cell counts. Each 32-bit integer from the test file provides the value for the throw of a die, by floating to [0,1), multiplying by 6 and taking 1 plus the integer part of the result.
NOTE: Most of the tests in DIEHARD return a p-value, which should be uniform on [0,1) if the input file contains truly independent random bits. Those p-values are obtained by p = F(X), where F is the assumed distribution of the sample random variable X, often normal. But that assumed F is just an asymptotic approximation, for which the fit will be worst in the tails. Thus you should not be surprised by occasional p-values near 0 or 1, such as .0012 or .9983. When a bit stream really FAILS BIG, you will get p's of 0 or 1 to six or more places. By all means, do not, as a statistician might, think that p < .025 or p > .975 means that the RNG has "failed the test at the .05 level". Such p's happen among the hundreds that DIEHARD produces, even with good RNGs. So keep in mind that "p happens".
 The generator to be tested must produce a binary file of the 10 or 11 megabytes that Diehard expects.
ENT
ENT [6] applies various tests to sequences of bytes stored in files and reports the results. ENT does not test the generator itself, only a sequence of bytes. ENT tests the following:
Entropy
 The information density of the contents of the file, expressed as a number of bits per character.
Chisquare test
 The chi-square test is the most commonly used test for the randomness of data, and it is extremely sensitive to errors in pseudorandom sequence generators. The chi-square distribution is calculated for the stream of bytes in the file and expressed as an absolute number and as a percentage indicating how frequently a truly random sequence would exceed the calculated value. The percentage is interpreted as the degree to which the sequence tested is suspected of being non-random. If the percentage is greater than 99% or less than 1%, the sequence is almost certainly not random. If the percentage is between 95% and 99% or between 1% and 5%, the sequence is suspect. Percentages between 90% and 95% or between 5% and 10% indicate the sequence is "almost suspect".
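The raw chi-square statistic for a byte stream can be computed as in the sketch below. Converting it to the percentage ENT reports additionally requires the chi-square distribution function, which is omitted here.

```python
def chi_square_bytes(data: bytes) -> float:
    """Chi-square statistic of a byte stream against the uniform
    distribution over 256 values (255 degrees of freedom)."""
    expected = len(data) / 256
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    return sum((c - expected) ** 2 / expected for c in counts)

# A perfectly flat histogram gives a statistic of exactly 0 --
# itself suspicious, since truly random counts should fluctuate.
assert chi_square_bytes(bytes(range(256)) * 10) == 0.0
```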
Arithmetic mean value
 This is simply the result of summing all the bytes in the file and dividing by the file length. If the data are close to random, this should be about 127.5. If the mean departs from this value, the values are consistently high or low.
Monte Carlo value for Pi
 Each successive sequence of six bytes is used as 24-bit X and Y coordinates within a square. If the distance of the randomly generated point is less than the radius of a circle inscribed within the square, the six-byte sequence is considered a "hit". The percentage of hits can be used to calculate the value of Pi. For very large streams (this approximation converges very slowly), the value will approach the correct value of Pi (3.1415926535...) if the sequence is close to random.
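A sketch of this estimator follows, assuming 24-bit coordinates scaled into the unit square; a quarter circle is used here, which is equivalent in expectation to ENT's inscribed circle.

```python
import math
import os

def monte_carlo_pi(data: bytes) -> float:
    """ENT-style pi estimate: successive 6-byte groups give 24-bit
    X and Y coordinates in the unit square; the fraction of points
    inside the quarter circle estimates pi/4."""
    side = float(1 << 24)
    hits = tries = 0
    for i in range(0, len(data) - 5, 6):
        x = int.from_bytes(data[i:i + 3], "big") / side
        y = int.from_bytes(data[i + 3:i + 6], "big") / side
        tries += 1
        if x * x + y * y <= 1.0:
            hits += 1
    return 4.0 * hits / tries

# With 100,000 points from the OS RNG the estimate lands near pi,
# though, as noted above, convergence is slow.
estimate = monte_carlo_pi(os.urandom(600_000))
assert abs(estimate - math.pi) < 0.05
```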
Serial correlation coefficient
 This quantity measures the extent to which each byte in the file depends on the previous byte. For random sequences, this value (which can be positive or negative) will be close to zero. A non-random byte stream can yield a serial correlation coefficient on the order of 0.5. Wildly predictable data, such as uncompressed bitmaps, will exhibit serial correlation coefficients approaching 1.
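The coefficient can be computed with the usual wrap-around formula, as in this sketch (the last byte is treated as preceding the first, matching ENT's behavior):

```python
import os

def serial_correlation(data: bytes) -> float:
    """Correlation between each byte and its successor, with the
    last byte wrapping around to the first. Values near 0 are
    expected for random data; values near 1 indicate highly
    predictable data. Undefined for constant input."""
    n = len(data)
    x = list(data)
    y = x[1:] + x[:1]                     # successor, with wraparound
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    return (n * sxy - sx * sx) / (n * sxx - sx * sx)

# A rising sawtooth is wildly predictable...
assert serial_correlation(bytes(range(256)) * 20) > 0.9
# ...while OS randomness should sit near zero.
assert abs(serial_correlation(os.urandom(100_000))) < 0.05
```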
NIST's statistical test suite
NIST's statistical test suite [7] contains 16 statistical tests for random number generators. To run all the tests, the user needs to know how the PRNG works: the tests require, for example, the sequence length in bits and the block length. This test suite is therefore not suitable for black-box testing of an unknown generator. The suite includes the following tests:
Frequency (monobits) test
 The focus of the test is the proportion of zeroes and ones in the entire sequence. The purpose of this test is to determine whether the number of ones and zeros in a sequence is approximately the same as would be expected for a truly random sequence. The test assesses the closeness of the fraction of ones to 1/2; that is, the number of ones and zeroes in a sequence should be about the same.
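A sketch of the monobit test, checked against the worked example in the NIST documentation:

```python
import math

def monobit_test(bits: str) -> float:
    """NIST frequency (monobit) test: map 0 -> -1 and 1 -> +1,
    sum, normalize by sqrt(n), and derive a p-value from the
    normal distribution. p < 0.01 rejects the sequence."""
    n = len(bits)
    s = sum(1 if b == "1" else -1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))

# Worked example from the NIST documentation: for 1011010101,
# S = 2 and the p-value is about 0.5271, so the (tiny) sequence
# is not rejected.
assert abs(monobit_test("1011010101") - 0.527089) < 1e-3
```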
Test for frequency within a block
 The focus of the test is the proportion of zeroes and ones within M-bit blocks. The purpose of this test is to determine whether the frequency of ones in an M-bit block is approximately M/2.
Runs test
 The focus of this test is the total number of zero and one runs in the entire sequence, where a run is an uninterrupted sequence of identical bits. A run of length k means that a run consists of exactly k identical bits and is bounded before and after with a bit of the opposite value. The purpose of the runs test is to determine whether the number of runs of ones and zeros of various lengths is as expected for a random sequence. In particular, this test determines whether the oscillation between such substrings is too fast or too slow.
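A sketch of the runs test, including the frequency prerequisite from the specification, checked against the worked example in the NIST documentation:

```python
import math

def runs_test(bits: str) -> float:
    """NIST runs test: counts V, the number of runs, and compares
    it with its expectation under randomness."""
    n = len(bits)
    pi = bits.count("1") / n
    if abs(pi - 0.5) >= 2 / math.sqrt(n):   # monobit prerequisite fails
        return 0.0
    v = 1 + sum(1 for i in range(n - 1) if bits[i] != bits[i + 1])
    num = abs(v - 2 * n * pi * (1 - pi))
    den = 2 * math.sqrt(2 * n) * pi * (1 - pi)
    return math.erfc(num / den)

# Worked example from the NIST documentation: 1001101011 has
# V = 7 runs and a p-value of about 0.1472.
assert abs(runs_test("1001101011") - 0.147232) < 1e-3
```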
Test for longest run of ones in a block
 The focus of the test is the longest run of ones within M-bit blocks. The purpose of this test is to determine whether the length of the longest run of ones within the tested sequence is consistent with the length of the longest run of ones that would be expected in a random sequence. Note that an irregularity in the expected length of the longest run of ones implies that there is also an irregularity in the expected length of the longest run of zeroes. Long runs of zeroes were not evaluated separately due to a concern about statistical independence among the tests.
Random binary matrix rank test
 The focus of the test is the rank of disjoint submatrices of the entire sequence. The purpose of this test is to check for linear dependence among fixed length substrings of the original sequence.
Discrete Fourier transform (spectral) test
 The focus of this test is the peak heights in the discrete Fast Fourier Transform. The purpose of this test is to detect periodic features (i.e., repetitive patterns that are near each other) in the tested sequence that would indicate a deviation from the assumption of randomness.
Non-overlapping (aperiodic) template matching test
 The focus of this test is the number of occurrences of predefined target substrings. The purpose of this test is to reject sequences that exhibit too many occurrences of a given non-periodic (aperiodic) pattern. For this test and for the Overlapping Template Matching test, an m-bit window is used to search for a specific m-bit pattern. If the pattern is not found, the window slides one bit position. For this test, when the pattern is found, the window is reset to the bit after the found pattern, and the search resumes.
Overlapping (periodic) template matching test
 The focus of this test is the number of predefined target substrings. The purpose of this test is to reject sequences that show deviations from the expected number of runs of ones of a given length. Note that when there is a deviation from the expected number of ones of a given length, there is also a deviation in the runs of zeroes. Runs of zeroes were not evaluated separately due to a concern about statistical independence among the tests. For this test and for the Non-overlapping Template Matching test, an m-bit window is used to search for a specific m-bit pattern. If the pattern is not found, the window slides one bit position. For this test, when the pattern is found, the window again slides one bit, and the search is resumed.
Maurer's universal statistical test
 The focus of this test is the number of bits between matching patterns. The purpose of the test is to detect whether or not the sequence can be significantly compressed without loss of information. An overly compressible sequence is considered to be nonrandom.
Lempel-Ziv complexity test
 The focus of this test is the number of cumulatively distinct patterns (words) in the sequence. The purpose of the test is to determine how far the tested sequence can be compressed. The sequence is considered to be nonrandom if it can be significantly compressed. A random sequence will have a characteristic number of distinct patterns.
Linear complexity test
 The focus of this test is the length of a generating feedback register. The purpose of this test is to determine whether or not the sequence is complex enough to be considered random. Random sequences are characterized by a longer feedback register. A short feedback register implies nonrandomness.
Serial test
 The focus of this test is the frequency of each and every overlapping m-bit pattern across the entire sequence. The purpose of this test is to determine whether the number of occurrences of the 2^m overlapping m-bit patterns is approximately the same as would be expected for a random sequence. The patterns may overlap.
Approximate entropy test
 The focus of this test is the frequency of each and every overlapping mbit pattern. The purpose of the test is to compare the frequency of overlapping blocks of two consecutive/adjacent lengths (m and m+1) against the expected result for a random sequence.
Cumulative sum (cusum) test
 The focus of this test is the maximal excursion (from zero) of the random walk defined by the cumulative sum of adjusted (-1, +1) digits in the sequence. The purpose of the test is to determine whether the cumulative sum of the partial sequences occurring in the tested sequence is too large or too small relative to the expected behavior of that cumulative sum for random sequences. This cumulative sum may be considered as a random walk. For a random sequence, the random walk should stay near zero. For non-random sequences, the excursions of this random walk away from zero will be too large.
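The random-walk statistic itself is easy to compute; this sketch omits the p-value formula from the NIST specification:

```python
def max_excursion(bits: str) -> int:
    """Maximal excursion from zero of the +/-1 random walk built
    from the bit sequence -- the cusum test statistic z."""
    pos = z = 0
    for b in bits:
        pos += 1 if b == "1" else -1
        z = max(z, abs(pos))
    return z

# For 1011010101 the walk visits 1,0,1,2,1,2,1,2,1,2, so z = 2;
# a constant sequence drifts linearly away from zero.
assert max_excursion("1011010101") == 2
assert max_excursion("1" * 100) == 100
```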
Random excursions test
 The focus of this test is the number of cycles having exactly K visits in a cumulative sum random walk. The cumulative sum random walk is derived from partial sums after the (0,1) sequence is adjusted to (-1, +1). A random excursion of a random walk consists of a sequence of n steps of unit length taken at random that begin at and return to the origin. The purpose of this test is to determine if the number of visits to a state within a random walk exceeds what one would expect for a random sequence.
Random excursions variant test
 The focus of this test is the number of times that a particular state occurs in a cumulative sum random walk. The purpose of this test is to detect deviations from the expected number of occurrences of various states in the random walk.
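Both random-excursions tests start from the same bookkeeping over the (-1, +1) random walk; a sketch of that bookkeeping (an illustrative helper, not NIST code):

```python
def cusum_walk_stats(bits):
    # Map 0 -> -1 and 1 -> +1, accumulate the walk, and record how often each
    # nonzero state is visited and how many times the walk returns to zero
    # (each return to the origin closes one excursion cycle).
    total, cycles, visits = 0, 0, {}
    for b in bits:
        total += 1 if b == "1" else -1
        if total == 0:
            cycles += 1
        else:
            visits[total] = visits.get(total, 0) + 1
    return cycles, visits

# Example: "1100" walks 1, 2, 1, 0 -> one complete cycle,
# state 1 visited twice, state 2 visited once.
print(cusum_walk_stats("1100"))
```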
Strange attractors
In his article Strange Attractors and TCP/IP Sequence Number Analysis [8], Michal Zalewski presented one way to examine the randomness of supposedly random numbers: a method of constructing spoofing sets based on phase space analysis and the presence of function attractors. The results of his examinations should not be considered a reliable metric for comparing the relative strength of the initial sequence number generators of different operating systems, but they do give a direction. The resulting 3D pictures help one judge whether the numbers can be predicted, and an attractor analysis tool is available for users to test their own random numbers.
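The core of the phase-space construction is straightforward: each sequence number is mapped to a 3D point built from three successive first-order differences. A sketch of that mapping (the actual tool additionally estimates spoofing-set sizes from the point cloud, which is omitted here):

```python
def delayed_coordinates(seq):
    # Map a series of sequence numbers to the 3D phase-space points
    # (x[t]-x[t-1], x[t-1]-x[t-2], x[t-2]-x[t-3]) used in attractor analysis.
    # Clustered points suggest an attractor, i.e. predictability; a strong
    # generator fills the cube without visible structure.
    return [
        (seq[t] - seq[t - 1], seq[t - 1] - seq[t - 2], seq[t - 2] - seq[t - 3])
        for t in range(3, len(seq))
    ]

print(delayed_coordinates([1, 2, 4, 7, 11]))  # -> [(3, 2, 1), (4, 3, 2)]
```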
An attack on a generator proceeds in stages: first identifying the PRNG scheme in use from the collected and analyzed data, or at least making a wild guess (e.g. is it a platform-provided PRNG?), and then predicting future outputs. Chosen samples are examined as a case study below.
Results
The sequence numbers that came with Strange Attractors and TCP/IP Sequence Number Analysis [8] were also tested using another tester, ENT [6], which gave somewhat different results: ENT's chi-square test disqualified most of the operating systems.
 Strange Attractors and TCP/IP Sequence Number Analysis showed that, at that moment, OpenBSD-current (2.9 release) and Linux 2.2.1x got relatively good results. When the same operating systems were tested using ENT, the results were slightly different. OpenBSD-current passed ENT's tests as well, but Linux did not, because it failed badly in the chi-square test. Solaris 7 (tcp_strong_iss=2) based on the same source IP also passed all of ENT's tests but failed in the attractor analysis, which reported a high attack feasibility of 66 %. For Solaris 7 (tcp_strong_iss=2) with changing source IP addresses the results were reversed: the attractor analysis gave 0 % attack feasibility and therefore it seemed to generate satisfactorily random numbers, but ENT's chi-square value of 97,5 % gave reason to believe that the sequence numbers are predictable. Only two of the tested operating systems clearly passed all of ENT's tests: OpenBSD-current and Solaris 7 (tcp_strong_iss=2) using the same source IP.
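For reference, the figure behind ENT's chi-square percentage is the ordinary chi-square statistic over the 256 byte-value frequencies; a sketch of that statistic (converting it to the percentage ENT reports requires the chi-square distribution and is omitted here):

```python
from collections import Counter

def chi_square_statistic(data):
    # Chi-square statistic over byte frequencies: each of the 256 byte values
    # is expected len(data) / 256 times in a uniform stream. ENT reports how
    # often a truly random stream would exceed this statistic; percentages
    # very close to 0 % or 100 % both indicate nonrandomness.
    expected = len(data) / 256
    counts = Counter(data)
    return sum((counts.get(v, 0) - expected) ** 2 / expected
               for v in range(256))

print(chi_square_statistic(bytes(range(256)) * 4))  # -> 0.0, perfectly uniform
print(chi_square_statistic(b"\x00" * 1024))         # huge, a constant stream
```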
 Unfortunately it was not possible to test the strange-attractor data with the other test suites: there was not enough material for DIEHARD, and the operation of the generators was unknown, so NIST's test suite was not suitable either, since it requires the sequence length in bits and a block length, which were unknown.
Case study
 A case study consists of sampling analysis, optionally followed by identification of the generator and prediction of its output.
 Attractor analysis data tested using ENT
 (Attack feas. % = attack feasibility by attractor analysis; Entropy in bits per byte; Chi-sq % = percentage of time a truly random sequence would exceed the chi-square value; Arith. mean of the data bytes; Monte Carlo pi = Monte Carlo estimate of pi; Serial corr. = serial correlation coefficient.)

Operating system  Attack feas. %  Entropy   Chi-sq %  Arith. mean  Monte Carlo pi  Serial corr.  ENT conclusion
---------------------------------------------------------------------------------------------------------------
AIX4.3            100,00          6,73      0,01      97,65        3,576           0,1595        predictable
BSDI3.0           18,00           7,99      0,01      112,54       3,23            0,003258      predictable
BSDI4.0           18,00           7,999     0,01      127,156      3,163           0,001726      predictable
FreeBSD4.0        1,0             7,99      0,01      126,19       3,1369          0,0048        predictable
HPUX100           100,00          6,119     23        92,34        3,57            0,0837        predictable
HPUX101           100,00          7,26      0,01      99,2011      3,53            0,144         predictable
HPUX102           100,00          7,2       0,01      98,48        3,587           0,1158        predictable
HPUX1020          100,00          6,72      0,01      96,07        3,5778          0,1354        predictable
HPUX110           100,00          7,999     0,01      128,89       3,11            0,001185      predictable
IOS12.0           2,06            7,9988    0,01      127,69       3,1319          0,001591      predictable
IOS12.02          20,00           7,9980    0,50      127,215      3,1267          0,000573      predictable
IOSshort                          7,97      0,01      128,0176     3,068           0,019268      predictable
IRIXshort         20,00           7,86      0,01      106,28       3,5265          0,081983      predictable
Linux2.2          below 0,05      7,8776    0,01      132,25       2,79            0,001192      predictable
Linux2.2.5                        7,8296    0,01      110,40       3,5             0,057136      predictable
Linux2.4                          7,987736  0,01      127,8807     3,0957          0,000402      predictable
MacOS             89,00           7,997     0,01      128,81       3,10029         0,003215      predictable
MacOSX1.0         17,00           7,99      0,01      127,93       3,1638          0,001190      predictable
OpenBSD2.8        3,05            7,99      0,01      127,4406     3,11474         0,000127      predictable
OpenBSDcurrent    0,00            7,995790  75,00     127,0587     3,144           0,001798      random
Solaris7          100,00          7,9971    0,01      125,74       3,1688          0,001726      predictable
Solaris7ISS0      100,00          6,711     0,01      154,45       2,384           0,122412      predictable
Solaris7ISS1      2,08            7,994     0,01      126,434      3,1111          0,000024      predictable
Solaris7ISS2      100,00          7,9993    99,00     127,37       3,1344          0,002421      suspect
Solaris7ISS2IP1   66,00           7,999006  25,00     127,5993     3,135656        0,002193      random
Solaris7ISS2IP2   0,00            7,998483  97,50     127,3678     3,13485         0,001699      suspect
Solaris8          100,00          7,999442  0,10      127,19       3,1397          0,00648       predictable
Solaris8ISS0      100,00          7,769873  0,01      127,054      3,1598          0,000752      predictable
Solaris8ISS1      100,00          7,999     0,01      126,43       3,164           0,000889      predictable
Solaris8ISS2                      7,999651  97,50     127,41       3,1404977       0,000511      suspect
Windows2000       12,08           7,995     0,01      131,10       3,0252          0,003199      predictable
Windows95         100,00          6,91377   0,01      92,23        3,622           0,1127        predictable
Windows98SE       100,00          6,9595    0,01      92,15        3,640           0,1169        predictable
WindowsNT4SP3     97,00           6,435     0,01      65,92        3,998           0,002068      predictable
WindowsNT4SP6                     6,75      0,01      92,09        3,75            0,1054        predictable
WindowsNT4SP6a    15              7,994     0,01      128,50       3,14915         0,001630      predictable
Analysis
Good(?) generators
 Yarrow [2]
 Tiny [1] pp. 237-238
 Blum-Blum-Shub (cryptographically secure, but the period is not maximal! Statistically good or bad?)
NIST approved random number generators: [7]
 o DSS, FIPS 186-2 appendix 3.1
 o DSS, FIPS 186-2 appendix 3.2
 o ANSI X9.31 appendix A
 o ANSI X9.62 annex A.4
Bad generators
 Linear congruential generators are susceptible to state attacks because they essentially reveal their entire internal state with every output. Likewise, if the coefficients of a linear feedback shift register are chosen carelessly, the period will not be long enough; in that case the PRNG's vulnerability does not depend on the seed. Conversely, if the seed is known, even PRNGs such as Tiny and Blum-Blum-Shub are completely predictable. Linear congruential PRNGs are the simplest and possibly the worst of the mathematical library PRNG algorithms.
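The state-exposure problem is easy to demonstrate: because a linear congruential generator's output is its entire internal state, one observed value suffices to clone it. A toy sketch using the classic Park-Miller "minimal standard" parameters (the attack scenario is illustrative):

```python
# Park-Miller "minimal standard" LCG parameters.
M, A = 2**31 - 1, 16807

def lcg(state):
    # Each yielded value is the complete internal state of the generator.
    while True:
        state = (A * state) % M
        yield state

victim = lcg(123456789)
leaked = next(victim)      # a single observed output IS the full state...
clone = lcg(leaked)        # ...so an attacker can clone the generator
assert [next(victim) for _ in range(5)] == [next(clone) for _ in range(5)]
```

Cryptographic PRNGs avoid this by keeping far more internal state than they ever reveal in one output.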
One analysis of PRNGs is included in Zalewski's Strange Attractors and TCP/IP sequence number analysis [8]. The analysis does not identify PRNGs but looks at the randomness of sequence numbers.
REFERENCES
[1]
John Viega, Gary McGraw. Building Secure Software: How to Avoid Security Problems the Right Way. (2002). Addison-Wesley.
[2]
John Kelsey, Bruce Schneier, Niels Ferguson. (August 1999). "Yarrow-160: Notes on the Design and Analysis of the Yarrow Cryptographic Pseudorandom Number Generator". Counterpane. http://www.counterpane.com/yarrow.html.
[3]
Paul Garrett. Making, Breaking Codes: An Introduction to Cryptology. (2001). Prentice Hall, Inc.
[4]
John Kelsey, Bruce Schneier, David Wagner, Chris Hall. Cryptanalytic Attacks on Pseudorandom Number Generators. (1998). http://www.counterpane.com/pseudorandom_number.html.
[5]
George Marsaglia. "Diehard, A battery of tests for RNGs". http://stat.fsu.edu/~geo/diehard.html.
[6]
John Walker. (October 20th, 1998). "ENT, A Pseudorandom Number Sequence Test Program". http://www.fourmilab.ch/random/.
[7]
An interdivisional project in the Information Technology Laboratory at NIST. (page last update: December 8, 2000). "Random Number Generation and Testing". http://csrc.nist.gov/rng/index.html.
[8]
Michal Zalewski. (19 March - 21 April 2001). "Strange Attractors and TCP/IP Sequence Number Analysis". http://razor.bindview.com/publish/papers/tcpseq.html.