Porcine Joy

Spring showers bring pig flowers

Well, maybe not, but I haven’t featured any pigs recently. That could be in violation of the administrivial oink’s website charter.

The Rite of Spring by Igor Stravinsky

Opening measures from the Rite of Spring

The Ides of March are past, and we head into spring. We also approach the one-year anniversary of this blog on March 21 or thereabouts, right in time for the vernal equinox. I would like to take this opportunity to thank my loyal subscribers, all three of you, and my other readers, whether frequent or occasional. Never hesitate to leave comments, especially if it isn’t spam!

Statistics, probability and applied quantitative methods reading recommendations

Finally, I wish acknowledge subject matter experts with nice blogs in my fields of professional interest, including applied probability, (mostly) frequentist methods, and due diligence for purposes of financial and security-focused anomaly detection.

Stats with Cats I don’t like cats, but you can just ignore the photos. This is an accessible, frequently updated blog covering descriptive and inferential statistical methods, mostly explained through charming examples

Data Genetics This blog has excellent graphics (without gratuitous interactive data visualization!) accompanying posts demonstrating statistical, probability and mathematical methods for engineering as applied to a wide range of real world concerns e.g. using Benford’s Law to detect accounting fraud, Hamming Codes for error correction and solving combinatorics problems to demonstrate the poor odds for winning dice and card games.

Error Statistics Philosophy Error statistics quantifies how frequently and reliably different statistical models can be used to detect errors.Error probability statistics uses frequentist error probabilities, not frequentist probability. Frequentist error probability is the relative frequency of errors within a statistical model. Frequentist probability is merely the use of relative frequency of occurrence to infer probability of events. The introductory post, Frequentists in exile acknowledges the long-held perception that only Bayesian methods have respectable statistical foundations. Error Statistics Philosophy focuses on the defensible use of frequentist methods for probability and statistical models, especially in circumstances of limited information and high error avoidance requirements.

Now for a bit of self-promotion…

My Google hobby blog,  In the GooglePlex, is ready to be included on the blogroll here. I was very careful about URL choice. SEO was NOT a consideration! Potential trademark infringement was my concern.

In the GooglePlex is a themed site, unlike this one! Topics are Google-related: product news (both good and bad), trivia, corporate history and humor, whenever there is any to be found. Please feel free to drop by to say hello or ask questions.

Oink of Joy

Any rite of Spring should include an oinker of joy

Published in: on March 20, 2011 at 5:45 am  Comments (1)  
Tags: , ,

Random Numbers in Java

This post is about alternatives to the java.util.Random class, the most commonly used method to generate random numbers in Java.

Dice players of antiquity

Random: Playing the odds in Ancient Rome Osteria della Via de Mercurio

 java.util.Random

This class generates “random enough” values, which are described as having an informal level of randomness. Such numbers have a superficial appearance of randomness to an observer, usually a human observer, not a machine. These are considered low-quality random numbers. They should never be used in any security application.

True randomly-generated numbers must have the following characteristics:

  • Any number is equally likely to be generated on every iteration. This is also known as a uniformly distributed series of numbers.
  • The second criterion follows directly: No dependency exists between successive numbers as they are generated.

Alternatives to better match your random needs

Neil Coffey of Javamex describes three alternative random number generation methods. Each approach continues to use class java.util.Random, while replacing the underlying algorithm. The first, called the XORShift generator, produces medium-quality random numbers very quickly, with only a single state variable and very simple code. This method is very well suited to J2ME games.

The next algorithm generates much higher-quality random numbers. It is a combined generator, using two XORShift generators of the sort described above. Mr. Coffey provides the code and explanation for the algorithm. This combined XORShift yields good-quality random numbers. It is suitable for non-gambling games and simulations, although it runs slightly slower than java.util.Random.

A cryptographic quality random number generator should have the following properties:

  1. It should be impossible to predict prior and future numbers from any number generated;
  2. The numbers should have no discernible biases;
  3. The generator has a large period;
  4. The generator can seed itself at any position within that period with equal probability.

A cryptographically secure random number generator is appropriate for security applications such as producing a web-server session id or picking encryption keys. Very high-quality random numbers are generated using java.security.SecureRandom as the replacement for java.util.Random. The trade-off in quality versus CPU cycle consumption is hardly surprising.  java.security.SecureRandom runs 20 to 30 times slower than any of the other algorithms.

Relative comparison between the methods

Here is a very simplified way of understanding the difference between each algorithm.

Let’s say that we need to generate a 128-bit encryption key. java.security.SecureRandom actually picks from a pool of 2 raised to the 127th power number of possible keys. Of course java.util.Random can also be used to generate a 128-bit key. However, the values will be selected from a smaller pool of numbers, on the order of 2 raised to the 47th power number of possible keys. This is because java.util.Random has a much shorter period, equal to 2 raised to the 48th power.

The single XORShift generator method falls between the two, as it has a slightly longer period, of 2 raised to the 64th power. The combined XORShift generator approach extends the period a bit further.

Note than neither java.util.Random nor either of the XORShift generators are seeded randomly. This is why java.security.SecureRandom, with a machine-generated and much more truly random random seed, is superior.

* The machine-generated random seed is what is called entropy in random number generators.