Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
515 views
in Technique[技术] by (71.8m points)

c - Why 1103515245 is used in rand?

I'm talking about this surprisingly simple implementation of rand() from the C standard:

static unsigned long int next = 1;

int rand(void)  /* RAND_MAX assumed to be 32767. */
{
    next = next * 1103515245 + 12345;
    return (unsigned)(next/65536) % 32768;
}

From this Wikipedia article we know that the multiplier a (in above code a = 1103515245) should fulfill only 2 conditions:

  1. a - 1 is divisible by all prime factors of m.
    (In our case m = 2^32, size of the int, so m has only one prime factor = 2)
  2. a - 1 is a multiple of 4 if m is a multiple of 4.
    (32768 is multiple of 4, and 1103515244 too)

Why they have chosen such a strange, hard-to-remember, "man, I'm fed up with these random numbers, write whatever" number, like 1103515245?

Maybe there are some wise reasons, that this number is somehow better than the other?

For example, why don't set a = 20000000001? It's bigger, cool-looking and easier to remember.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

If you use a LCG to draw points on the d dimensional space, they will lie on at most (d!m)1/d hyperplanes. This is a known defect of LCGs.

If you don't carefully choose a and m (beyond the condition for full periodicity), they may lie on much fewer planes than that. Those numbers have been selected by what is called the spectral test.

The "spectral test" (the name comes from number theory) is the maximum distance between consecutive hyperplanes on which d-dimensional joint distributions lie. You want it to be as small as possible for as many d as you can test.

See this paper for a historical review on the topic. Note that the generator you quote is mentioned in the paper (as ANSIC) and determined to not be very good. The high order 16 bits are acceptable, however, but many applications will need more than 32768 distinct values (as you point out in the comments, the period is indeed 2^31 -- the conditions for full periodicity in Wikipedia's link are probably only necessary).

The original source code in the ANSI document did not take the high order 16 bits, yielding a very poor generator which is easy to misuse (rand() % n is what people first think of to draw a number between 0 and n, and this yields something very non-random in this case).

See also the discussion on LCGs in Numerical Recipes. Quoting:

Even worse, many early generators happened to make particularly bad choices for m and a. One infamous such routine, RANDU, with a = 65539 and m = 231, was widespread on IBM mainframe computers for many years, and widely copied onto other systems. One of us recalls as a graduate student producing a “random” plot with only 11 planes and being told by his computer center’s programming consultant that he had misused the random number generator: “We guarantee that each number is random individually, but we don’t guarantee that more than one of them is random.” That set back our graduate education by at least a year!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...