Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
489 views
in Technique[技术] by (71.8m points)

algorithm - A Good and SIMPLE Measure of Randomness

What is the best algorithm to take a long sequence of integers (say 100,000 of them) and return a measurement of how random the sequence is?

The function should return a single result, say 0 if the sequence is not all all random, up to, say 1 if perfectly random. It can give something in-between if the sequence is somewhat random, e.g. 0.95 might be a reasonably random sequence, whereas 0.50 might have some non-random parts and some random parts.

If I were to pass the first 100,000 digits of Pi to the function, it should give a number very close to 1. If I passed the sequence 1, 2, ... 100,000 to it, it should return 0.

This way I can easily take 30 sequences of numbers, identify how random each one is, and return information about their relative randomness.

Is there such an animal?

…..

Update 24-Sep-2019: Google may have just ushered in an era of quantum supremacy says:

"Google’s quantum computer was reportedly able to solve a calculation — proving the randomness of numbers produced by a random number generator — in 3 minutes and 20 seconds that would take the world’s fastest traditional supercomputer, Summit, around 10,000 years. This effectively means that the calculation cannot be performed by a traditional computer, making Google the first to demonstrate quantum supremacy."

So obviously there is an algorithm to "prove" randomness. Does anyone know what it is? Could this algorithm also provide a measure of randomness?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Your question answers itself. "If I were to pass the first 100,000 digits of Pi to the function, it should give a number very close to 1", except the digits of Pi are not random numbers so if your algorithm does not recognise a very specific sequence as being non-random then it's not very good.

The problem here is there are many types of non random-ness:- eg. "121,351,991,7898651,12398469018461" or "33,27,99,3000,63,231" or even "14297141600464,14344872783104,819534228736,3490442496" are definitely not random.

I think what you need to do is identify the aspects of randomness that are important to you- distribution, distribution of digits, lack of common factors, the expected number of primes, Fibonacci and other "special" numbers etc. etc.

PS. The Quick and Dirty (and very effective) test of randomness does the file end up roughly the same size after you gzip it.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...