Pure mathematics:
Let's count the number of rand() function calls in both cases and compare the results:
Case 1:
Consider the expected number of calls at step i = k, when k numbers have already been chosen. The probability that a single rand() call returns a new number is p = (n-k)/n. We need the expected number of calls required to obtain a number we don't have yet.
The probability of getting it with 1 call is p. With exactly 2 calls it is q * p, where q = 1 - p. In general, the probability of getting it on exactly the j-th call is (q^(j-1))*p. Thus, the expected number of calls is Sum[ j * q^(j-1) * p ], j = 1 --> INF. This sum is equal to 1/p (it is the mean of a geometric distribution; also confirmed by Wolfram Alpha).
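For readers who prefer not to rely on a computer algebra system, the value of the sum can be checked by hand. The short derivation below writes j for the number of calls (a symbol chosen here to avoid clashing with n, the size of the range) and assumes 0 <= q < 1, which holds as long as at least one number is still unchosen:

```latex
\sum_{j=1}^{\infty} j\,q^{j-1}\,p
  = p \,\frac{d}{dq}\sum_{j=0}^{\infty} q^{j}
  = p \,\frac{d}{dq}\!\left(\frac{1}{1-q}\right)
  = \frac{p}{(1-q)^{2}}
  = \frac{p}{p^{2}}
  = \frac{1}{p}
```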
So, at step i = k you will perform 1/p = n/(n-k) calls of the rand() function on average.
Now let's sum over all steps:
Sum[ n/(n - k) ], k = 0 --> m - 1 = n * T, the expected number of rand() calls in method 1.
Here T = Sum[ 1/(n - k) ], k = 0 --> m - 1
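The sum above is easy to evaluate numerically. The following is a minimal sketch, not part of the original answer; the function name expected_calls_retry is hypothetical, and it assumes n > 0 and m <= n:

```cpp
#include <cassert>
#include <cstddef>

// Expected number of rand() calls for method 1 (retry until an unseen
// number appears) when choosing m distinct numbers out of n.
// Evaluates n * T, where T = Sum[ 1/(n - k) ], k = 0 --> m - 1.
// The name and signature are illustrative assumptions.
double expected_calls_retry(std::size_t n, std::size_t m) {
    assert(n > 0 && m <= n);
    double t = 0.0;
    for (std::size_t k = 0; k < m; ++k) {
        t += 1.0 / static_cast<double>(n - k);
    }
    return static_cast<double>(n) * t;
}
```

For example, with n = 4 and m = 2 this evaluates 4 * (1/4 + 1/3), roughly 2.33 calls on average.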
Case 2:
Here rand() is called inside random_shuffle n - 1 times (in most implementations).
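To see where the n - 1 comes from, here is a minimal sketch of a Fisher-Yates shuffle that counts its own rand() calls. It is an illustration only, not the code of any particular standard library; the helper name shuffle_counting_calls is hypothetical, and rand() % i is used for brevity even though it has a slight modulo bias:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

// Fisher-Yates shuffle that returns how many times it called rand().
// Hypothetical helper for illustration; real library implementations
// may differ in detail.
std::size_t shuffle_counting_calls(std::vector<int>& v) {
    std::size_t calls = 0;
    // Walk from the last position down to index 1; index 0 needs no draw.
    for (std::size_t i = v.size(); i > 1; --i) {
        // Pick j uniformly (up to modulo bias) from [0, i - 1].
        std::size_t j = static_cast<std::size_t>(std::rand()) % i;
        ++calls;
        std::swap(v[i - 1], v[j]);
    }
    return calls;
}
```

Each position except the first needs exactly one random draw, so a vector of n elements triggers n - 1 calls regardless of m.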
Now, to choose the method, we have to compare these two values: n * T versus n - 1.
So, to choose the appropriate method, calculate T as described above. If T < (n - 1)/n, the first method makes fewer rand() calls on average, so it's better to use it. Use the second method otherwise.
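The decision rule can be sketched as a small function. This is an illustrative sketch under the same assumptions as the analysis above (n > 0, m <= n); the names Method and choose_method are hypothetical, and it compares only the expected number of rand() calls, not memory use or the cost of checking for duplicates:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical labels for the two approaches compared above.
enum class Method { Retry, Shuffle };

// Returns Retry when T < (n - 1)/n, i.e. when method 1 is expected to
// make fewer rand() calls than the n - 1 calls of a shuffle.
Method choose_method(std::size_t n, std::size_t m) {
    assert(n > 0 && m <= n);
    double t = 0.0;  // T = Sum[ 1/(n - k) ], k = 0 --> m - 1
    for (std::size_t k = 0; k < m; ++k) {
        t += 1.0 / static_cast<double>(n - k);
    }
    const double threshold =
        static_cast<double>(n - 1) / static_cast<double>(n);
    return (t < threshold) ? Method::Retry : Method::Shuffle;
}
```

For n = 100, choosing m = 10 gives T of about 0.105, well under 0.99, so retrying wins; choosing m = 90 gives T of about 2.26, so the shuffle wins.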