Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
603 views
in Technique[技术] by (71.8m points)

parallel processing - R doesn't reset the seed when "L'Ecuyer-CMRG" RNG is used?

I was doing some parallel simulations in R and I notice that the seed is not changed when the "L'Ecuyer-CMRG" rng is used. I was reading the book "Parallel R", and the option mc.set.seed = TRUE should give each worker a new seed each time mclapply() is called.

Here is my code:

library(parallel)
RNGkind("L'Ecuyer-CMRG")

mclapply(1:2, function(n) rnorm(n), mc.set.seed = TRUE)
[[1]]
[1] -0.7125037

[[2]]
[1] -0.9013552  0.3445190

mclapply(1:2, function(n) rnorm(n), mc.set.seed = TRUE)
[[1]]
[1] -0.7125037

[[2]]
[1] -0.9013552  0.3445190

EDIT: same thing happens both on my desktop and on my laptop (both Ubuntu 12.04 LTS).

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

It appears to me that if you want to guarantee that subsequent calls to mclapply in an R session get different random numbers, you need to either call set.seed with a different value, remove the global variable ".Random.seed", or generate at least one random number in that R session before calling mclapply again.

The reason for this behavior is that mclapply (unlike mcparallel for example) calls mc.reset.stream internally. This resets the seed that is stashed in the "parallel" package to the value of ".Random.seed", so if ".Random.seed" hasn't changed when mclapply is called again, the workers started by mclapply will get the same random numbers as they did previously.

Note that this is not the case with functions such as clusterApply and parLapply, since they use persistent workers, and therefore continue to draw random numbers from their RNG stream. But new workers are forked every time mclapply is called, presumably making it much harder to have that type of behavior.

Here's an example of setting the seed to different values in order to get different random numbers using mclapply:

RNGkind("L'Ecuyer-CMRG")
set.seed(100)
mclapply(1:2, function(i) rnorm(2))
set.seed(101)
mclapply(1:2, function(i) rnorm(2))

Here's an example of removing ".Random.seed":

RNGkind("L'Ecuyer-CMRG")
mclapply(1:2, function(i) rnorm(2))
rm(.Random.seed)
mclapply(1:2, function(i) rnorm(2))

And here's an example of generating random numbers on the master:

RNGkind("L'Ecuyer-CMRG")
mclapply(1:2, function(i) rnorm(2))
rnorm(1)
mclapply(1:2, function(i) rnorm(2))

I'm not sure which is the best approach, but that may depend on what you're trying to do.

Although it appears that simply calling mclapply multiple times without changing ".Random.seed" results in reproducible results, I don't know if that is guaranteed. To guarantee reproducible results, I think you need to call set.seed:

RNGkind("L'Ecuyer-CMRG")
set.seed(1234)
mclapply(1:2, function(i) rnorm(2))
set.seed(1234)
mclapply(1:2, function(i) rnorm(2))

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...