Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

cluster analysis - R: Clustering results are different every time I run

library(amap)
set.seed(5)
Kmeans(mydata, 5, iter.max=500, nstart=1, method="euclidean")

I ran the code above, which uses Kmeans from the 'amap' package, several times. Even though the parameters and seed value are always the same, the clustering results are different every time I run Kmeans or other clustering methods.

I tried kmeans functions from other packages, but the results still differ between runs.

In fact, I want to use Weka and R together, so I also tried SimpleKMeans from the RWeka package, which always gives the same result. However, I do not know how to store the clustered data along with the cluster numbers from SimpleKMeans in RWeka, so I'm stuck.

How can I keep the clustering result the same on every run? Or, how can I store the clustering result from SimpleKMeans in R?


1 Reply


You must be doing something wrong. I get reproducible results each time I run the following code, as long as I set the seed before each call to Kmeans():

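library(amap)

# Run the same clustering 10 times, resetting the RNG seed before every call
out <- vector(mode = "list", length = 10)
for (i in seq_along(out)) {
    set.seed(1)
    out[[i]] <- Kmeans(iris[, -5], 3, iter.max = 500, nstart = 1, method = "euclidean")
}

# Compare each result with the next one (9 pairwise comparisons)
for (i in seq_along(out[-1])) {
    print(all.equal(out[[i]], out[[i + 1]]))
}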

The last for loop prints:

[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE

This indicates that the results are the same each time (all.equal() compares numeric values up to a small tolerance).
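A likely cause of the differences in the question is that the seed was set only once and the clustering call was then repeated: each call consumes random numbers to pick its starting centers, so a later call starts from a different RNG state. Here is a minimal sketch of that effect. It uses base R's kmeans() as a stand-in for amap's Kmeans() (my substitution, so that it runs without extra packages), and the variable names are only illustrative.

```r
x <- iris[, -5]

# Seed once and record the RNG state before and after a single fit
set.seed(1)
seed_before <- .Random.seed
fit_a <- kmeans(x, centers = 3, iter.max = 500, nstart = 1)
seed_after <- .Random.seed

# The fit drew random starting centers, so the RNG state has moved on;
# a second call made now would start from this new state
rng_advanced <- !identical(seed_before, seed_after)

# Re-seeding immediately before the call reproduces the first fit
set.seed(1)
fit_b <- kmeans(x, centers = 3, iter.max = 500, nstart = 1)
same_when_reseeded <- identical(fit_a$cluster, fit_b$cluster)

print(rng_advanced)
print(same_when_reseeded)
```

So if you re-run only the Kmeans() line in an interactive session, re-run the set.seed() line with it.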

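For the second part of the question (storing the SimpleKMeans result), a rough sketch follows. It assumes the RWeka package and a working Java installation are available, and that the fitted object carries the cluster assignments of the training rows in a class_ids component, as I recall from the RWeka documentation; please check ?SimpleKMeans before relying on it. iris stands in for mydata, and N = 3 is just an example number of clusters.

```r
# Sketch only: needs RWeka and Java; the block is skipped if RWeka is missing
if (requireNamespace("RWeka", quietly = TRUE)) {
    x <- iris[, -5]  # stand-in for mydata

    cl <- RWeka::SimpleKMeans(x, control = RWeka::Weka_control(N = 3))

    # Assumption: class_ids holds one cluster id per training row
    # (Weka numbers its clusters starting from 0)
    clustered <- cbind(x, cluster = as.integer(cl$class_ids))
    print(head(clustered))
}
```

The clustered data frame then holds the original columns plus a cluster column, and can be saved with write.csv() or used in further R code.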
