Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

r - Create unique identifier from the interchangeable combination of two variables

I need to create a unique identifier from the combination of two variables in a data frame. Consider the following data frame:

df <- data.frame(col1 = c("a", "a", "b", "c"), col2 = c("c", "b", "c", "a"), id = c(1, 2, 3, 1))

The variable "id" is shown only to illustrate the desired result; it is not in the actual data set and is the one I would like to create. Essentially, I want each combination of col1 and col2 to be treated as unordered, e.g. the combination c("a", "c") is the same as c("c", "a") and should get the same id.



1 Reply


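You could sort each pair so that the order of the two values no longer matters, then number the distinct pairs:

# Sort the two values within each row; the result has one column per row of df
labels <- apply(df[, c("col1", "col2")], 1, sort)
# Paste each sorted pair into a single key and convert the keys to numeric codes
df$id <- as.numeric(factor(apply(labels, 2, function(x) paste(x, collapse=""))))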
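Note that the factor-based version numbers the keys in alphabetical order of the pasted labels, so for this data it gives 2, 1, 3, 2 rather than the 1, 2, 3, 1 shown in the question. If the ids should follow the order of first appearance, something along these lines might work. It is only a sketch, not the original answer's method: the names lo, hi and key are made up for illustration, and the "|" separator is an assumption that this character never occurs in col1 or col2.

```r
# Example data from the question, without the target column
df <- data.frame(col1 = c("a", "a", "b", "c"),
                 col2 = c("c", "b", "c", "a"),
                 stringsAsFactors = FALSE)

# pmin()/pmax() pick the smaller/larger string row by row, which puts
# each pair into a canonical order without looping over the rows
lo <- pmin(df$col1, df$col2)
hi <- pmax(df$col1, df$col2)

# Join with a separator (assumed not to occur in the data) so that
# multi-character values such as ("ab", "c") and ("a", "bc") stay distinct
key <- paste(lo, hi, sep = "|")

# Number the keys in order of first appearance
df$id <- match(key, unique(key))
```

Because pmin() and pmax() are vectorised, this avoids the row-wise apply() and may be noticeably faster on large data frames. The columns need to be character vectors here, not factors.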


...