Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

reshape - Converting R Column names into id variables

I'm quite confused and haven't even been able to work out what to search for. I have a multi-year survey covering different countries, which currently looks like this:

        Question  Year  CountryA  CountryB  ...  CountryZ
        1         1999       Yes        No             No 
        2         1999       Yes        Yes            Yes

That is, it's currently organized by question. I want to have the data arranged by country, year and question number as such:

Country  Year  Question  Answer
      A  1999         1     Yes
      A  1999         2     Yes
      B  1999         1      No
      B  1999         2     Yes

And so on. Is this even possible? I can't seem to find anything to guide me to the right answer.
Thanks in advance!

1 Reply

The most direct approach is to use melt() from the "reshape2" package. Assuming your data.frame is called "mydf" (the sample data is defined at the end of this answer):

> library(reshape2)
> melt(mydf, id.vars=1:2)
  Question Year variable value
1        1 1999 CountryA   Yes
2        2 1999 CountryA   Yes
3        1 1999 CountryB    No
4        2 1999 CountryB   Yes
5        1 1999 CountryZ    No
6        2 1999 CountryZ   Yes
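To get closer to the layout asked for in the question, melt() can name the two new columns itself through its variable.name and value.name arguments. The following is only a sketch under assumptions: the object name "long", the sub() call that strips the "Country" prefix, and the final column order are illustrative choices, not part of the original answer.

```r
library(reshape2)

# Sample data, the same values as "mydf" in this answer
mydf <- data.frame(
  Question = 1:2,
  Year     = c(1999L, 1999L),
  CountryA = c("Yes", "Yes"),
  CountryB = c("No", "Yes"),
  CountryZ = c("No", "Yes"),
  stringsAsFactors = FALSE
)

# Melt, naming the new columns directly instead of "variable"/"value"
long <- melt(mydf,
             id.vars       = c("Question", "Year"),
             variable.name = "Country",
             value.name    = "Answer")

# "Country" comes back as a factor such as "CountryA";
# drop the prefix to keep only the country code (assumed naming pattern)
long$Country <- sub("^Country", "", as.character(long$Country))

# Reorder the columns to match the layout in the question
long <- long[, c("Country", "Year", "Question", "Answer")]
long
```

Referring to the id columns by name rather than by position (1:2) keeps the call working if the column order of the input ever changes.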

Update

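I haven't worked out how to make base R's reshape() produce the final column names directly, but you can rename the columns afterwards with setNames(). The sub() call first inserts a "." into each country column name so that reshape() can split it into a variable name and a time value: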

names(mydf) <- sub("Country", "Country.", names(mydf))
setNames(
  reshape(mydf, direction="long", idvar=1:2, varying=3:ncol(mydf)),
  c("Question", "Year", "Country", "Answer"))
#          Question Year Country Answer
# 1.1999.A        1 1999       A    Yes
# 2.1999.A        2 1999       A    Yes
# 1.1999.B        1 1999       B     No
# 2.1999.B        2 1999       B    Yes
# 1.1999.Z        1 1999       Z     No
# 2.1999.Z        2 1999       Z    Yes
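As a possible way around the naming issue, reshape() also accepts the target names explicitly through v.names, timevar and times, and new.row.names replaces the "1.1999.A" style row names. This is a minimal sketch, not the original author's method; the names "country_cols" and "long" are made up for illustration, and it assumes every column to be stacked starts with "Country".

```r
# Sample data, the same values as "mydf" in this answer
mydf <- data.frame(
  Question = 1:2,
  Year     = c(1999L, 1999L),
  CountryA = c("Yes", "Yes"),
  CountryB = c("No", "Yes"),
  CountryZ = c("No", "Yes"),
  stringsAsFactors = FALSE
)

# Columns to stack (assumption: they all start with "Country")
country_cols <- grep("^Country", names(mydf), value = TRUE)

long <- reshape(
  mydf,
  direction     = "long",
  varying       = list(country_cols),              # one set of columns to stack
  v.names       = "Answer",                        # name of the stacked value column
  timevar       = "Country",                       # name of the new key column
  times         = sub("^Country", "", country_cols),  # key values: "A", "B", "Z"
  idvar         = c("Question", "Year"),
  new.row.names = seq_len(nrow(mydf) * length(country_cols))
)

# Reorder the columns to match the layout in the question
long <- long[, c("Country", "Year", "Question", "Answer")]
long
```

Because the names are supplied up front, this variant needs neither the sub() renaming of the input columns nor the setNames() call afterwards.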

Here, "mydf" is the sample data used above:

mydf <- structure(list(Question = 1:2, Year = c(1999L, 1999L), CountryA = c("Yes", 
  "Yes"), CountryB = c("No", "Yes"), CountryZ = c("No", "Yes")), .Names = c("Question", 
  "Year", "CountryA", "CountryB", "CountryZ"), class = "data.frame", row.names = c(NA, -2L))
