Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


r - Concatenate several columns to comma separated strings by group

Background: I am in the process of annotating SNPs from a GWAS in an organism without much annotation. I am using the chained tBLASTn table from UCSC along with biomaRt to map each SNP to a probable gene(s).

I have a dataframe that looks like this:

            SNP   hu_mRNA     gene
 chr1.111642529 NM_002107    H3F3A
 chr1.111642529 NM_005324    H3F3B
 chr1.111801684 BC098118     <NA>
 chr1.111925084 NM_020435    GJC2
  chr1.11801605 AK027740     <NA>
  chr1.11801605 NM_032849    C13orf33
 chr1.151220354 NM_018913    PCDHGA10
 chr1.151220354 NM_018918    PCDHGA5

What I would like to end up with is a single row for each SNP, with the genes and hu_mRNAs collapsed into comma-delimited strings. Here is what I am after:

            SNP            hu_mRNA    gene
 chr1.111642529 NM_002107,NM_005324   H3F3A
 chr1.111801684  BC098118,NM_020435   GJC2
  chr1.11801605  AK027740,NM_032849   C13orf33
 chr1.151220354 NM_018913,NM_018918   PCDHGA10,PCDHGA5

Now I know I can do this with a flick of the wrist in Perl, but I really want to do this all in R. Any suggestions?



1 Reply


You could do this in one line using plyr, as it is a classic split-apply-combine problem: you split the data frame by SNP, apply paste with collapse to each column of every piece, and assemble the pieces back into a data frame.

plyr::ddply(x, "SNP", plyr::colwise(paste), collapse = ",")
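
To make the split-apply-combine idea concrete, here is a rough base R sketch of the three steps spelled out by hand. It is only an illustration, not what plyr does internally; the data frame holds three rows copied from the question, and the names pieces, collapsed and result are made up for this example.

```r
# Three rows taken from the question's data frame.
x <- data.frame(
  SNP = c("chr1.111642529", "chr1.111642529", "chr1.111925084"),
  hu_mRNA = c("NM_002107", "NM_005324", "NM_020435"),
  gene = c("H3F3A", "H3F3B", "GJC2"),
  stringsAsFactors = FALSE
)

# Split: one data frame per SNP.
pieces <- split(x, x$SNP)

# Apply: collapse each non-key column of a piece into a single string.
collapsed <- lapply(pieces, function(piece) {
  data.frame(
    SNP = piece$SNP[1],
    hu_mRNA = paste(piece$hu_mRNA, collapse = ","),
    gene = paste(piece$gene, collapse = ","),
    stringsAsFactors = FALSE
  )
})

# Combine: stack the one-row pieces back into a single data frame.
result <- do.call(rbind, collapsed)
rownames(result) <- NULL
print(result)
```

The plyr and data.table one-liners do the same job without writing out each column by name.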

If you want to reshape data in R at the flick of a wrist, learn plyr and reshape2 :). Another flick-of-the-wrist solution uses data.table, which is really useful if you are dealing with massive amounts of data.

data.table::data.table(x)[, lapply(.SD, paste, collapse = ","), by = "SNP"]
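
One thing to watch for: paste turns a missing value into the literal string "NA", so a group with one missing gene would come out as something like "NA,C13orf33". If the <NA> entries in the question are real missing values and you want them dropped, a sketch along these lines in base R may help. The helper collapse_values is hypothetical (not part of any package), and it also removes duplicate values, which is an assumption about what is wanted.

```r
# Data copied from the question; <NA> is assumed to be a true missing value.
x <- data.frame(
  SNP = c("chr1.111642529", "chr1.111642529", "chr1.111801684",
          "chr1.111925084", "chr1.11801605", "chr1.11801605",
          "chr1.151220354", "chr1.151220354"),
  hu_mRNA = c("NM_002107", "NM_005324", "BC098118", "NM_020435",
              "AK027740", "NM_032849", "NM_018913", "NM_018918"),
  gene = c("H3F3A", "H3F3B", NA, "GJC2", NA, "C13orf33",
           "PCDHGA10", "PCDHGA5"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: join the unique non-missing values with commas,
# returning NA when a group has nothing but missing values.
collapse_values <- function(v) {
  v <- unique(v[!is.na(v)])
  if (length(v) == 0) NA_character_ else paste(v, collapse = ",")
}

# Apply the helper to each non-key column within each SNP group.
res <- aggregate(x[c("hu_mRNA", "gene")], by = x["SNP"], FUN = collapse_values)
print(res)
```

With this helper, chr1.11801605 gets the gene C13orf33 on its own, and chr1.111801684, which has no annotated gene, keeps a real NA. The same function could be passed in place of paste in the data.table call, e.g. lapply(.SD, collapse_values).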
