Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


text - How to use R to create a word co-occurrence matrix

I am new to R. I have a data set of online videos and their tags. The data looks like this:

film  tag1 tag2 tag3 tag4....
1      A    B    C    D
2      A    C    F    G 
3      B    D    C    X 

I want to create a matrix which tells me the co-occurrence of the tags, such as:

       A    B   C    D .....
A     10    13
B     15    2
C      3    16
D     9     20

How should I do it?


1 Reply


If I understand what you want, here is one way:

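# Read the sample data: one row per film, one column per tag slot
dat <- read.table(text='film  tag1 tag2 tag3 tag4
1      A    B    C    D
2      A    C    F    G
3      B    D    C    X', header=TRUE)

library(qdapTools)
# mtabulate() builds a film-by-tag count table; crossprod() turns it into tag-by-tag counts
crossprod(as.matrix(mtabulate(as.data.frame(t(dat[, -1])))))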

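This gives the following, where each diagonal entry is the number of films carrying that tag and each off-diagonal entry is the number of films in which the two tags appear together: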

  A B C D F G X
A 2 1 2 1 1 1 0
B 1 2 2 2 0 0 1
C 2 2 3 2 1 1 1
D 1 2 2 2 0 0 1
F 1 0 1 0 1 1 0
G 1 0 1 0 1 1 0
X 0 1 1 1 0 0 1
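
If you would rather not depend on qdapTools, the same idea can be sketched in base R. This is only an illustration under some assumptions: the names long, incidence, cooc and cooc_offdiag are made up for this example, and it assumes each tag appears at most once per film. It builds a film-by-tag 0/1 matrix M with table() and then computes t(M) %*% M via crossprod(), which for this sample data should reproduce the matrix above.

```r
# Same sample data as in the answer above
dat <- read.table(text = 'film  tag1 tag2 tag3 tag4
1      A    B    C    D
2      A    C    F    G
3      B    D    C    X', header = TRUE, stringsAsFactors = FALSE)

# Long form: one row per (film, tag) pair.
# unlist() walks the tag columns one after another, so the film ids
# are repeated once per tag column to line up with it.
long <- data.frame(
  film = rep(dat$film, times = ncol(dat) - 1),
  tag  = unlist(dat[, -1], use.names = FALSE)
)

# Film-by-tag incidence matrix: 1 if the film carries the tag, else 0
incidence <- (unclass(table(long$film, long$tag)) > 0) * 1

# Tag-by-tag counts: entry [i, j] is the number of films with both tags
cooc <- crossprod(incidence)

# Optional: blank out the diagonal if only pairs of different tags matter
cooc_offdiag <- cooc
diag(cooc_offdiag) <- 0

print(cooc)
```

The diagonal of cooc is just each tag's frequency, so whether to keep it (cooc) or zero it (cooc_offdiag) depends on what you plan to do with the matrix afterwards.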
