Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Ideas for plotting a large number of variables against their R-squared values (from linear regression models) in R, preferably with ggplot2

I had to build 231 linear regression models for my project, which leaves me with 231 R-squared values to present against the variable names. That is too many for a table, so I am looking for plotting ideas that show the R-squared values on the y-axis and the variable names on the x-axis. Running dput(head(df, 5)) gives the following, which should give you an idea of my data:

structure(list(Band = c(402, 411, 419, 427, 434), R.squared = c(0.044655015122032, 
0.852028718800355, 0.818617476505653, 0.825782272278991, 0.860844967662728
), Adj.Rsquared = c(-0.0614944276421867, 0.835587465333728, 0.798463862784058, 
0.806424746976656, 0.845383297403031), Intercept = c(0.000142126282140086, 
-0.00373545760470339, -0.00258909036368109, 0.000626075834918527, 
-3.3448513588372e-05), Slope = c(-0.00108714482110104, 0.393380133190131, 
0.443463459485279, 0.503881831479685, 0.480162723468755)), row.names = c(NA, 
5L), class = "data.frame")

Please note that my full data have 231 observations. I want to plot the variable Band (as a factor) on the x-axis and R.squared on the y-axis. I already tried geom_point() in ggplot2, but the result is cluttered and hard to read. Any ideas?

Update: when I use the code suggested by @Duck, I get a plot that is still too cluttered for a scientific presentation. [plot image not included]


1 Reply

If you have a large number of values, you can dodge the axis labels so that they do not overlap. Here is an example:

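library(ggplot2)
#Code
# mdf is defined under "Some data used" below.
# guide_axis(n.dodge = 2) staggers the Band labels over two columns (needs ggplot2 >= 3.3.0).
ggplot(mdf, aes(x = factor(Band), y = R.squared)) +
  geom_point() +
  scale_x_discrete(guide = guide_axis(n.dodge = 2)) +
  coord_flip()  # flip so the Band labels run down the vertical axis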

Output:

[image not included: point plot of R.squared by Band with dodged axis labels]

Some data used:

#Data
mdf <- structure(list(Band = c(402, 411, 419, 427, 434, 412, 421, 429, 
437, 444, 422, 431, 439, 447, 454, 432, 441, 449, 457, 464), 
    R.squared = c(0.044655015122032, 0.852028718800355, 0.818617476505653, 
    0.825782272278991, 0.860844967662728, 0.044655015122032, 
    0.852028718800355, 0.818617476505653, 0.825782272278991, 
    0.860844967662728, 0.044655015122032, 0.852028718800355, 
    0.818617476505653, 0.825782272278991, 0.860844967662728, 
    0.044655015122032, 0.852028718800355, 0.818617476505653, 
    0.825782272278991, 0.860844967662728), Adj.Rsquared = c(-0.0614944276421867, 
    0.835587465333728, 0.798463862784058, 0.806424746976656, 
    0.845383297403031, -0.0614944276421867, 0.835587465333728, 
    0.798463862784058, 0.806424746976656, 0.845383297403031, 
    -0.0614944276421867, 0.835587465333728, 0.798463862784058, 
    0.806424746976656, 0.845383297403031, -0.0614944276421867, 
    0.835587465333728, 0.798463862784058, 0.806424746976656, 
    0.845383297403031), Intercept = c(0.000142126282140086, -0.00373545760470339, 
    -0.00258909036368109, 0.000626075834918527, -3.3448513588372e-05, 
    0.000142126282140086, -0.00373545760470339, -0.00258909036368109, 
    0.000626075834918527, -3.3448513588372e-05, 0.000142126282140086, 
    -0.00373545760470339, -0.00258909036368109, 0.000626075834918527, 
    -3.3448513588372e-05, 0.000142126282140086, -0.00373545760470339, 
    -0.00258909036368109, 0.000626075834918527, -3.3448513588372e-05
    ), Slope = c(-0.00108714482110104, 0.393380133190131, 0.443463459485279, 
    0.503881831479685, 0.480162723468755, -0.00108714482110104, 
    0.393380133190131, 0.443463459485279, 0.503881831479685, 
    0.480162723468755, -0.00108714482110104, 0.393380133190131, 
    0.443463459485279, 0.503881831479685, 0.480162723468755, 
    -0.00108714482110104, 0.393380133190131, 0.443463459485279, 
    0.503881831479685, 0.480162723468755)), row.names = c(NA, 
-20L), class = "data.frame")

The suggestion from @DaveArmstrong, ordering the bands by their R-squared values, is very helpful too (many thanks and credit to him):

#Code 2
# reorder() sorts the Band levels by R.squared, so the points rise monotonically
ggplot(mdf, aes(x = reorder(factor(Band), R.squared, mean), y = R.squared)) +
  geom_point() +
  scale_x_discrete(guide = guide_axis(n.dodge = 2)) +
  coord_flip()

Output:

[image not included: the same point plot with bands ordered by R.squared]
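
To see what reorder() is doing in Code 2, it can help to look at the factor levels before and after. This is only a small sketch: the data frame `demo` is a hypothetical stand-in built from the first five rows of the question's data, with R.squared rounded to four decimals.

```r
# Hypothetical demo data: first five bands from the question, R.squared rounded
demo <- data.frame(
  Band      = c(402, 411, 419, 427, 434),
  R.squared = c(0.0447, 0.8520, 0.8186, 0.8258, 0.8608)
)

# Default factor levels follow the sorted Band values
band_default <- factor(demo$Band)

# reorder() (from stats) sorts the levels by the mean R.squared of each level;
# every band occurs once here, so that is simply ascending R.squared
band_sorted <- reorder(factor(demo$Band), demo$R.squared, FUN = mean)

print(levels(band_default))
print(levels(band_sorted))
```

ggplot2 places discrete values along the axis in level order, which is why the points in Code 2 appear sorted by R.squared rather than by band number.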

Another option draws a segment from 0 up to each point (a lollipop-style chart):

#Code 3
ggplot(mdf, aes(x = reorder(factor(Band), R.squared, mean), y = R.squared)) +
  geom_point() +
  # segment from 0 to each R.squared value, at the same (reordered) Band position
  geom_segment(aes(x    = reorder(factor(Band), R.squared, mean),
                   xend = reorder(factor(Band), R.squared, mean),
                   y    = 0,
                   yend = R.squared)) +
  scale_x_discrete(guide = guide_axis(n.dodge = 2)) +
  coord_flip()

Output:

[image not included: lollipop-style plot of R.squared by Band, ordered by R.squared]
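
With all 231 bands, dodging alone may still leave the axis crowded. One thing that might help, offered as a rough sketch rather than a tested recipe for the real data, is to label only every 10th band through the breaks argument of scale_x_discrete(). The data frame `full_df`, its Band spacing, and its random R.squared values are made up purely for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for the full 231-band data (values are invented)
set.seed(1)
full_df <- data.frame(
  Band      = seq(402, by = 3, length.out = 231),
  R.squared = runif(231)
)

# All factor levels, then every 10th one to keep as a labelled axis break
band_levels <- levels(factor(full_df$Band))
shown <- band_levels[seq(1, length(band_levels), by = 10)]

# Every point is still drawn; only the selected bands get an axis label
p <- ggplot(full_df, aes(x = factor(Band), y = R.squared)) +
  geom_point() +
  scale_x_discrete(breaks = shown) +
  coord_flip()

print(p)
```

The step of 10 is arbitrary; a different step, or a hand-picked vector of bands of interest, could be passed to breaks instead.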
