Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

dataframe - How to run an lm regression for every column in R

I have a data frame like this:

df=data.frame(x=rnorm(100),y1=rnorm(100),y2=rnorm(100),y3=...)

I want to run a loop which regresses each column starting from the second column on the first column:

for(i in names(df[,-1])){
    model = lm(i~x, data=df)
}

But it fails. I want to run a regression for each column in a loop, and some column names are just numbers (e.g. 404.1). I cannot find a way to loop over every column using the code above.


1 Reply


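Your code looks fine, except that inside lm the formula i ~ x refers to the loop variable i itself. That variable holds a column name as a character string, not the column's values, so there is nothing numeric to regress on x. Wrapping it in get(i) looks up the column whose name is stored in i.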

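# Example data: one predictor x and three response columns
df <- data.frame(x=rnorm(100),y1=rnorm(100),y2=rnorm(100),y3=rnorm(100))

# One fitted model per response column, keyed by column name
storage <- list()
for(i in names(df)[-1]){
  # get(i) fetches the column named by i
  storage[[i]] <- lm(get(i) ~ x, df)
}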
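For column names that are just numbers, such as the 404.1 mentioned in the question, another option is to build the formula from the name and wrap it in backticks. This is only a sketch under assumptions: the data frame below is made up for illustration, it is created with check.names = FALSE so the name stays "404.1", and the list name models is hypothetical. A side effect is that each fitted model records the real column name in its formula rather than get(i).

```r
# Hypothetical example data; check.names = FALSE keeps the name "404.1" as-is
set.seed(1)
df <- data.frame(x = rnorm(100), y1 = rnorm(100), `404.1` = rnorm(100),
                 check.names = FALSE)

models <- list()
for (i in names(df)[-1]) {
  # Backticks make a non-syntactic name such as 404.1 parse as one variable
  f <- as.formula(paste0("`", i, "` ~ x"))
  models[[i]] <- lm(f, data = df)
}

# Slope of the regression of column 404.1 on x
coef(models[["404.1"]])[["x"]]
```

Individual fits can then be inspected by name, for example summary(models[["y1"]]).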

I create an empty list, storage, and fill it with one fitted model per iteration of the loop. It's just a personal preference, but I'd also advise against the way your current loop is written:

for(i in names(df[,-1])){
    model = lm(i~x, data=df)
}

Each iteration overwrites model, so after the loop only the last fit is left. I suggest storing the results in a list, or in a matrix that you fill in iteratively.
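As a rough illustration of the matrix option, the coefficients of the stored models can be collected with sapply. This sketch assumes the same shape of example data as above; the names coefs and fit_all are made up for illustration. It also shows, as an alternative to the loop rather than the method described above, that lm accepts a matrix of responses and fits every column in a single call.

```r
set.seed(1)
df <- data.frame(x = rnorm(100), y1 = rnorm(100), y2 = rnorm(100), y3 = rnorm(100))

# Fit one model per response column and keep all of them
storage <- list()
for (i in names(df)[-1]) {
  storage[[i]] <- lm(get(i) ~ x, df)
}

# Matrix with one column per response and rows "(Intercept)" and "x"
coefs <- sapply(storage, coef)

# Alternative: a matrix response fits all columns at once (an "mlm" object)
fit_all <- lm(as.matrix(df[-1]) ~ x, data = df)
```

Here coef(fit_all) holds the same estimates as coefs, so the loop is mainly useful when each model needs to be handled separately.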

Hope that helps

