Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

windows - Reading the last n lines from a huge text file

I've tried something like this

file_in <- file("myfile.log","r")
x <- readLines(file_in, n=-100)

but I'm still waiting...

Any help would be greatly appreciated


1 Reply

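If you know how many lines the log has, I'd use scan for this and skip everything except the last n lines. For example, to skip the first 100 lines and read the rest:

scan("foo.txt", sep="\n", what=character(), skip=100)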

If you have no clue how many lines you need to skip, you have no choice but to either

  • read in everything and take the last n lines (if that's feasible),
  • use scan("foo.txt", sep="\n", what=list(NULL)) to find out how many records there are, or
  • use some algorithm that goes through the file and keeps only the last n lines at every step.
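The first two options can be sketched roughly as follows. This is only an illustration: count_lines and last_lines_by_count are hypothetical helper names, the chunk size is an arbitrary assumption, and the line count is done with readLines in chunks rather than with the scan call mentioned above. Note that scan drops blank lines by default, so a log with empty lines may return fewer than n entries.

```r
# Option 1: read everything, keep the last n lines (only for files that fit in memory):
#   tail(readLines("foo.txt"), 100)

# Option 2, step 1 (hypothetical helper): count lines chunk by chunk,
# so the whole file never has to sit in memory at once.
count_lines <- function(path, chunk = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0  # numeric rather than integer, to avoid overflow on huge files
  repeat {
    got <- length(readLines(con, n = chunk, warn = FALSE))
    if (got == 0L) break
    total <- total + got
  }
  total
}

# Option 2, step 2 (hypothetical helper): skip all but the last n lines.
last_lines_by_count <- function(path, n) {
  total <- count_lines(path)
  scan(path, what = character(), sep = "\n",
       skip = max(total - n, 0), quiet = TRUE)
}
```

This reads the file twice, once to count and once to fetch the tail, which is the price of not knowing the line count up front.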

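The last option could look like:

ReadLastLines <- function(x, n, ...) {
  con <- file(x, open = "r")
  on.exit(close(con))  # close the connection even if scan() fails

  # Read the first n lines; arguments in ... (e.g. skip) apply to this call only
  out <- scan(con, what = character(), nmax = n, sep = "\n", quiet = TRUE, ...)

  # Slide a window of n lines down the file, one line at a time
  while (TRUE) {
    tmp <- scan(con, what = character(), nmax = 1, sep = "\n", quiet = TRUE)
    if (length(tmp) == 0) break  # end of file reached
    out <- c(out[-1], tmp)       # drop the oldest line, append the newest
  }
  out
}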

This allows:

ReadLastLines("foo.txt",100)

or

ReadLastLines("foo.txt",100,skip=1e+7)

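if you know the file has more than 10 million lines. The skip argument is passed on to the first scan call, so those lines are skipped up front instead of being run through the loop. This can save reading time once the logs get extremely big.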
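Reading one line per scan call can be slow on very large files. As a variation on the same idea (not the function above, just a sketch), one could read in chunks with readLines and keep only the last n lines after each chunk. The name read_last_lines_chunked and the default chunk size are assumptions made for illustration.

```r
# Hypothetical variant: same "keep only the last n lines" idea,
# but reading chunk lines per call instead of one.
read_last_lines_chunked <- function(path, n, chunk = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  out <- character(0)
  repeat {
    block <- readLines(con, n = chunk, warn = FALSE)
    if (length(block) == 0L) break   # end of file reached
    out <- tail(c(out, block), n)    # never holds more than n + chunk lines
  }
  out
}
```

Unlike scan, readLines keeps blank lines, so this version returns exactly the last n lines of the file as they appear.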


EDIT: In fact, given the size of your file, I wouldn't even use R for this. On Unix, you can use the tail command. There is a Windows version of it as well, somewhere in a toolkit, though I haven't tried it yet.
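If the result is still needed inside R, the external command can be called from there. A minimal sketch, assuming a tail executable is on the PATH (true on most Unix systems; on Windows it depends on what you have installed). The wrapper name tail_lines is hypothetical.

```r
# Hypothetical wrapper: let the external tail command do the work
# and capture its output as a character vector.
tail_lines <- function(path, n) {
  if (!nzchar(Sys.which("tail"))) {
    stop("no 'tail' executable found on the PATH")
  }
  # as.integer() keeps large n from being formatted as e.g. "1e+05"
  system2("tail", c("-n", as.character(as.integer(n)), shQuote(path)),
          stdout = TRUE)
}
```

For example, tail_lines("myfile.log", 100) would return the last 100 lines without R ever reading the rest of the file.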

