I am learning how to read very large files efficiently in Go. I have tried bufio.NewScanner and bufio.NewReader with ReadString('\n'). Of the two, NewScanner seems to be consistently faster (about 2:1).
With NewScanner, I found that reading a file line by line takes much longer than running the Unix cat command on the same file.
I measured how long it takes to run this code:
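package main

import (
	"bufio"
	"fmt"
	"os"
)

func main() {
	file, err := os.Open("test")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer file.Close()

	// Print every line of the file to stdout.
	scanner := bufio.NewScanner(file)
	for scanner.Scan() {
		fmt.Println(scanner.Text())
	}
}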
Comparing this against a regular Unix cat, I get the following results:
$ time ./parser3 > /dev/null
19.13 real 13.81 user 5.94 sys
$ time cat test > /dev/null
0.83 real 0.08 user 0.74 sys
The time difference is consistent across several executions.
I understand that scanning for '\n' adds overhead compared with just copying data from input to output, as cat does.
But given the size of the difference between cat and this code snippet, I wonder whether this is the most efficient way to read a file line by line in Go.