Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
524 views
in Technique[技术] by (71.8m points)

performance - Read large amount of data from file in Java

I've got text file that contains 1 000 002 numbers in following formation:

123 456
1 2 3 4 5 6 .... 999999 100000

Now I need to read that data and allocate it to int variables (the very first two numbers) and all the rest (1 000 000 numbers) to an array int[].

It's not a hard task, but - it's horrible slow.

My first attempt was java.util.Scanner:

 Scanner stdin = new Scanner(new File("./path"));
 int n = stdin.nextInt();
 int t = stdin.nextInt();
 int array[] = new array[n];

 for (int i = 0; i < n; i++) {
     array[i] = stdin.nextInt();
 }

It works as excepted but it takes about 7500 ms to execute. I need to fetch that data in up to several hundred of milliseconds.

Then I tried java.io.BufferedReader:

Using BufferedReader.readLine() and String.split() I got the same results in about 1700 ms, but it's still too many.

How can I read that amount of data in less that 1 second? The final result should be equal to:

int n = 123;
int t = 456;
int array[] = { 1, 2, 3, 4, ..., 999999, 100000 };

According to trashgod answer:

StreamTokenizer solution is fast (takes about 1400 ms) but it's still too slow:

StreamTokenizer st = new StreamTokenizer(new FileReader("./test_grz"));
st.nextToken();
int n = (int) st.nval;

st.nextToken();
int t = (int) st.nval;

int array[] = new int[n];

for (int i = 0; st.nextToken() != StreamTokenizer.TT_EOF; i++) {
    array[i] = (int) st.nval;
}

PS. There is no need for validation. I'm 100% sure that data in ./test_grz file is correct.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Thanks for every answer but I've already found a method that meets my criteria:

BufferedInputStream bis = new BufferedInputStream(new FileInputStream("./path"));
int n = readInt(bis);
int t = readInt(bis);
int array[] = new int[n];
for (int i = 0; i < n; i++) {
    array[i] = readInt(bis);
}

private static int readInt(InputStream in) throws IOException {
    int ret = 0;
    boolean dig = false;

    for (int c = 0; (c = in.read()) != -1; ) {
        if (c >= '0' && c <= '9') {
            dig = true;
            ret = ret * 10 + c - '0';
        } else if (dig) break;
    }

    return ret;
}

It requires only about 300 ms to read 1 mln of integers!


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...