Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - Find the size of the file inside a GZIP file

Is there a way to find the size of the original file inside a GZIP file in Java?

For example, I have a 15 MB file a.txt that has been gzipped to a 3 MB file a.gz. I want to know the size of the a.txt inside a.gz without unzipping a.gz.


1 Reply


There is no truly reliable way other than gunzipping the stream. You do not need to save the decompressed output, so you can determine the size by reading and decoding the entire file and counting the bytes, without taking up any space with the decompressed result.
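A minimal Java sketch of this reliable approach, assuming the standard java.util.zip.GZIPInputStream is acceptable. The class name GzipExactSize and the method name uncompressedSize are made up for illustration. It decodes everything into a small reusable buffer and keeps only a running byte count:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

// Hypothetical helper; the class and method names are illustrative only.
class GzipExactSize {

    // Decompresses the whole stream, discards the output, and returns
    // the number of uncompressed bytes that were produced.
    static long uncompressedSize(InputStream compressed) throws IOException {
        byte[] buffer = new byte[64 * 1024]; // reused, so memory use stays constant
        long total = 0;
        try (GZIPInputStream gz = new GZIPInputStream(compressed, buffer.length)) {
            int n;
            while ((n = gz.read(buffer)) != -1) {
                total += n; // count the bytes, do not store them
            }
        }
        return total;
    }

    // Usage: java GzipExactSize a.gz
    public static void main(String[] args) throws IOException {
        try (InputStream in = new BufferedInputStream(
                Files.newInputStream(Paths.get(args[0])))) {
            System.out.println(uncompressedSize(in));
        }
    }
}
```

The cost is the time to decode the whole file; the count is a long, so it is not limited to 32 bits.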

There is an unreliable way to determine the uncompressed size, which is to look at the last four bytes of the gzip file. They hold the uncompressed length of that entry modulo 2^32, in little-endian order.

It is unreliable because a) the uncompressed data may be 2^32 bytes or longer, in which case the stored value has wrapped around, and b) the gzip file may consist of multiple gzip streams, in which case you would find the length of only the last of those streams.

If you are in control of the source of the gzip files, you know that they consist of single gzip streams, and you know that they are less than 2^32 bytes uncompressed, then and only then can you use those last four bytes with confidence.
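Under exactly those conditions, reading the last four bytes could look like the following Java sketch. The class name GzipTrailerSize and the method name trailerSize are hypothetical. The 18-byte minimum is only a sanity check (a 10-byte header plus an 8-byte trailer), not a validation of the gzip format:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical helper; the class and method names are illustrative only.
class GzipTrailerSize {

    // Returns the last four bytes of the file read as an unsigned
    // little-endian 32-bit integer: the uncompressed length modulo 2^32
    // of the last gzip stream in the file.
    static long trailerSize(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            if (file.length() < 18) { // header (10) + trailer (8), sanity check only
                throw new IOException("too short to be a gzip file: " + path);
            }
            file.seek(file.length() - 4);
            byte[] b = new byte[4];
            file.readFully(b);
            // Mask each byte to avoid sign extension, then assemble little-endian.
            return (b[0] & 0xFFL)
                    | ((b[1] & 0xFFL) << 8)
                    | ((b[2] & 0xFFL) << 16)
                    | ((b[3] & 0xFFL) << 24);
        }
    }

    // Usage: java GzipTrailerSize a.gz
    public static void main(String[] args) throws IOException {
        System.out.println(trailerSize(args[0]));
    }
}
```

This only seeks to the end of the file, so it is fast regardless of file size, but it will silently return a wrong answer for multi-stream files or for data of 2^32 bytes or more.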

pigz (which can be found at http://zlib.net/pigz/ ) can do it both ways: pigz -l will give you the unreliable length very quickly, and pigz -lt will decode the entire input and give you the reliable lengths.

