Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


Which compression method to use in PHP?

I have a large amount of data to move between two PHP scripts: a command-line script on the client side and another running behind Apache. I POST the data to the server and read the php://input stream to save it on the web-server end. To avoid hitting any memory limits, the data is split into 500 kB chunks, one per POST request. All this works fine.

Now, to save bandwidth and speed things up, I want to compress the data before sending it and decompress it when it is received on the other end. I found 3 pairs of functions that can do the job, but I cannot decide which one to use:

  • gzencode() / gzdecode()
  • gzcompress() / gzuncompress()
  • gzdeflate() / gzinflate()

Which pair of functions would you recommend and why?

UPDATE: I just read zlib FAQ:

The gzip format (gzencode) was designed to retain the directory information about a single file, such as the name and last modification date. The zlib format (gzcompress) on the other hand was designed for in-memory and communication channel applications, and has a much more compact header and trailer and uses a faster integrity check than gzip.


1 Reply


All of these can be used. There are subtle differences between the three:

  • gzencode() uses the GZIP file format, the same as the gzip command line tool. This file format has a header containing optional metadata, DEFLATE compressed data, and a footer containing a CRC32 checksum and a length check.
  • gzcompress() uses the ZLIB format. It has a shorter header serving only to identify the compression format, DEFLATE compressed data, and a footer containing an ADLER32 checksum.
  • gzdeflate() uses the raw DEFLATE algorithm on its own, which is the basis for both of the other formats.
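
As a rough illustration of the three formats listed above, the following sketch sends the same string through each compress/decompress pair and records the compressed sizes. The sample payload and the variable names are made up for the example; only the six zlib functions come from PHP itself.

```php
<?php
// Sample payload (an assumption for this sketch): repetitive text compresses well.
$payload = str_repeat('The quick brown fox jumps over the lazy dog. ', 200);

// Each entry pairs a compression function with its matching decompressor.
$pairs = [
    'gzencode / gzdecode'       => ['gzencode', 'gzdecode'],
    'gzcompress / gzuncompress' => ['gzcompress', 'gzuncompress'],
    'gzdeflate / gzinflate'     => ['gzdeflate', 'gzinflate'],
];

$sizes = [];
$ok = [];
foreach ($pairs as $label => [$compress, $decompress]) {
    $packed = $compress($payload);      // default compression level
    $restored = $decompress($packed);
    $sizes[$label] = strlen($packed);
    $ok[$label] = ($restored === $payload);
    printf(
        "%-26s %5d -> %4d bytes, round trip %s\n",
        $label,
        strlen($payload),
        $sizes[$label],
        $ok[$label] ? 'ok' : 'FAILED'
    );
}
```

The sizes should differ only by the header and footer each format wraps around the DEFLATE data.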

All three use the same DEFLATE algorithm under the hood, so they give the same compression ratio and differ only marginally in speed (the checksum calculation). The GZIP format used by gzencode() can also carry the original file name and other environmental data, but this goes unused when you are just compressing a string. gzencode() and gzcompress() both add a checksum, so the integrity of the data can be verified on decompression, which is useful over unreliable transmission and storage methods. If everything is stored locally and you don't need any additional metadata, then gzdeflate() would suffice. For portability I'd recommend gzencode() (GZIP format), which is probably better supported among other tools than gzcompress() (ZLIB format).
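
Applied to the chunked POST transfer from the question, the checksum means a damaged chunk can be rejected on the server. Here is a minimal sketch, assuming gzencode()/gzdecode() are used; packChunk() and unpackChunk() are hypothetical helper names, and the actual HTTP request is left out.

```php
<?php
// Hypothetical helpers for the chunked transfer described in the question.

/** Compress one chunk before it is sent as a POST body. */
function packChunk(string $raw): string
{
    $packed = gzencode($raw);
    if ($packed === false) {
        throw new RuntimeException('compression failed');
    }
    return $packed;
}

/** Decompress a received POST body; throws if the data is damaged. */
function unpackChunk(string $body): string
{
    // gzdecode() returns false (and raises a warning, silenced here)
    // when the data cannot be decoded or fails its integrity check.
    $raw = @gzdecode($body);
    if ($raw === false) {
        throw new RuntimeException('chunk is corrupt or not gzip data');
    }
    return $raw;
}

// Client side: compress a chunk (sample data, an assumption for this sketch).
$chunk = str_repeat("id,name,value\n", 1000);
$body = packChunk($chunk);

// Server side: in the real script $body would come from
// file_get_contents('php://input').
$received = unpackChunk($body);
var_dump($received === $chunk);

// Simulate damage in transit by flipping one byte of the CRC32 field
// (the last 8 bytes of a gzip stream are the CRC32 and the length).
$damaged = $body;
$pos = strlen($damaged) - 6;
$damaged[$pos] = chr(ord($damaged[$pos]) ^ 0xFF);

try {
    unpackChunk($damaged);
    $detected = false;
} catch (RuntimeException $e) {
    $detected = true;
}
var_dump($detected);
```

With gzdeflate()/gzinflate() there is no checksum, so this kind of damage would not be detected by the format itself.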

When compressing very short strings, the fixed overhead of each method becomes relevant, since it can make up a significant part of the output. The overhead of each method, measured by compressing an empty string, is:

  • gzencode('') = 20 bytes
  • gzcompress('') = 8 bytes
  • gzdeflate('') = 2 bytes
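
The figures above can be checked with a few lines of PHP. This is only a sketch of the measurement; the $overhead array is a name chosen for the example.

```php
<?php
// Compress an empty string with each function; the output length is
// the fixed overhead (header, empty DEFLATE block, footer).
$overhead = [];
foreach (['gzencode', 'gzcompress', 'gzdeflate'] as $fn) {
    $overhead[$fn] = strlen($fn(''));
    printf("%s('') = %d bytes\n", $fn, $overhead[$fn]);
}
```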
