Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
664 views
in Technique[技术] by (71.8m points)

c - zlib, deflate: How much memory to allocate?

I am using zlib to compress a stream of text data. The text data comes in chunks, and for each chunk, deflate() is called, with flush set to Z_NO_FLUSH. Once all chunks have been retrieved, deflate() is called with flush set to Z_FINISH.

Naturally, deflate() doesn't produce compressed output on each call. It internally accumulates data to achieve a high compression rate. And that's fine! Every time deflate() produces compressed output, that output is appended to a database field - a slow process.

However, once deflate() produces compressed data, that data may not fit into the provided output buffer, deflate_out. Therefore several calls to deflate() are required. And that is what I want to avoid:

Is there a way to make deflate_out always large enough so that deflate() can store all the compressed data in it, every times it decides to produce output?

Notes:

  • The total size of the uncompressed data is not known beforehand. As mentioned above, the uncompressed data comes in chunks, and the compressed data is appended to a database field, also in chunks.

  • In the include file zconf.h I have found the following comment. Is that perhaps what I am looking for? I.e. is (1 << (windowBits+2)) + (1 << (memLevel+9)) the maximum size in bytes of compressed data that deflate() may produce?

    /* The memory requirements for deflate are (in bytes):
                (1 << (windowBits+2)) +  (1 << (memLevel+9))
     that is: 128K for windowBits=15  +  128K for memLevel = 8  (default values)
     plus a few kilobytes for small objects. For example, if you want to reduce
     the default memory requirements from 256K to 128K, compile with
         make CFLAGS="-O -DMAX_WBITS=14 -DMAX_MEM_LEVEL=7"
     Of course this will generally degrade compression (there's no free lunch).
    
       The memory requirements for inflate are (in bytes) 1 << windowBits
     that is, 32K for windowBits=15 (default value) plus a few kilobytes
     for small objects.
    */
    
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

deflateBound() is helpful only if you do all of the compression in a single step, or if you force deflate to compress all of the input data currently available to it and emit compressed data for all of that input. You would do that with a flush parameter such as Z_BLOCK, Z_PARTIAL_FLUSH, etc.

If you want to use Z_NO_FLUSH, then it becomes far more difficult as well as inefficient to attempt to predict the largest amount of output deflate() might emit on the next call. You don't know how much of the input was consumed at the time the last burst of compressed data was emitted, so you need to assume almost none of it, with the buffer size growing unnecessarily. However you attempt to estimate the maximum output, you will be doing a lot of unnecessary mallocs or reallocs for no good reason, which is inefficient.

There is no point to avoid calling deflate() for more output. If you simply loop on deflate() until it has no more output for you, then you can use a fixed output buffer malloced once. That is how the deflate() and inflate() interface was designed to be used. You can look at http://zlib.net/zlib_how.html for a well-documented example of how to use the interface.

By the way, there is a deflatePending() function in the latest version of zlib (1.2.6) that lets you know how much output deflate() has waiting to deliver.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...