Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

How can I stream large arrays in gRPC (C#) fast without extreme memory use?

I want to use gRPC to stream large arrays of data (multiple GB). To do this, I send the array in small packages of a few MB each. The problem I'm running into is that both the server and the client allocate a large number of buffers, and the GC is very lazy about cleaning them up. This means that when sending a 2 GB array, both the client and the server may hold the entire array in memory as temporary buffers that have not yet been collected.
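To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes 4-byte floats, a 2 GB array of 500,000,000 elements, and the chunk size of 1,000,000 elements used in the service code further down; `ChunkMath` is a hypothetical helper, not part of gRPC or Protobuf. Each chunk comes out at about 4 MB, far above the 85,000-byte threshold at which .NET places an array on the large object heap, which is only collected during generation 2 collections. That may be one reason the buffers linger, but it is an assumption here and has not been measured.

```csharp
using System;

public static class ChunkMath
{
    // .NET allocates objects of at least this many bytes on the large object heap.
    public const int LohThresholdBytes = 85_000;

    // Number of messages needed to send totalElements in chunks of chunkElements
    // (ceiling division, so a partial last chunk counts as one message).
    public static long ChunkCount(long totalElements, int chunkElements) =>
        (totalElements + chunkElements - 1) / chunkElements;

    // Payload size in bytes of one full chunk of floats.
    public static long ChunkBytes(int chunkElements) =>
        (long)chunkElements * sizeof(float);
}
```

With these assumptions, `ChunkMath.ChunkCount(500_000_000, 1_000_000)` gives 500 messages and `ChunkMath.ChunkBytes(1_000_000)` gives 4,000,000 bytes per message, so if no buffer is reclaimed during the transfer the temporary buffers alone add up to the size of the whole array.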

On the server side, the streaming service looks like the code below. The issue seems to be that the message (DataArrayPart, autogenerated using Protocol Buffers) contains a field Data, which is a repeated field of floats. When writing to this field, there is no way to copy directly into the internal array inside Data other than using the methods Clear and Add. This means that the entire buffer must be cleared and reallocated for each message, and the GC is anything but eager to clean up these buffers.

I can of course write element by element, as in the commented-out section below, but this slows things down quite a bit, by about 30%. There are also a few possible hacks, such as using reflection to access the underlying array in Data (it is a private member) or forcing the GC to run. But there must be a better way, right?
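For reference, the second hack mentioned above (forcing the GC to run) might look like the following sketch, which is separate from the service code below. `GcNudge` and its `interval` parameter are hypothetical names introduced only for illustration; `GC.Collect` with these arguments is a standard .NET call. A forced, blocking full collection pauses the process, so this trades throughput for a lower memory peak and is a workaround rather than a fix.

```csharp
using System;

public static class GcNudge
{
    // Hypothetical helper: call after each sent message with the running count.
    // Every `interval` messages it forces a blocking full collection so that
    // chunk buffers that are no longer referenced can be reclaimed.
    // Returns true when a collection was triggered.
    public static bool CollectEvery(int messagesSent, int interval)
    {
        if (interval <= 0 || messagesSent <= 0 || messagesSent % interval != 0)
            return false;

        GC.Collect(GC.MaxGeneration, GCCollectionMode.Forced, blocking: true);
        return true;
    }
}
```

In a send loop this would be called as `GcNudge.CollectEvery(sent, 50)` after each `WriteAsync`, where 50 is an arbitrary value that would need tuning.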

    public override async Task StreamRead(StreamRequest request, IServerStreamWriter<global::DataArrayPart> responseStream, ServerCallContext context)
    {
        // Note: the variable data holds the float array to be sent.

        int max_size = 1_000_000;
        int n = data.Length;

        float[] arr = new float[max_size];
        var da = new DataArrayPart();
        da.TotalSize = n;

        for (int i = 0; i < n - max_size; i += max_size)
        {
            System.Buffer.BlockCopy(data, 4 * i, arr, 0, 4 * max_size);

            // Fast copy into the message, but the internal buffer is reallocated for every message.
            da.Data.Clear();
            da.Data.Add(arr);

            // Slow element-by-element copy; avoids the excessive memory use.
            //for (int k = 0; k < arr.Length; k++)
            //    da.Data[k] = arr[k];

            await responseStream.WriteAsync(da);
        }

        // Send the remaining elements (at most max_size of them). Not implemented.
    }
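The tail that the code above leaves unimplemented could be handled by iterating over (offset, count) ranges instead of full chunks only. The following is a minimal sketch under stated assumptions: `Chunker`, `Ranges` and `Fill` are hypothetical names, and the sketch does nothing about the buffer allocations inside Data. It uses `Array.Copy`, which takes element indices, so it also sidesteps the byte offset `4 * i`, which overflows an `int` once `i` reaches 2^29 (about 537 million elements).

```csharp
using System;
using System.Collections.Generic;

public static class Chunker
{
    // Yields (offset, count) pairs covering [0, total) in order. Every chunk has
    // maxSize elements except possibly the last, which holds the remainder.
    public static IEnumerable<(int Offset, int Count)> Ranges(int total, int maxSize)
    {
        if (maxSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxSize));

        // long loop variable so that offset + maxSize cannot overflow near int.MaxValue.
        for (long offset = 0; offset < total; offset += maxSize)
            yield return ((int)offset, (int)Math.Min(maxSize, total - offset));
    }

    // Copies count elements of source, starting at offset, into the start of buffer
    // and returns a view over just the filled part of buffer.
    public static ArraySegment<float> Fill(float[] source, float[] buffer, int offset, int count)
    {
        Array.Copy(source, offset, buffer, 0, count);
        return new ArraySegment<float>(buffer, 0, count);
    }
}
```

Inside StreamRead the loop would then become `foreach (var (offset, count) in Chunker.Ranges(n, max_size))`, with the result of `Chunker.Fill(data, arr, offset, count)` passed to `da.Data.Add` after `da.Data.Clear()`. This assumes that `Add` accepts any `IEnumerable<float>`, as it does for the `float[]` in the original code.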
Question from: https://stackoverflow.com/questions/65560253/how-can-i-stream-large-arrays-in-grpc-c-fast-without-extreme-memory-use
