Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
539 views
in Technique[技术] by (71.8m points)

serialization - SerializationException when serializing lots of objects in .NET

I'm running into problems serializing lots of objects in .NET. The object graph is pretty big with some of the new data sets being used, so I'm getting:

System.Runtime.Serialization.SerializationException
"The internal array cannot expand to greater than Int32.MaxValue elements."

Has anyone else hit this limit? How have you solved it?

It would be good if I can still use the built in serialization mechanism if possible, but it seems like have to just roll my own (and maintain backwards compatibility with the existing data files)

The objects are all POCO and are being serialized using BinaryFormatter. Each object being serialized implements ISerializable to selectively serialize its members (some of them are recalculated during loading).

It looks like this an open issue for MS (details here), but it's been resolved as Wont Fix. The details are (from the link):

Binary serialization fails for object graphs with more than ~13.2 million objects. The attempt to do so causes an exception in ObjectIDGenerator.Rehash with a misleading error message referencing Int32.MaxValue.

Upon examination of ObjectIDGenerator.cs in the SSCLI source code, it appears that larger object graphs could be handled by adding additional entries into the sizes array. See the following lines:

// Table of prime numbers to use as hash table sizes. Each entry is the
// smallest prime number larger than twice the previous entry.
private static readonly int[] sizes = {5, 11, 29, 47, 97, 197, 397,
797, 1597, 3203, 6421, 12853, 25717, 51437, 102877, 205759, 
411527, 823117, 1646237, 3292489, 6584983};

However, it would be nice if serialization worked for any reasonable size of the object graph.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I tried reproducing the problem, but the code just takes forever to run even when each of the 13+ million objects is only 2 bytes. So I suspect you could not only fix the problem, but also significantly improve performance if you pack your data a little better in your custom ISerialize implementations. Don't let the serializer see so deep into your structure, but cut it off at the point where your object graph blows up into hundreds of thousands of array elements or more (because presumably if you have that many objects, they're pretty small or you wouldn't be able to hold them in memory anyway). Take this example, which allows the serializer to see classes B and C, but manually manages the collection of class A:

class Program
{
    static void Main(string[] args)
    {
        C c = new C(8, 2000000);
        System.Runtime.Serialization.Formatters.Binary.BinaryFormatter bf = new System.Runtime.Serialization.Formatters.Binary.BinaryFormatter();
        System.IO.MemoryStream ms = new System.IO.MemoryStream();
        bf.Serialize(ms, c);
        ms.Seek(0, System.IO.SeekOrigin.Begin);
        for (int i = 0; i < 3; i++)
            for (int j = i; j < i + 3; j++)
                Console.WriteLine("{0}, {1}", c.all[i][j].b1, c.all[i][j].b2);
        Console.WriteLine("=====");
        c = null;
        c = (C)(bf.Deserialize(ms));
        for (int i = 0; i < 3; i++)
            for (int j = i; j < i + 3; j++)
                Console.WriteLine("{0}, {1}", c.all[i][j].b1, c.all[i][j].b2);
        Console.WriteLine("=====");
    }
}

class A
{
    byte dataByte1;
    byte dataByte2;
    public A(byte b1, byte b2)
    {
        dataByte1 = b1;
        dataByte2 = b2;
    }

    public UInt16 GetAllData()
    {
        return (UInt16)((dataByte1 << 8) | dataByte2);
    }

    public A(UInt16 allData)
    {
        dataByte1 = (byte)(allData >> 8);
        dataByte2 = (byte)(allData & 0xff);
    }

    public byte b1
    {
        get
        {
            return dataByte1;
        }
    }

    public byte b2
    {
        get
        {
            return dataByte2;
        }
    }
}

[Serializable()]
class B : System.Runtime.Serialization.ISerializable
{
    string name;
    List<A> myList;

    public B(int size)
    {
        myList = new List<A>(size);

        for (int i = 0; i < size; i++)
        {
            myList.Add(new A((byte)(i % 255), (byte)((i + 1) % 255)));
        }
        name = "List of " + size.ToString();
    }

    public A this[int index]
    {
        get
        {
            return myList[index];
        }
    }

    #region ISerializable Members

    public void GetObjectData(System.Runtime.Serialization.SerializationInfo info, System.Runtime.Serialization.StreamingContext context)
    {
        UInt16[] packed = new UInt16[myList.Count];
        info.AddValue("name", name);
        for (int i = 0; i < myList.Count; i++)
        {
            packed[i] = myList[i].GetAllData();
        }
        info.AddValue("packedData", packed);
    }

    protected B(System.Runtime.Serialization.SerializationInfo info, System.Runtime.Serialization.StreamingContext context)
    {
        name = info.GetString("name");
        UInt16[] packed = (UInt16[])(info.GetValue("packedData", typeof(UInt16[])));
        myList = new List<A>(packed.Length);
        for (int i = 0; i < packed.Length; i++)
            myList.Add(new A(packed[i]));
    }

    #endregion
}

[Serializable()]
class C
{
    public List<B> all;
    public C(int count, int size)
    {
        all = new List<B>(count);
        for (int i = 0; i < count; i++)
        {
            all.Add(new B(size));
        }
    }
}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...