Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


java - Deserializing an array that contains some non-deserializable objects using Kryo (salvaging the deserializable parts)

Background

I am attempting to write a Kryo deserialization routine such that, if an array of objects contains some objects that can no longer be deserialized (due to a code change), those references in the array become null rather than an exception being thrown, allowing the remainder of the object to be salvaged. I previously used Java's built-in serialization, where I achieved this by writing a "known good" integer between each item in the array and, if an error occurred, searching the stream for that integer to find the start of the next object. This is detailed in the question Deserializing an array that contains some non-deserializable objects (salvaging the deserializable parts).
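The marker-based resynchronisation described above can be sketched without any serialization library. The following is only an illustration using java.nio, not the code from the linked question: the class name MarkerScan and its methods are hypothetical, and the records are plain text so that the marker bytes cannot occur inside them by accident.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class MarkerScan {
    // Same value as END_OF_APPLE_MAGIC in the listing below; any int that is
    // unlikely to occur inside a record would do.
    static final int MAGIC = 1234467895;

    // Returns the index just past the first MAGIC found at or after 'from',
    // or -1 if the marker does not occur in the rest of the data.
    static int indexAfterMarker(byte[] data, int from) {
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian by default
        for (int i = from; i + 4 <= data.length; i++) {
            if (buf.getInt(i) == MAGIC) {
                return i + 4;
            }
        }
        return -1;
    }

    // Builds two text records, each followed by the 4-byte marker.
    static byte[] sampleStream() {
        byte[] first = "first record".getBytes(StandardCharsets.UTF_8);
        byte[] second = "second record".getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(first.length + second.length + 8);
        buf.put(first).putInt(MAGIC).put(second).putInt(MAGIC);
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] data = sampleStream();
        // Pretend that reading the first record failed at byte 3:
        // scan forward to find where the second record begins.
        System.out.println(indexAfterMarker(data, 3)); // prints 16
    }
}
```

The scan only works when the failed read has not already consumed bytes beyond the marker, which is the assumption that matters for the rest of this question.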

I have now moved to Kryo serialization for efficiency and have attempted to recreate this approach. Within Kryo, however, the error recovery seems to work once, after which it no longer recovers correctly.

What I've tried

I have attempted to write a known good integer (END_OF_APPLE_MAGIC) between each object in the array of Apples during serialization. During deserialization, when a BadApple is found that cannot be deserialized, it is replaced by an ErrorApple (the analogy is getting weak) and the stream is searched for END_OF_APPLE_MAGIC to find where the next apple starts. This works if there is a single BadApple in the array and it is not the first entry, but it fails in assorted ways (see Detailed Analysis) if the array contains more than one BadApple or if the first Apple is a BadApple.

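import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.KryoSerializable;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class AppleHolder implements Serializable,KryoSerializable{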
    static int END_OF_APPLE_MAGIC=1234467895; //if this just "turns up" in the stream we will have a bad day; however this is only the case in recovery mode, so is an acceptable risk

    int numberOfApples=6;
    Apple[] apples=new Apple[numberOfApples];
    double otherData=15;

    //these are just for debug
    int dividers=0; //counts number of times END_OF_APPLE_MAGIC is found
    int problems=0; //counts number of times an apple fails to load
    int badIntegers=0; //counts number of times END_OF_APPLE_MAGIC is looked for and a different Integer is found (I have never seen this happen)

    public AppleHolder(){
        Apple goodApple=new Apple("GoodApple","tastyGood");
        BadApple badApple=new BadApple("BadApple","untastyBad");

        apples[0]=goodApple;
        apples[1]=badApple;
        apples[2]=goodApple;
        apples[3]=goodApple; // multiple references to same object intentional
        apples[4]=goodApple;
        apples[5]=goodApple;


    }


    public void write (Kryo kryo, Output output) {
        for(int i=0;i<apples.length;i++){

            //kryo.writeObject(output, apples[i]);
            kryo.writeClassAndObject(output, apples[i]);
            kryo.writeClassAndObject(output, END_OF_APPLE_MAGIC);
        }

        kryo.writeObject(output,otherData);
    }


    public void read (Kryo kryo, Input input) {
        try{

            apples =new Apple[numberOfApples];
            for(int i=0;i<apples.length;i++){

                try{
                    Object ob=kryo.readClassAndObject(input);
                    apples[i]=(Apple)ob;
                }catch(Exception e){
                    apples[i]=new ErrorApple();
                    problems++;
                }

                //Search for the next Apple boundary (except in recovery mode
                //it will be the next entry)
                boolean atBoundary=false;
                while (atBoundary==false){ //should probably put a limit on this just in case
                    try{
                        int appleMagic =(Integer)kryo.readClassAndObject(input);
                        if (appleMagic == END_OF_APPLE_MAGIC){
                            atBoundary=true;
                            dividers++;
                        }else{
                            badIntegers++;
                        }
                    }catch(Exception e){
                        //painful byte reading mode only entered in recovery mode; under good 
                        //situations it does not represent an efficiency problem
                        input.skip(1); //consume byte of bad input
                        //Where buffer underflow exceptions occur they occur here
                    }
                }


            }
            otherData = kryo.readObject(input, Double.class);

        }catch(Exception e){
            //something went wrong (always a Buffer underflow so far), print what we have


            for(int i=0;i<apples.length;i++){
                System.out.println(apples[i]);

            }

            throw e;
        }

    }


    public static void main(String[] args)
            throws Exception {

        /*
         * (1) First run serialize()
         * (2) Rename/delete badApple such that it cannot be found for deserialization
         * (3) Run deSerialize()
         */


        serialize();


        //deSerialize();

    }

    public static void serialize() throws Exception{
        AppleHolder testWrite = new AppleHolder();
        /*FileOutputStream fos = new FileOutputStream("testfile");
        ObjectOutputStream oos = new ObjectOutputStream(fos);
        oos.writeObject(testWrite);
        oos.flush();
        oos.close();
        */

        Kryo kryo = new Kryo();
        Output output = new Output(new FileOutputStream("testfile"));
        kryo.writeObject(output, testWrite);
        output.close();

    }

    public static void deSerialize() throws Exception{
        /*AppleHolder testRead;
        FileInputStream fis = new FileInputStream("testfile");
        ObjectInputStream ois = new ObjectInputStream(fis);
        testRead = (AppleHolder) ois.readObject();
        ois.close();
        */

        Kryo kryo = new Kryo();
        Input input = new Input(new FileInputStream("testfile"));
        AppleHolder testRead = kryo.readObject(input, AppleHolder.class);
        input.close();

        for(int i=0;i<testRead.apples.length;i++){
            System.out.println(testRead.apples[i]);

        }

        System.out.println("otherData: " + testRead.otherData);

    }

}

public class Apple implements Serializable {
    private String propertyOne;
    private String propertyTwo;

    public Apple(){}

    public Apple(String propertyOne, String propertyTwo) {
        this.propertyOne = propertyOne;
        this.propertyTwo = propertyTwo;
        validate();
    }

    private void writeObject(ObjectOutputStream o)
            throws IOException {

        o.writeObject(propertyOne);
        o.writeObject(propertyTwo);
    }

    private void readObject(ObjectInputStream o)
            throws IOException, ClassNotFoundException {

        propertyOne = (String) o.readObject();
        propertyTwo = (String) o.readObject();
        validate();
    }

    private void validate(){
        if(propertyOne == null ||
                propertyOne.length() == 0 ||
                propertyTwo == null ||
                propertyTwo.length() == 0){

            throw new IllegalArgumentException();
        }
    }

    public String getPropertyOne() {
        return propertyOne;
    }

    public String getPropertyTwo() {
        return propertyTwo;
    }

    @Override
    public String toString() {
        return "goodApple";
    }



}

public class BadApple extends Apple {


    public BadApple(){}

    public BadApple(String propertyOne, String propertyTwo) {
        super(propertyOne, propertyTwo);
    }

    @Override
    public String toString() {
        return "badApple";
    }
}

public class ErrorApple extends Apple {


    public ErrorApple(){}

    public ErrorApple(String propertyOne, String propertyTwo) {
        super(propertyOne, propertyTwo);
    }

    @Override
    public String toString() {
        return "errorApple";
    }

}

Question

How can I salvage a Kryo-serialized array in which only some of the objects are deserializable, so that I get an array with ErrorApple entries in place of the non-deserializable parts? My array contains several references to the same object, and it is essential that this is preserved by the deserialization process.

So going into serialization I have

[GoodApple]
[GoodApple]  
[GoodApple]  
[BadApple]
[BadApple]
[GoodApple] 

And coming out of deserialization I want (because BadApple has changed and cannot be deserialised):

[GoodApple]
[GoodApple]  
[GoodApple]  
[ErrorApple]
[ErrorApple]
[GoodApple]  

I want this to provide a fallback for cases where backwards compatibility cannot be achieved, or where a previously installed third-party modification to my program has been removed.

Detailed Analysis

This section outlines the ways in which the existing program fails.

In general

  • A single BadApple anywhere other than the first position in the array is handled correctly.
  • A BadApple at the first position leads to the next Apple being read correctly, followed by ErrorApples from then on (even for good Apples).
  • Where there is more than one BadApple, the first good Apple after the second BadApple is read correctly but may be shifted forwards in the array, followed by ErrorApples from then on (even for good Apples). A KryoException: Buffer underflow is thrown, and there may be null entries at the end of the array.

The inputs and outputs I used are shown below:

    In          Out
    [goodApple] [goodApple]  
    [goodApple] [goodApple]  
    [badApple]  [badApple]  
    [goodApple] [goodApple]  
    [goodApple] [goodApple]  
    [goodApple] [goodApple]  

    In          Out
    [badApple]  [errorApple]  
    [goodApple] [goodApple]  
    [goodApple] [errorApple]  
    [goodApple] [errorApple]  
    [goodApple] [errorApple]  
    [goodApple] [errorApple]  

    In          Out
    [goodApple] [goodApple]  
    [badApple]  [errorApple]  
    [goodApple] [goodApple]  
    [badApple]  [errorApple]  
    [goodApple] [goodApple]  
    [goodApple] [errorApple]  
    KryoException: Buffer underflow (occurs at input.skip(1);)

    In          Out
    [goodApple] [goodApple]  
    [goodApple] [goodApple]  
    [badApple]  [errorApple]  
    [badApple]  [errorApple]  
    [goodApple] [goodApple]  
    [goodApple] [errorApple]  
    KryoException: Buffer underflow (occurs at input.skip(1);)

    In          Out
    [goodApple] [goodApple]  
    [badApple]  [errorApple]  
    [badApple]  [errorApple]  
    [badApple]  [goodApple]  
    [goodApple] [errorApple]  
    [goodApple] [null]  
    KryoException: Buffer underflow (occurs at input.skip(1);)


1 Reply


One solution I found was to write a second, "recovery" serialized file that just records the length of the main file after each item is added to it. Then, if there is a problem deserializing the data, the program knows where it can find the "next good" value.
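The bookkeeping behind such a recovery track can be illustrated independently of Kryo. The sketch below uses only the standard library and hypothetical names (OffsetTrackDemo, writeWithTrack); it records the running byte count before each record and once more after the last one, which is the role output.total() plays in the Kryo program further down.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class OffsetTrackDemo {
    // Appends every record to 'out' and returns the recovery track:
    // track[i] is the offset where record i starts, and the final entry
    // is the offset just past the last record.
    static int[] writeWithTrack(String[] records, ByteArrayOutputStream out) {
        int[] track = new int[records.length + 1];
        for (int i = 0; i < records.length; i++) {
            track[i] = out.size();
            byte[] encoded = records[i].getBytes(StandardCharsets.UTF_8);
            out.write(encoded, 0, encoded.length);
        }
        track[records.length] = out.size();
        return track;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int[] track = writeWithTrack(new String[] {"apple", "kiwi", "fig"}, out);
        System.out.println(Arrays.toString(track)); // prints [0, 5, 9, 12]
    }
}
```

With one more entry than there are records, the length of record i is simply track[i + 1] - track[i], so no marker has to be searched for in the data itself.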

However, there is a catch: once you have read a byte you can't reread it without starting all over again (.reset() throws an UnsupportedOperationException), and sometimes Kryo starts reading into the next good object before it realises it is choking on a bad one. My solution was to use the data in the separate file to work out how many bytes belong to each object, read them as raw bytes, and then pass those bytes to Kryo to deserialize.
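The reason for copying each object's bytes out first can be shown with a toy decoder in place of Kryo. In this sketch (hypothetical names SliceRecovery and readAll, and an assumed record format in which each record is the decimal text of an integer), every record is decoded from its own copy of the bytes, so a failing record cannot consume anything that belongs to its neighbour.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class SliceRecovery {
    // Decodes each record from an isolated slice; track[i] is the start of
    // record i and track[i + 1] its end, as recorded at write time.
    static Integer[] readAll(byte[] data, int[] track) {
        Integer[] result = new Integer[track.length - 1];
        for (int i = 0; i < result.length; i++) {
            byte[] slice = Arrays.copyOfRange(data, track[i], track[i + 1]);
            try {
                result[i] = Integer.valueOf(new String(slice, StandardCharsets.UTF_8));
            } catch (NumberFormatException e) {
                // Undecodable record: leave the entry null and carry on.
            }
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] data = "12oops7".getBytes(StandardCharsets.UTF_8);
        int[] track = {0, 2, 6, 7}; // record boundaries: "12", "oops", "7"
        System.out.println(Arrays.toString(readAll(data, track))); // prints [12, null, 7]
    }
}
```

The cost is one extra copy per record, which is why this style of reading is a candidate for a recovery path rather than the normal one.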

All this has some overhead, but in my tests the whole thing is still far faster than standard Java deserialization, and it could be reserved for a "recovery mode".

This is demonstrated by the following program which leaves bad values as nulls (but could do anything it wanted).

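import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.Serializable;
import java.util.logging.Level;
import java.util.logging.Logger;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.KryoSerializable;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;

public class AppleHolder implements Serializable,KryoSerializable{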
    int numberOfApples=10;
    Apple[] apples=new Apple[numberOfApples];
    double otherData=15;


    public AppleHolder(){

        BadApple sharedBad=new BadApple(0);

        for(int i=0;i<numberOfApples;i++){
            if (i>3 && i<=6){
                apples[i]=new Apple(i);
            }else{
                apples[i]=new BadApple();
            }
        }

    }


    public void write (Kryo kryo, Output output) {
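        int[] recoveryTrack=new int[apples.length+1]; //last one for after the last entry

        for(int i=0;i<apples.length;i++){
            //position where apple i starts; the cast is needed in Kryo versions where total() returns long
            recoveryTrack[i]=(int)output.total();
            kryo.writeClassAndObject(output, apples[i]);
        }
        recoveryTrack[recoveryTrack.length-1]=(int)output.total(); //position just after the last apple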

        kryo.writeObject(output,otherData);

        Output outputRecovery;
        try {
            outputRecovery = new Output(new FileOutputStream("testfile.recovery"));
            kryo.writeObject(outputRecovery, recoveryTrack);
            outputRecovery.close();
        } catch (FileNotFoundException ex) {
            //I guess hopefully we won't need the recovery track
            Logger.getLogger(AppleHolder.class.getName()).log(Level.SEVERE, null, ex);
        }


    }


    public void read (Kryo kryo, Input input) {

        //load the byte positions that write() recorded for each apple
        int[] readRecoveryTrack;
        try {
            Kryo kryoRecovery = new Kryo();
            Input inputRecovery = new Input(new FileInputStream("testfile.recovery"));
            readRecoveryTrack =kryoRecovery.readObject(inputRecovery, int[].class);
            inputRecovery.close();
        } catch (FileNotFoundException ex) {
            //without the recovery track the apple boundaries are unknown
            Logger.getLogger(AppleHolder.class.getName()).log(Level.SEVERE, null, ex);
            throw new IllegalStateException("Recovery track not found", ex);
        }

        apples=new Apple[numberOfApples];

        for(int j=0;j<apples.length;j++){

            //number of bytes that write() produced for apple j
            int desiredBytes=readRecoveryTrack[j+1]-readRecoveryTrack[j];

            //take exactly those bytes from the main stream so that a failed
            //read cannot consume bytes belonging to the next apple
            byte[] bytes=input.readBytes(desiredBytes);

            ByteArrayInputStream byteStream =new  ByteArrayInputStream(bytes);

            try{
                apples[j]=(Apple)kryo.readClassAndObject(new Input(byteStream));
            }catch(Exception e){
                //don't care, leave null
            }

        }

        //write() stores otherData after the apples, so read it back here
        otherData = kryo.readObject(input, Double.class);

    }

    public static void main(String[] args)
            throws Exception {

        /*
         * (1) First run serialize()
         * (2) Rename/delete badApple such that it cannot be found for deserialization
         * (3) Run deSerialize()
         */


        serialize();


        //deSerialize();

    }

    public static void serialize() throws Exception{
        AppleHolder testWrite = new AppleHolder();


        Kryo kryo = new Kryo();
        Output output = new Output(new FileOutputStream("testfile"));
        kryo.writeObject(output, testWrite);
        output.close();
    }

    public static void deSerialize() throws Exception{

        Kryo kryo = new Kryo();
        Input input = new Input(new FileInputStream("testfile"));
        AppleHolder testRead = kryo.readObject(input, AppleHolder.class);
        input.close();


        for(int i=0;i<testRead.apples.length;i++){
            System.out.println(testRead.apples[i]);

        }

        System.out.println(testRead.otherData);

    }
}

public class Apple implements Serializable {
    protected int index;

    public Apple(){}

    public Apple(int index) {
        this.index = index;
    }


    @Override
    public String toString() {
        return "goodApple " + index;
    }



}

public class BadApple extends Apple {

    private static final long serialVersionUID = 7;

    public BadApple(){}

    public BadApple(int index){
        super(index);
    }


    @Override
    public String toString() {
        return "badApple " + index;
    }
}
