Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


arrays - How to deserialize dodgy JSON (with improperly quoted strings, and missing brackets)?

I have to parse (and ultimately reserialize) some dodgy JSON. It looks like this:

{
  name: "xyz",
  id: "29573f59-85fb-4d06-9905-01a3acb2cdbd",
  status: "astatus",
  color: colors["Open"]
},
{
  name: "abc",
  id: "29573f59-85fb-4d06-9905-01a3acb2cdbd",
  status: "astatus",
  color: colors["Open"]
}

There are a number of problems here - starting with the most severe.

  1. color: colors["Open"]

    WTF even is that? If I drop 'colors' then I can get an array of strings out, but I can't find a way to make the original form work out of the box.

  2. It is an array without square brackets. I can fix this by wrapping the text in them. But is there a way to support this out of the box?

  3. Property names have no quotes. Deserializing these is fine, but reserializing them without quotes is just no dice.

Any suggestions for handling this structure in both directions (reading and writing)?


He who fights dragons for too long becomes a dragon himself; gaze too long into the abyss, and the abyss gazes back…

1 Reply


Answering your questions #1 - #3 in order:

  1. Json.NET does not support reading dodgy property values in the form colors["Open"] (which, as you correctly note, violates the JSON standard).

    Instead, you will need to manually fix these values, e.g. through some sort of Regex:

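    // Escape the square brackets so they match literally rather than opening a character class.
    var regex = new Regex(@"(colors\[)(.*)(\])");
    // Wrap each match in double quotes and backslash-escape the quotes inside it.
    var fixedJsonString = regex.Replace(jsonString, 
        m => string.Format(@"""{0}{1}{2}""", m.Groups[1].Value, m.Groups[2].Value.Replace("\"", "\\\""), m.Groups[3].Value));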
    

    This changes the color property values into properly escaped JSON strings:

    color: "colors[\"Open\"]"
    

    Json.NET does, however, have the capability to write dodgy property values by calling JsonWriter.WriteRawValue() from within a custom JsonConverter.

    Define the following converter:

    public class RawStringConverter : JsonConverter
    {
        public override bool CanConvert(Type objectType)
        {
            return objectType == typeof(string);
        }
    
        public override bool CanRead { get { return false; } }
    
        public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
        {
            throw new NotImplementedException();
        }
    
        public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
        {
            var s = (string)value;
            writer.WriteRawValue(s);
        }
    }
    

    Then define your RootObject as follows:

    public class RootObject
    {
        public string name { get; set; }
        public string id { get; set; }
        public string status { get; set; }
    
        [JsonConverter(typeof(RawStringConverter))]
        public string color { get; set; }
    }
    

    Then, when re-serialized, you will get the original dodgy values in your JSON.

  2. Support for deserializing comma-delimited JSON without outer brackets will be in the next release of Json.NET after 10.0.3; see Issue 1396 and Issue 1355 for details. You will need to set JsonTextReader.SupportMultipleContent = true to make it work.

    In the meantime, as a workaround, you could grab ChainedTextReader and the extension method public static TextReader Extensions.Concat(this TextReader first, TextReader second) from Rex M's answer to How to string multiple TextReaders together?, and use them to surround your JSON with the brackets [ and ].

    Thus you would deserialize your JSON as follows:

    List<RootObject> list;
    using (var reader = new StringReader("[").Concat(new StringReader(fixedJsonString)).Concat(new StringReader("]")))
    using (var jsonReader = new JsonTextReader(reader))
    {
        list = JsonSerializer.CreateDefault().Deserialize<List<RootObject>>(jsonReader);
    }
    

    (Or you could just manually surround your JSON string with [ and ], but I prefer solutions that don't involve copying possibly large strings.)

    Re-serializing a root collection without outer brackets is possible if you serialize each item individually using its own JsonTextWriter with CloseOutput = false. You then need to manually write a , between each serialized item to the underlying TextWriter shared by every JsonTextWriter.

  3. Serializing JSON property names without a surrounding quote character is possible if you set JsonTextWriter.QuoteName = false.

    Thus, to re-serialize your List<RootObject> without quoted property names or outer brackets, do:

    var sb = new StringBuilder();
    bool first = true;
    using (var textWriter = new StringWriter(sb))
    {
        foreach (var item in list)
        {
            if (!first)
            {
                textWriter.WriteLine(",");
            }
            first = false;
            using (var jsonWriter = new JsonTextWriter(textWriter) { QuoteName = false, Formatting = Formatting.Indented, CloseOutput = false })
            {
                JsonSerializer.CreateDefault().Serialize(jsonWriter, item);
            }
        }
    }
    
    var reserializedJson = sb.ToString();
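
For point 1, the regex repair can be wrapped in a small helper and checked in isolation. The following is a minimal sketch, assuming each line of input contains at most one colors[...] value; the names DodgyJsonFixer and QuoteColorValues are illustrative and do not come from the answer above:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (names are illustrative, not part of Json.NET).
public static class DodgyJsonFixer
{
    // \[ and \] match literal brackets; (.*) captures whatever sits between
    // them on the same line ('.' does not match a newline by default).
    private static readonly Regex ColorsRegex = new Regex(@"(colors\[)(.*)(\])");

    // Turns   colors["Open"]   into the JSON string   "colors[\"Open\"]".
    public static string QuoteColorValues(string json)
    {
        return ColorsRegex.Replace(json, m =>
            "\""
            + m.Groups[1].Value
            + m.Groups[2].Value.Replace("\"", "\\\"") // escape the inner quotes
            + m.Groups[3].Value
            + "\"");
    }
}
```

Passing the text color: colors["Open"] to QuoteColorValues should yield color: "colors[\"Open\"]", which Json.NET can then read as an ordinary string value.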
    

Sample .NET fiddle showing all this in action.
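
Point 2 relies on ChainedTextReader and Concat from a separate answer that is not reproduced here. As a rough stand-in, here is a minimal sketch of how such helpers could look. It is not Rex M's code: the implementation below is an assumption, and it further assumes that each inner reader signals exhaustion by returning -1 from Read() and 0 from Read(char[], int, int), as StringReader does.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for the ChainedTextReader mentioned in point 2.
public sealed class ChainedTextReader : TextReader
{
    private readonly Queue<TextReader> readers;

    public ChainedTextReader(params TextReader[] readers)
    {
        this.readers = new Queue<TextReader>(readers);
    }

    // Non-destructive: returns the next character of the first reader that has one.
    public override int Peek()
    {
        foreach (var reader in readers)
        {
            int c = reader.Peek();
            if (c != -1) return c;
        }
        return -1;
    }

    public override int Read()
    {
        while (readers.Count > 0)
        {
            int c = readers.Peek().Read();
            if (c != -1) return c;
            readers.Dequeue().Dispose(); // current reader is exhausted, move on
        }
        return -1;
    }

    public override int Read(char[] buffer, int index, int count)
    {
        if (count == 0) return 0;
        while (readers.Count > 0)
        {
            int n = readers.Peek().Read(buffer, index, count);
            if (n > 0) return n;
            readers.Dequeue().Dispose(); // current reader is exhausted, move on
        }
        return 0;
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            while (readers.Count > 0) readers.Dequeue().Dispose();
        }
        base.Dispose(disposing);
    }
}

public static class Extensions
{
    // Reads everything from 'first', then everything from 'second'.
    public static TextReader Concat(this TextReader first, TextReader second)
    {
        return new ChainedTextReader(first, second);
    }
}
```

With these in place, new StringReader("[").Concat(new StringReader(fixedJsonString)).Concat(new StringReader("]")) streams the bracketed text to JsonTextReader without building one large concatenated string.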

