Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - combine duplicate keys in json

I have a JSON document that looks like this:

{
  "course1": [
    {
      "courseName": "test",
      "section": "123",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "456",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "789",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "1011",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course3": [
    {
      "courseName": "test",
      "section": "1213",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course3": [
    {
      "courseName": "test",
      "section": "1415",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ]
}

I want to combine all the blocks/objects/lists (I don't know what they are called) that share the same key, like this:

{
  "course1": [
    {
      "courseName": "test",
      "section": "123",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course2": [
    {
      "courseName": "test",
      "section": "456",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    },
    {
      "courseName": "test",
      "section": "789",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    },
    {
      "courseName": "test",
      "section": "1011",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ],
  "course3": [
    {
      "courseName": "test",
      "section": "1213",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    },
    {
      "courseName": "test",
      "section": "1415",
      "academicHours": "3",
      "day1": "1",
      "room1": "0145 03 1 B 015"
    }
  ]
}

How can I do this using a regular expression in Python, or with any regular expression query?

Also, I tried to use json.dumps() and work my way from there, but for some reason, when I use it with any JSON that contains Arabic characters, it freaks out and messes up the whole thing. So I'm stuck with regular expressions, unfortunately.

Thank you for your help :)


1 Reply


The standard library's json module offers a hook, object_pairs_hook, that lets you decode objects with duplicate keys, so no regular expression is needed. This simple "extend" hook should work for your example data:

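import json

def myhook(pairs):
    # pairs holds the (key, value) tuples of one JSON object, in document order
    d = {}
    for k, v in pairs:
        if k not in d:
            d[k] = v
        else:
            d[k] += v  # duplicate key: extend the list already stored
    return d

# bad_json is the JSON text (a str) that contains the duplicate keys
mydata = json.loads(bad_json, object_pairs_hook=myhook)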
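To make that concrete, here is a small end-to-end sketch of the same idea. The sample string `raw`, the Arabic course name, and the function name `merge_duplicates` are made up for illustration. The `ensure_ascii=False` argument is my guess at the Arabic problem from the question: by default json.dumps() escapes non-ASCII characters as \uXXXX sequences, which may be what looked "messed up".

```python
import json

def merge_duplicates(pairs):
    """object_pairs_hook: concatenate list values that share a key."""
    merged = {}
    for key, value in pairs:
        both_lists = isinstance(merged.get(key), list) and isinstance(value, list)
        if key in merged and both_lists:
            # Duplicate key with list values: join the lists in document order.
            merged[key] = merged[key] + value
        else:
            # First occurrence (or a non-list value): plain assignment.
            merged[key] = value
    return merged

# Hypothetical input, shortened from the structure in the question.
raw = '''
{
  "course1": [{"courseName": "اختبار", "section": "123"}],
  "course2": [{"courseName": "test", "section": "456"}],
  "course2": [{"courseName": "test", "section": "789"}]
}
'''

data = json.loads(raw, object_pairs_hook=merge_duplicates)

# ensure_ascii=False writes Arabic text as-is instead of \uXXXX escapes.
text = json.dumps(data, ensure_ascii=False, indent=2)
print(text)
```

The hook is called once for every object in the document, including the nested ones, so the isinstance check keeps it from trying to concatenate non-list values. If you write the result to a file, opening it with encoding="utf-8" should keep the Arabic text intact.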

Although there's nothing in the JSON specification to disallow duplicate keys, they SHOULD probably be avoided in the first place:

1.1. Conventions Used in This Document

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

...

4. Objects

An object structure is represented as a pair of curly brackets surrounding zero or more name/value pairs (or members). A name is a string. A single colon comes after each name, separating the name from the value. A single comma separates a value from a following name. The names within an object SHOULD be unique.
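If you would rather enforce that "SHOULD" on your own input than merge silently, the same object_pairs_hook mechanism can reject duplicates. This is only a sketch. The names `reject_duplicates` and `load_strict` are hypothetical, and raising ValueError is my choice, not something the json module requires.

```python
import json

def reject_duplicates(pairs):
    """object_pairs_hook: raise if a key repeats within one JSON object."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key!r}")
        seen[key] = value
    return seen

def load_strict(text):
    # Parse JSON text, failing loudly on duplicate keys instead of
    # silently keeping the last value as json.loads does by default.
    return json.loads(text, object_pairs_hook=reject_duplicates)
```

Calling load_strict('{"a": 1, "a": 2}') raises ValueError, while a document with unique keys parses as usual.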

