Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Pandas json_normalize with timestamps as keys

I am trying to read this JSON data:

{
    "values": [
        {
            "1510122047": [
                35.7,
                256
            ]
        },
        {
            "1510125000": [
                41.7,
                7
            ]
        },
        {
            "1510129000": [
                31.7,
                0
            ]
        }
    ]
}

and normalize it into a pandas data frame of this format:

[Image: target data frame]

I tried json_normalize but was not able to get the result I need. The loop below is what I ended up with; it works, but it is not very efficient. I would like to find a solution that uses pandas' built-in functions instead. I'd appreciate ideas!

import pandas
import json

s = """
{"values": [
            {
              "1510122047": [35.7, 256]
            },
            {
              "1510125000": [41.7, 7]
            },
            {
              "1510129000": [31.7, 0]
            }
          ]}
"""

data = json.loads(s)

normalized_data = []
for value in data['values']:
    timestamp = list(value.keys())[0]
    normalized_data.append({'timestamp':timestamp, 'value_1': value[timestamp][0], 'value_2': value[timestamp][1]})

pandas.DataFrame(normalized_data)

Thanks

EDIT

Thanks for your suggestions. Unfortunately, none were faster than the solution in the original post. I guess it's the nature of JSON to be slow for this use case. Here is what I did to generate a bigger payload and test for speed:

import pandas
import json
import time

s1 = """{
              "1510122047": [35.7, 256]
            },
            {
              "1510125000": [41.7, 7]
            },
            {
              "1510129000": [31.7, 0]
            }"""

s = """
{"values": [
            {
              "1510122047": [35.7, 256]
            },
            {
              "1510125000": [41.7, 7]
            },
            {
              "1510129000": [31.7, 0]
            },
""" + ",".join([s1]*1000000) + "]}"

data = json.loads(s)

tic = time.time()

normalized_data = []
for value in data['values']:
    timestamp = list(value.keys())[0]
    normalized_data.append({'timestamp':timestamp, 'value_1': value[timestamp][0], 'value_2': value[timestamp][1]})

print(time.time() - tic)
pandas.DataFrame(normalized_data)
Question from: https://stackoverflow.com/questions/65672141/pandas-json-normalize-with-timestamps-as-keys


1 Reply


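This is one approach, using a nested comprehension to flatten each single-key dict into one row:

Ex:

import pandas as pd

# Each item is {timestamp: [v1, v2]}; build one [timestamp, v1, v2] row per item.
rows = [[key] + value for item in data['values'] for key, value in item.items()]
df = pd.DataFrame(rows, columns=["Timestamp", "Val_1", "Val_2"])
print(df)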

Output:

    Timestamp  Val_1  Val_2
0  1510122047   35.7    256
1  1510125000   41.7      7
2  1510129000   31.7      0
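
Since the question asks for something closer to pandas' built-in functions, here is a minimal sketch of an alternative, not benchmarked against the loop above. It merges the single-key dicts into one mapping and hands it to DataFrame.from_dict with orient="index". It assumes every timestamp is unique (duplicates would overwrite each other) and that the keys are Unix epoch seconds; the column names are simply the ones used in the answer above.

```python
import json

import pandas as pd

s = """
{"values": [
    {"1510122047": [35.7, 256]},
    {"1510125000": [41.7, 7]},
    {"1510129000": [31.7, 0]}
]}
"""
data = json.loads(s)

# Merge the single-key dicts into one {timestamp: [v1, v2]} mapping.
# Assumption: timestamps are unique, otherwise later entries overwrite earlier ones.
merged = {key: value for item in data["values"] for key, value in item.items()}

# orient="index" turns each key into a row label and each list into a row.
df = (
    pd.DataFrame.from_dict(merged, orient="index", columns=["Val_1", "Val_2"])
    .rename_axis("Timestamp")
    .reset_index()
)

# Optional: parse the string keys as epoch seconds (an assumption about the data).
df["Timestamp"] = pd.to_datetime(df["Timestamp"].astype("int64"), unit="s")
print(df)
```

The dict comprehension is still a Python-level loop, so this mainly changes where the DataFrame construction happens; whether it is faster on a large payload would need to be measured.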
