Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
365 views
in Technique[技术] by (71.8m points)

How to pickle and unpickle to portable string in Python 3

I need to pickle a Python3 object to a string which I want to unpickle from an environmental variable in a Travis CI build. The problem is that I can't seem to find a way to pickle to a portable string (unicode) in Python3:

import os, pickle    

from my_module import MyPickleableClass


obj = {'cls': MyPickleableClass, 'other_stuf': '(...)'}

pickled = pickle.dumps(obj)

# raises TypeError: str expected, not bytes
os.environ['pickled'] = pickled

# raises UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbb (...)
os.environ['pickled'] = pickled.decode('utf-8')

pickle.loads(os.environ['pickled'])

Is there a way to serialize complex objects like datetime.datetime to unicode or to some other string representation in Python3 which I can transfer to a different machine and deserialize?

Update

I have tested the solutions suggested by @kindall, but the pickle.dumps(obj, 0).decode() raises a UnicodeDecodeError. Nevertheless the base64 approach works but it needed an extra decode/encode step. The solution works on both Python2.x and Python3.x.

# encode returns bytes so it needs to be decoded to string
pickled = pickle.loads(codecs.decode(pickled.encode(), 'base64')).decode()

type(pickled)  # <class 'str'>

unpickled = pickle.loads(codecs.decode(pickled.encode(), 'base64'))
question from:https://stackoverflow.com/questions/30469575/how-to-pickle-and-unpickle-to-portable-string-in-python-3

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

pickle.dumps() produces a bytes object. Expecting these arbitrary bytes to be valid UTF-8 text (the assumption you are making by trying to decode it to a string from UTF-8) is pretty optimistic. It'd be a coincidence if it worked!

One solution is to use the older pickling protocol that uses entirely ASCII characters. This still comes out as bytes, but since it is ASCII-only it can be decoded to a string without stress:

pickled = pickle.dumps(obj, 0).decode()

You could also use some other encoding method to encode a binary-pickled object to text, such as base64:

import codecs
pickled = codecs.encode(pickle.dumps(obj), "base64").decode()

Decoding would then be:

unpickled = pickle.loads(codecs.decode(pickled.encode(), "base64"))

Using pickle with protocol 0 seems to result in shorter strings than base64-encoding binary pickles (and abarnert's suggestion of hex-encoding is going to be even larger than base64), but I haven't tested it rigorously or anything. Test it with your data and see.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...