python unpack little endian

posted Oct 24, 2021 in Technique[技术] by 深蓝 (71.8m points)

I'm trying to use Python to read a binary file. The file is in LSB (little-endian) mode. I import the struct module and use unpack like this:

import sys
from struct import unpack

with open(sys.argv[1], 'rb') as f:
    # '<I' = little-endian unsigned 32-bit integer
    contents = unpack('<I', f.read(4))[0]
print(contents)

The data in the file is 0XC0000500 in LSB mode, and the actual value is 0X000500C0. So you can see that the smallest unit in LSB mode is one byte.

However, I use a Mac, and perhaps because of my gcc version or my machine (I am not sure; I just read http://docs.python.org/library/struct.html about sizeof and sys.byteorder), the result from the above code is 0X0500C000, so the unit of the LSB mode appears to be 2 bytes.

How should I solve this problem?

I will keep digging whether or not this question is answered, and I will update if I find anything.

ps: The data file is an ELF file for a 32-bit machine.

pps: Since I am going to read a huge amount of data, and this is a general problem when reading it, the manual way is not the best for me. The question is still open for answers.

ppps: < means "little-endian, standard size (16 bit)". Now I read this...

1 Reply

深蓝 · Answer 1 · 2021-10-23T19:39:43+0000

"If the actual value is 0XABCD, then the file stores DCBA."

Usually byte order defines the order of bytes, not of individual bits inside a byte. "\xDC\xBA" is two bytes (16 bits). If you swap the bytes, the only possible results are:

>>> import binascii, struct
>>> "0X%04X" % struct.unpack("<H", binascii.unhexlify("DCBA"))
'0XBADC'
>>> "0X%04X" % struct.unpack(">H", binascii.unhexlify("DCBA"))
'0XDCBA'

Here's what 0xabcd looks like in little-endian and big-endian format:

>>> struct.pack('<H', 0xabcd)
b'\xcd\xab'
>>> struct.pack('>H', 0xabcd)
b'\xab\xcd'

To get 0XABCD from "\xDC\xBA" you would need to swap the half-bytes (4-bit nibbles) inside each byte as well as the bytes themselves. That would be unusual.
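For illustration only, here is a minimal Python 3 sketch of what that half-byte swap would involve. The helper name swap_nibbles is made up for this example and is not part of struct or any other library.

```python
def swap_nibbles(byte):
    """Exchange the high and low 4-bit halves of one byte (0..255)."""
    return ((byte & 0x0F) << 4) | (byte >> 4)

raw = bytes.fromhex("DCBA")                        # the two bytes as stored
fixed = bytes(swap_nibbles(b) for b in raw)        # b'\xcd\xab'
value = int.from_bytes(fixed, byteorder="little")  # then the usual byte swap
print("0X%04X" % value)  # 0XABCD
```

If real data needed this, it would suggest the file was being inspected as hex text rather than decoded as bytes.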

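Regarding "Since I am going to read a huge amount of data":

You could use the array module to read multiple values at once. Its type codes are similar to the format characters of the struct module, but it always uses the machine's native sizes and byte order.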
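A rough Python 3 sketch of the array approach follows. The function name read_u32_le and the sample bytes are invented for this example. Because array works with native sizes and native byte order, the sketch checks the item size and byte-swaps on big-endian machines; it assumes the data is a whole number of 32-bit values.

```python
import array
import sys

def read_u32_le(data):
    """Decode bytes as a sequence of little-endian unsigned 32-bit integers."""
    # Pick a type code whose native size is 4 bytes ('I' on most platforms).
    for code in ("I", "L"):
        if array.array(code).itemsize == 4:
            break
    else:
        raise RuntimeError("no 4-byte unsigned type code on this platform")
    values = array.array(code)
    values.frombytes(data)       # interprets the bytes in native byte order
    if sys.byteorder != "little":
        values.byteswap()        # convert to little-endian interpretation
    return values

sample = bytes([0x05, 0x00, 0xC0, 0x00, 0x01, 0x00, 0x00, 0x00])
words = read_u32_le(sample)
print(["0X%08X" % w for w in words])  # ['0X00C00005', '0X00000001']
```

For data coming from a file, array.fromfile(f, n) reads n items directly; struct.unpack('<%dI' % n, data) is an alternative that avoids the native-size check.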

Regarding '< means "little-endian, standard size (16 bit)"':

If you use the '<' or '>' prefix with the struct module, then the sizes are standard: fixed and independent of the platform. The standard size depends only on the format character. In particular, '<H' is always 2 bytes (16 bits) and '<I' is always 4 bytes (32 bits). Only the '@' prefix (the default when no prefix is given) uses native sizes.
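You can check this yourself with struct.calcsize. The sketch below (Python 3) prints standard sizes, which should be the same everywhere, and native sizes, which may differ between platforms; the dictionary names are just for this example.

```python
import struct

# Standard sizes: identical on every platform when '<' or '>' is used.
standard = {fmt: struct.calcsize(fmt)
            for fmt in ("<H", "<I", "<L", "<Q", ">H", ">I")}
print(standard)  # {'<H': 2, '<I': 4, '<L': 4, '<Q': 8, '>H': 2, '>I': 4}

# Native sizes: '@' follows the C compiler, so '@L' may be 4 or 8 bytes.
native = {fmt: struct.calcsize(fmt) for fmt in ("@H", "@I", "@L")}
print(native)
```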

Old answer

(left here so that the comments make sense)

You could read it as 2-byte values and convert to an int manually:

>>> hi, lo = struct.unpack("<HH", b"\x05\x00\xC0\x00")
>>> n = (hi << 16) | lo
>>> n
327872
>>> "0X%08X" % n
'0X000500C0'
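If many values are stored this way, the same idea can be applied to a whole buffer. This is only a sketch for Python 3: the function name unpack_word_swapped and the second sample value are invented here, and it assumes every 32-bit value is stored as two little-endian 16-bit halves with the high half first.

```python
import struct

def unpack_word_swapped(data):
    """Decode 32-bit values stored as two little-endian 16-bit halves, high half first."""
    if len(data) % 4:
        raise ValueError("data length must be a multiple of 4 bytes")
    halves = struct.unpack("<%dH" % (len(data) // 2), data)
    # Pair up consecutive halves: even positions are high, odd are low.
    return [(hi << 16) | lo for hi, lo in zip(halves[0::2], halves[1::2])]

values = unpack_word_swapped(b"\x05\x00\xC0\x00\x34\x12\x78\x56")
print(["0X%08X" % v for v in values])  # ['0X000500C0', '0X12345678']
```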
