Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

0 votes
461 views
in Technique by (71.8m points)

c++ - Network byte order and endianness issues

I read on the internet that the standard byte order for networks is big-endian, also known as network byte order. Before data is sent over the network, it is first converted to network byte order (big-endian).

  1. Can anyone tell me who takes care of this conversion?
  2. Do developers really need to worry about endianness? If so, can you give examples (in C or C++) where we need to take care?

1 Reply

0 votes
by (71.8m points)

The first place where network versus native byte order matters is when you create sockets and specify the IP address and port number. Both must be in network byte order, or you will not end up talking to the correct computer; if you converted the IP address but not the port number, you will end up talking to the wrong port on the correct computer.

The onus is on the programmer to get the addresses in the correct order. There are functions like htonl() and htons() that convert from host (h) to network (n) order, with ntohl() and ntohs() for the reverse direction; l indicates 'long' meaning '4 bytes'; s indicates 'short' meaning '2 bytes' (the names date from an era before 64-bit systems).
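As a rough illustration of the point about addresses and ports, here is a minimal sketch for a POSIX system. The helper name make_ipv4_address is made up for this example; htons() and inet_pton() are the standard calls. Note that inet_pton() already stores the address in network byte order, so only the port needs an explicit conversion here.

```cpp
#include <arpa/inet.h>   // htons, inet_pton (POSIX)
#include <netinet/in.h>  // sockaddr_in
#include <sys/socket.h>  // AF_INET
#include <cstdint>
#include <cstring>

// Hypothetical helper: fill in an IPv4 socket address from a dotted-quad
// string and a port given in host byte order.
// Returns false if the address string is not a valid IPv4 address.
bool make_ipv4_address(const char* dotted_quad, std::uint16_t port,
                       sockaddr_in& out) {
    std::memset(&out, 0, sizeof out);
    out.sin_family = AF_INET;
    out.sin_port = htons(port);  // 2-byte port: host -> network order
    // inet_pton writes the 4-byte address in network byte order already.
    return inet_pton(AF_INET, dotted_quad, &out.sin_addr) == 1;
}
```

The filled-in structure can then be passed to connect() or bind(). If the htons() call were forgotten on a little-endian machine, port 8080 (0x1F90) would be sent as 0x901F, which is port 36895.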

The other time it matters is if you are transferring binary data between two computers, either via a network connection correctly set up over a socket, or via a file. With single-byte code sets (SBCS), or UTF-8, you don't have problems with textual data. With multi-byte code sets (MBCS), or UTF-16LE vs UTF-16BE, or UTF-32, you have to worry about the byte order within characters, but the characters will appear one after the other. If you ship a 32-bit integer as 32 bits of data, the receiving end needs to know whether the first byte is the MSB (most significant byte, for big-endian) or the LSB (least significant byte, for little-endian) of the 32-bit quantity. The same applies to 16-bit and 64-bit integers. With floating point, you could run into the additional problem that different computers could use different formats for the floating point, independently of the endianness issue. This is less of a problem than it used to be thanks to IEEE 754.
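One common way to ship a 32-bit integer so that the first byte on the wire is always the MSB is to build the byte sequence with shifts, which works the same on any host. This is only a sketch; the function names put_u32_be and get_u32_be are invented for the example.

```cpp
#include <cstdint>

// Write v into buf[0..3] in big-endian (network) order: MSB first.
void put_u32_be(std::uint8_t* buf, std::uint32_t v) {
    buf[0] = static_cast<std::uint8_t>(v >> 24);
    buf[1] = static_cast<std::uint8_t>(v >> 16);
    buf[2] = static_cast<std::uint8_t>(v >> 8);
    buf[3] = static_cast<std::uint8_t>(v);
}

// Read a big-endian 32-bit value back from buf[0..3].
std::uint32_t get_u32_be(const std::uint8_t* buf) {
    return (static_cast<std::uint32_t>(buf[0]) << 24) |
           (static_cast<std::uint32_t>(buf[1]) << 16) |
           (static_cast<std::uint32_t>(buf[2]) << 8) |
            static_cast<std::uint32_t>(buf[3]);
}
```

Because the shifts operate on values rather than on memory layout, the code never needs to know the host's own byte order. The same pattern extends to 16-bit and 64-bit integers.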

Note that IBM mainframes use EBCDIC instead of ASCII or ISO 8859-x character sets (at least by default), and the floating point format is not IEEE 754 (pre-dating that standard by a decade or more). These issues, therefore, are crucial to deal with when communicating with the mainframe. The programs at the two ends have to agree on how each end will understand the other. Some protocols define a byte order (e.g. network byte order); others define 'sender makes right' or 'receiver makes right' or 'client makes right' or 'server makes right', placing the conversion workload on different parts of the system.
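To make 'receiver makes right' concrete, here is one possible sketch. It assumes a made-up convention in which the sender transmits integers in its own native order and tells the receiver which order that is (the ByteOrder tag below is hypothetical, not part of any real protocol). The receiver swaps bytes only when the sender's order differs from its own.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical tag the sender includes to describe its native byte order.
enum class ByteOrder : std::uint8_t { Little = 0, Big = 1 };

// Detect the byte order of the machine this code runs on.
ByteOrder host_byte_order() {
    const std::uint16_t probe = 1;
    std::uint8_t first;
    std::memcpy(&first, &probe, 1);  // lowest-addressed byte of the probe
    return first == 1 ? ByteOrder::Little : ByteOrder::Big;
}

// Reverse the four bytes of a 32-bit value.
std::uint32_t byteswap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Receiver makes right: wire holds 4 bytes in the sender's native order.
std::uint32_t receive_u32(const std::uint8_t* wire, ByteOrder sender_order) {
    std::uint32_t v;
    std::memcpy(&v, wire, sizeof v);
    return sender_order == host_byte_order() ? v : byteswap32(v);
}
```

When both ends share a byte order, no swapping happens at all, which is the appeal of this scheme over always converting to network byte order. The price is that every receiver must handle both cases.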

One advantage of text protocols (especially those using an SBCS) is that they evade the problems of endianness — at the cost of converting text to value and back, but computation is cheap compared to even gigabit networking speeds.
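As an illustration of the text-protocol trade-off, the sketch below sends a 32-bit value as decimal ASCII digits terminated by a newline. The framing (one number per line) and the names encode_field and decode_field are assumptions for this example only. Every byte is a single character, so there is no byte order to get wrong; the cost is the conversion and validation on each side.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Encode a value as decimal digits followed by '\n'.
std::string encode_field(std::uint32_t v) {
    return std::to_string(v) + "\n";
}

// Decode a line produced by encode_field. Returns false on malformed
// input: missing newline, no digits, non-digit characters, or overflow.
bool decode_field(const std::string& text, std::uint32_t& out) {
    // At least one digit plus '\n'; at most 10 digits plus '\n'.
    if (text.size() < 2 || text.size() > 11 || text.back() != '\n') {
        return false;
    }
    std::uint64_t n = 0;
    for (std::size_t i = 0; i + 1 < text.size(); ++i) {
        const char c = text[i];
        if (c < '0' || c > '9') return false;
        n = n * 10 + static_cast<std::uint64_t>(c - '0');
    }
    if (n > 0xFFFFFFFFu) return false;  // does not fit in 32 bits
    out = static_cast<std::uint32_t>(n);
    return true;
}
```

The value 0x12345678 travels as the ten bytes "305419896\n" instead of four binary bytes, and it reads the same on a big-endian or little-endian receiver.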

