c - How to make GCC generate bswap instruction for big endian store without builtins?

posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

Update: This was fixed in GCC 8.1.

I'm working on a function that stores a 64-bit value into memory in big endian format. I was hoping that I could write portable C99 code that works on both little and big endian platforms and have modern x86 compilers generate a bswap instruction automatically without any builtins or intrinsics. So I started with the following function:

#include <stdint.h>

void
encode_bigend_u64(uint64_t value, void *vdest) {
    uint8_t *bytes = (uint8_t *)vdest;
    bytes[0] = value >> 56;
    bytes[1] = value >> 48;
    bytes[2] = value >> 40;
    bytes[3] = value >> 32;
    bytes[4] = value >> 24;
    bytes[5] = value >> 16;
    bytes[6] = value >> 8;
    bytes[7] = value;
}

This works fine for clang which compiles this function to:

bswapq  %rdi
movq    %rdi, (%rsi)
retq

But GCC fails to detect the byte swap. I tried a couple of different approaches but they only made things worse. I know that GCC can detect byte swaps using bitwise-and, shift, and bitwise-or, but why doesn't it work when writing bytes?

Edit: I found the corresponding GCC bug.

1 Reply

深蓝 · Answer 1 · 2021-10-17T02:49:59+0000

This seems to do the trick:

#include <stdint.h>
#include <string.h>  /* memcpy */

void encode_bigend_u64(uint64_t value, void* dest)
{
  /* Reverse the byte order; the compiler recognizes this pattern as a byte swap. */
  value =
      ((value & 0xFF00000000000000u) >> 56u) |
      ((value & 0x00FF000000000000u) >> 40u) |
      ((value & 0x0000FF0000000000u) >> 24u) |
      ((value & 0x000000FF00000000u) >>  8u) |
      ((value & 0x00000000FF000000u) <<  8u) |
      ((value & 0x0000000000FF0000u) << 24u) |
      ((value & 0x000000000000FF00u) << 40u) |
      ((value & 0x00000000000000FFu) << 56u);
  memcpy(dest, &value, sizeof(uint64_t));
}

clang with `-O3`

encode_bigend_u64(unsigned long, void*):
        bswapq  %rdi
        movq    %rdi, (%rsi)
        retq

clang with `-O3 -march=native`

encode_bigend_u64(unsigned long, void*):
        movbeq  %rdi, (%rsi)
        retq

gcc with `-O3`

encode_bigend_u64(unsigned long, void*):
        bswap   %rdi
        movq    %rdi, (%rsi)
        ret

gcc with `-O3 -march=native`

encode_bigend_u64(unsigned long, void*):
        movbe   %rdi, (%rsi)
        ret

Tested with clang 3.8.0 and gcc 5.3.0 on http://gcc.godbolt.org/. I don't know exactly which processor is underneath for `-march=native`, but I strongly suspect a recent x86_64 one.
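The swap-then-`memcpy` version above only produces big-endian output on a little-endian host, whereas the byte-wise store from the question is correct on any host. A minimal sketch for checking that the two agree on the machine at hand follows; the names `store_bytewise`, `store_swapped` and `stores_match` are hypothetical and not part of the original post.

```c
#include <stdint.h>
#include <string.h>

/* Byte-wise store, equivalent to the question's code: big-endian output
   regardless of host endianness. */
void store_bytewise(uint64_t value, void *vdest)
{
    uint8_t *bytes = (uint8_t *)vdest;
    for (int i = 0; i < 8; i++)
        bytes[i] = (uint8_t)(value >> (56 - 8 * i));
}

/* Mask-and-shift swap followed by memcpy, as in the answer: big-endian
   output only when the host is little-endian. */
void store_swapped(uint64_t value, void *dest)
{
    value =
        ((value & 0xFF00000000000000u) >> 56) |
        ((value & 0x00FF000000000000u) >> 40) |
        ((value & 0x0000FF0000000000u) >> 24) |
        ((value & 0x000000FF00000000u) >>  8) |
        ((value & 0x00000000FF000000u) <<  8) |
        ((value & 0x0000000000FF0000u) << 24) |
        ((value & 0x000000000000FF00u) << 40) |
        ((value & 0x00000000000000FFu) << 56);
    memcpy(dest, &value, sizeof value);
}

/* Returns 1 if both stores write identical bytes for this value. */
int stores_match(uint64_t value)
{
    uint8_t a[8], b[8];
    store_bytewise(value, a);
    store_swapped(value, b);
    return memcmp(a, b, sizeof a) == 0;
}
```

On a little-endian host such as x86_64, `stores_match(0x0102030405060708u)` should return 1. On a big-endian host it would return 0 for this value, which is why the endianness check discussed next is needed.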

If you want a function that also works on big-endian architectures, you can use the answers from here to detect the endianness of the system and add an `if`. Both the union and the pointer-cast versions work, and both gcc and clang optimize them to exactly the same assembly (no branches). Full code on godbolt:

int is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};

    return bint.c[0] == 1;
}

void encode_bigend_u64_union(uint64_t value, void* dest)
{
  if (!is_big_endian()) {
    //... (byte swap of value, as in encode_bigend_u64 above)
  }
  memcpy(dest, &value, sizeof(uint64_t));
}
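For reference, here is a self-contained sketch of how the elided branch could be filled in. It assumes the swap is the same mask-and-shift expression used earlier; `host_is_big_endian`, `swap_u64` and `encode_bigend_u64_checked` are hypothetical names chosen for illustration, not the ones used in the linked code.

```c
#include <stdint.h>
#include <string.h>

/* Runtime endianness probe using a union, as in the answer. */
int host_is_big_endian(void)
{
    union {
        uint32_t i;
        unsigned char c[4];
    } probe = {0x01020304};

    return probe.c[0] == 1;
}

/* Reverse the byte order of a 64-bit value with masks and shifts. */
uint64_t swap_u64(uint64_t v)
{
    return ((v & 0xFF00000000000000u) >> 56) |
           ((v & 0x00FF000000000000u) >> 40) |
           ((v & 0x0000FF0000000000u) >> 24) |
           ((v & 0x000000FF00000000u) >>  8) |
           ((v & 0x00000000FF000000u) <<  8) |
           ((v & 0x0000000000FF0000u) << 24) |
           ((v & 0x000000000000FF00u) << 40) |
           ((v & 0x00000000000000FFu) << 56);
}

/* Store value in big-endian order on either kind of host. */
void encode_bigend_u64_checked(uint64_t value, void *dest)
{
    if (!host_is_big_endian())
        value = swap_u64(value);  /* swap only on little-endian hosts */
    memcpy(dest, &value, sizeof value);
}
```

Because the probe is a constant expression in practice, an optimizing compiler can be expected to fold the branch away, which matches the "no branches" observation above.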

Intel® 64 and IA-32 Architectures Instruction Set Reference (3-542 Vol. 2A):

MOVBE—Move Data After Swapping Bytes

Performs a byte swap operation on the data copied from the second operand (source operand) and store the result in the first operand (destination operand). [...]

The MOVBE instruction is provided for swapping the bytes on a read from memory or on a write to memory; thus providing support for converting little-endian values to big-endian format and vice versa.
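The manual mentions the read direction as well. As a rough illustration, a portable big-endian load can be written byte by byte in the same style as the store in the question. The name `decode_bigend_u64` is hypothetical, and whether a particular compiler turns this into a single `bswap` or `movbe` load depends on its version and flags; that was not checked here.

```c
#include <stdint.h>

/* Read a 64-bit big-endian value from memory, independent of host endianness. */
uint64_t decode_bigend_u64(const void *vsrc)
{
    const uint8_t *bytes = (const uint8_t *)vsrc;
    return ((uint64_t)bytes[0] << 56) |
           ((uint64_t)bytes[1] << 48) |
           ((uint64_t)bytes[2] << 40) |
           ((uint64_t)bytes[3] << 32) |
           ((uint64_t)bytes[4] << 24) |
           ((uint64_t)bytes[5] << 16) |
           ((uint64_t)bytes[6] <<  8) |
            (uint64_t)bytes[7];
}
```

Each byte is widened to `uint64_t` before shifting so that no shift exceeds the width of the promoted operand.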
