Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c - Why is this inline assembly not working with a separate asm volatile statement for each instruction?

For the following code:

long buf[64];

register long rrax asm ("rax");
register long rrbx asm ("rbx");
register long rrsi asm ("rsi");

rrax = 0x34;
rrbx = 0x39;

__asm__ __volatile__ ("movq $buf,%rsi");
__asm__ __volatile__ ("movq %rax, 0(%rsi);");
__asm__ __volatile__ ("movq %rbx, 8(%rsi);");

printf( "buf[0] = %lx, buf[1] = %lx!\n", buf[0], buf[1] );

I get the following output:

buf[0] = 0, buf[1] = 346161cbc0!

while it should have been:

buf[0] = 34, buf[1] = 39!

Any ideas why it is not working properly, and how to solve it?

1 Reply

You clobber memory but don't tell GCC about it, so GCC can cache values in buf across assembly calls. If you want to use inputs and outputs, tell GCC about everything.

__asm__ (
    "movq %1, 0(%0)\n\t"
    "movq %2, 8(%0)"
    :                                /* Outputs (none) */
    : "r"(buf), "r"(rrax), "r"(rrbx) /* Inputs */
    : "memory");                     /* Clobbered */

You also generally want to let GCC handle most of the mov instructions, register selection, and so on. Even if you explicitly constrain the registers (rrax is still %rax), let the information flow through GCC or you will get unexpected results.
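As a rough sketch of what "letting the information flow through GCC" can look like: the helper below (store_pair is a made-up name, not something from the question) names the two destination slots as "=m" outputs and the two values as "r" inputs, so GCC picks the registers itself and knows exactly which memory is written. It assumes GCC or Clang on x86-64 and falls back to plain C elsewhere.

```c
/* Hypothetical helper: store a and b into dst[0] and dst[1]. */
static void store_pair(long *dst, long a, long b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ (
        "movq %2, %0\n\t"
        "movq %3, %1"
        : "=m"(dst[0]), "=m"(dst[1])  /* outputs: the two slots written */
        : "r"(a), "r"(b));            /* inputs: any general register */
#else
    /* Fallback for other targets: the same stores in plain C. */
    dst[0] = a;
    dst[1] = b;
#endif
}
```

Calling store_pair(buf, 0x34, 0x39) before the printf from the question should then give the expected values. Because the outputs are listed explicitly, no "memory" clobber is needed in this variant.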

__volatile__ is wrong.

The reason __volatile__ exists is so you can guarantee that the compiler places your code exactly where it is... which is a completely unnecessary guarantee for this code. It's necessary for implementing advanced features such as memory barriers, but almost completely worthless if you are only modifying memory and registers.

GCC already knows that it can't move this assembly after printf because the printf call accesses buf, and buf could be clobbered by the assembly. GCC already knows that it can't move the assembly before rrax = 0x34; because rax is an input to the assembly code. So what does __volatile__ get you? Nothing.

If your code does not work without __volatile__ then there is an error in the code which should be fixed instead of just adding __volatile__ and hoping that makes everything better. The __volatile__ keyword is not magic and should not be treated as such.

Alternative fix:

Is __volatile__ necessary for your original code? No. Just mark the inputs and clobber values correctly.

/* The "S" constraint means %rsi, "b" means %rbx, and "a" means %rax
   The inputs and clobbered values are specified.  There is no output
   so that section is blank.  */
rrsi = (long) buf;
__asm__ ("movq %%rax, 0(%%rsi)" : : "a"(rrax), "S"(rrsi) : "memory");
__asm__ ("movq %%rbx, 8(%%rsi)" : : "b"(rrbx), "S"(rrsi) : "memory");
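For anyone who wants to compile this, here is one way the fixed-register version might be wrapped into a complete function. The name fill_buf and the non-x86 fallback are my own additions; the register variables and constraints follow the question. Treat it as a sketch for GCC or Clang on x86-64, not the only way to arrange it.

```c
long buf[64];

/* Hypothetical wrapper: writes 0x34 and 0x39 into buf[0] and buf[1]. */
void fill_buf(void)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* Pin each variable to the register the asm text names. */
    register long rrax __asm__("rax") = 0x34;
    register long rrbx __asm__("rbx") = 0x39;
    register long rrsi __asm__("rsi") = (long) buf;

    /* Every register the template reads is listed as an input, and
       "memory" tells GCC that buf may have changed. */
    __asm__ ("movq %%rax, 0(%%rsi)" : : "a"(rrax), "S"(rrsi) : "memory");
    __asm__ ("movq %%rbx, 8(%%rsi)" : : "b"(rrbx), "S"(rrsi) : "memory");
#else
    /* Fallback for other targets: plain C stores. */
    buf[0] = 0x34;
    buf[1] = 0x39;
#endif
}
```

After fill_buf() returns, the printf from the question should report buf[0] = 34, buf[1] = 39.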

Why __volatile__ doesn't help you here:

rrax = 0x34; /* Dead code */

GCC is well within its rights to completely delete the above line, since the code in the question above claims that it never uses rrax.

A clearer example

long global;
void store_5(void)
{
    register long rax asm ("rax");
    rax = 5;
    __asm__ __volatile__ ("movq %rax, (global)");
}

The disassembly is more or less as you expect it at -O0,

movq $5, %rax
movq %rax, (global)

But with optimization off, you can be fairly sloppy about assembly. Let's try -O2:

movq %rax, (global)

Whoops! Where did rax = 5; go? It's dead code, since %rax is never used in the function — at least as far as GCC knows. GCC doesn't peek inside assembly. What happens when we remove __volatile__?

; empty

Well, you might think __volatile__ is doing you a service by keeping GCC from discarding your precious assembly, but it's just masking the fact that GCC thinks your assembly isn't doing anything. GCC thinks your assembly takes no inputs, produces no outputs, and clobbers no memory. You had better straighten it out:

long global;
void store_5(void)
{
    register long rax asm ("rax");
    rax = 5;
    __asm__ __volatile__ ("movq %%rax, (global)" : : : "memory");
}

Now we get the following output:

movq %rax, (global)

Better. But if you tell GCC about the inputs, it will make sure that %rax is properly initialized first:

long global;
void store_5(void)
{
    register long rax asm ("rax");
    rax = 5;
    __asm__ ("movq %%rax, (global)" : : "a"(rax) : "memory");
}

The output, with optimizations:

movl $5, %eax
movq %rax, (global)

Correct! And we don't even need to use __volatile__.
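If the value does not have to live in %rax specifically, a variation along these lines drops the register variable altogether. The function name store_value is made up for illustration, and the sketch assumes GCC or Clang on x86-64 with a plain C fallback. Declaring global as an "=m" output lets the compiler form the address itself, which also avoids the absolute (global) reference; that form may fail to link in position-independent builds.

```c
long global;

/* Hypothetical variant of store_5 that takes the value as a parameter. */
void store_value(long v)
{
#if defined(__GNUC__) && defined(__x86_64__)
    /* %0 is the memory operand for global, %1 is whatever register
       GCC chooses for v. */
    __asm__ ("movq %1, %0" : "=m"(global) : "r"(v));
#else
    /* Fallback for other targets. */
    global = v;
#endif
}
```

With the data flow spelled out like this, store_value(5) has to materialize 5 in a register before the store, and the compiler is free to inline or reorder the statement just like its own code.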

Why does __volatile__ exist?

The primary correct use for __volatile__ is if your assembly code does something else besides input, output, or clobbering memory. Perhaps it messes with special registers which GCC doesn't know about, or affects IO. You see it a lot in the Linux kernel, but it's misused very often in user space.
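One case where __volatile__ is usually considered appropriate is reading a counter that changes on its own. The sketch below uses the x86 rdtsc instruction; read_cycle_counter is a made-up name, and the software-counter fallback exists only so the example builds on other targets. Since the statement has outputs but no inputs, GCC would otherwise be entitled to assume that two executions give the same result.

```c
/* Hypothetical helper, not part of the original answer. */
static inline unsigned long long read_cycle_counter(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int lo, hi;
    /* rdtsc returns the counter in edx:eax. volatile tells GCC that
       each execution may produce a different value. */
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long) hi << 32) | lo;
#else
    /* Fallback for other targets: a plain software counter. */
    static unsigned long long ticks;
    return ++ticks;
#endif
}
```

Two back-to-back calls would normally return different values. Without __volatile__, the optimizer could merge them or hoist the read out of a loop.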

The __volatile__ keyword is very tempting because we C programmers often like to think we're almost programming in assembly language already. We're not. C compilers do a lot of data flow analysis — so you need to explain the data flow to the compiler for your assembly code. That way, the compiler can safely manipulate your chunk of assembly just like it manipulates the assembly that it generates.

If you find yourself using __volatile__ a lot, as an alternative you could write an entire function or module in an assembly file.

