Why is this inline assembly not working with a separate asm volatile statement for each instruction?

#1
For the following code:

long buf[64];

register long rrax asm ("rax");
register long rrbx asm ("rbx");
register long rrsi asm ("rsi");

rrax = 0x34;
rrbx = 0x39;

__asm__ __volatile__ ("movq $buf,%rsi");
__asm__ __volatile__ ("movq %rax, 0(%rsi);");
__asm__ __volatile__ ("movq %rbx, 8(%rsi);");

printf( "buf[0] = %lx, buf[1] = %lx!\n", buf[0], buf[1] );

I get the following output:

buf[0] = 0, buf[1] = 346161cbc0!

while it should have been:

buf[0] = 34, buf[1] = 39!

Any ideas why it is not working properly, and how to solve it?

#2
The compiler uses registers, and it may write over the values you have put in them.

In this case, the compiler probably uses the `rbx` register after the `rrbx` assignment and before the inline assembly section.

In general, you shouldn't expect registers to keep their values after and between inline assembly code sequences.

#3
Slightly off-topic but I'd like to follow up a bit on gcc inline assembly.

The (non-)need for `__volatile__` comes from the fact that GCC _optimizes_ inline assembly. GCC inspects the assembly statement for side effects / prerequisites, and if it finds them not to exist it may choose to move the assembly instruction around or even decide to _remove_ it. All `__volatile__` does is to tell the compiler "stop caring and put this right there".

Which is usually not what you really want.

This is where the need for _constraints_ comes in. The name is overloaded and actually used for different things in GCC inline assembly:

- constraints specify input / output operands used in the `asm()` block
- constraints specify the "clobber list", which details what "state" (registers, condition codes, memory) is affected by the `asm()`.
- constraints specify classes of operands (registers, addresses, offsets, constants, ...)
- constraints declare associations / bindings between assembler entities and C/C++ variables / expressions
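
To make the four roles above a bit more concrete, here is a minimal sketch that uses each of them in one statement. The function name `add_then_double` is made up for illustration, and the non-x86 fallback branch is only there so the snippet builds elsewhere; treat it as one possible way to write this, not the canonical one.

```c
/* Hypothetical helper: computes (x + y) * 2 with GCC extended asm on x86. */
static int add_then_double(int x, int y)
{
    int result;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("movl %1, %0\n\t"   /* result = x        */
             "addl %2, %0\n\t"   /* result += y       */
             "addl %0, %0"       /* result += result  */
             : "=&r"(result)     /* output operand, bound to the C variable `result`;
                                    "r" is the operand class (any general register),
                                    "&" says it is written before all inputs are read */
             : "r"(x), "r"(y)    /* input operands, each in some register */
             : "cc");            /* clobber list: the adds change the condition codes */
#else
    result = (x + y) * 2;        /* plain C fallback for other targets (assumption) */
#endif
    return result;
}
```

The `&` matters here: without it the compiler would be free to put `result` in the same register as `y`, and the first `movl` would destroy `y` before it is read.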

In many cases, developers _abuse_ `__volatile__` because they noticed their code either being moved around or even disappearing without it. If this happens, it's usually a sign that the developer has _not_ told GCC about the side effects / prerequisites of the assembly. For example, this buggy code:

register int foo __asm__("rax") = 1234;
register int bar __asm__("rbx") = 4321;

asm("add %rax, %rbx");
printf("I'm expecting 'bar' to be 5555 it is: %d\n", bar);

It's got several bugs:

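- for one, it has no operand or clobber lists, so GCC treats it as a _basic_ `asm` statement and pastes the string into its output verbatim. That is why the register names carry a single `%` here: the doubled `%%` is only needed in extended `asm` (statements with operand lists), and if you actually write it in the above you get an assembler error, `/tmp/ccYPmr3g.s:22: Error: bad register name '%%rax'`.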
- second, it's not telling the compiler when and where you need/use the variables. Instead, it _assumes_ the compiler honours `asm()` literally. That might be true for Microsoft Visual C++ but is _not the case_ for gcc.

If you compile it _without_ optimization, it creates:

0000000000400524 <main>:
[ ... ]
400534: b8 d2 04 00 00 mov $0x4d2,%eax
400539: bb e1 10 00 00 mov $0x10e1,%ebx
40053e: 48 01 c3 add %rax,%rbx
400541: 48 89 da mov %rbx,%rdx
400544: b8 5c 06 40 00 mov $0x40065c,%eax
400549: 48 89 d6 mov %rdx,%rsi
40054c: 48 89 c7 mov %rax,%rdi
40054f: b8 00 00 00 00 mov $0x0,%eax
400554: e8 d7 fe ff ff callq 400430 <printf@plt>
[...]

You can find your `add` instruction, and the initializations of the two registers, and it'll print the expected. If, on the other hand, you crank optimization up, something else happens:

0000000000400530 <main>:
400530: 48 83 ec 08 sub $0x8,%rsp
400534: 48 01 c3 add %rax,%rbx
400537: be e1 10 00 00 mov $0x10e1,%esi
40053c: bf 3c 06 40 00 mov $0x40063c,%edi
400541: 31 c0 xor %eax,%eax
400543: e8 e8 fe ff ff callq 400430 <printf@plt>
[ ... ]

Your initializations of both the "used" registers are no longer there. The compiler discarded them because nothing it could see was using them, and while it kept the assembly instruction it put it _before_ any use of the two variables. It's there but it does nothing (Luckily actually ... if `rax` / `rbx` _had been in use_ who can tell what'd have happened ...).

**And the reason for that is that you haven't actually _told_ GCC that the assembly is using these registers / these operand values.** This has nothing whatsoever to do with `volatile` but all with the fact you're using a constraint-free `asm()` expression.

The way to do this _correctly_ is via constraints, i.e. you'd use:

int foo = 1234;
int bar = 4321;

asm("add %1, %0" : "+r"(bar) : "r"(foo));
printf("I'm expecting 'bar' to be 5555 it is: %d\n", bar);

This tells the compiler that the assembly:

1. has one argument in a register, `"+r"(...)`, which both needs to be initialized before the assembly statement and is modified by it; the variable `bar` is associated with it.
1. has a second argument in a register, `"r"(...)`, which needs to be initialized before the assembly statement and is treated as read-only / not modified by the statement. Here, `foo` is associated with that.

Notice no register assignment is specified - the compiler chooses that depending on the variables / state of the compile. The (optimized) output of the above:

0000000000400530 <main>:
400530: 48 83 ec 08 sub $0x8,%rsp
400534: b8 d2 04 00 00 mov $0x4d2,%eax
400539: be e1 10 00 00 mov $0x10e1,%esi
40053e: bf 4c 06 40 00 mov $0x40064c,%edi
400543: 01 c6 add %eax,%esi
400545: 31 c0 xor %eax,%eax
400547: e8 e4 fe ff ff callq 400430 <printf@plt>
[ ... ]

GCC inline assembly constraints are _almost always necessary_ in some form or the other, but there can be multiple possible ways of describing the same requirements to the compiler; instead of the above, you could also write:

asm("add %1, %0" : "=r"(bar) : "r"(foo), "0"(bar));

This tells gcc:

1. the statement has an output operand, the variable `bar`, that after the statement will be found in a register, `"=r"(...)`
1. the statement has an input operand, the variable `foo`, which is to be placed into a register, `"r"(...)`
1. operand zero is also an input operand and to be initialized with `bar`
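
One thing the matching-constraint form allows, which `"+r"` does not, is reading from one C variable and writing to a different one. A small sketch, assuming x86 and a made-up function name `sum_matching` (the `#else` branch is just a fallback so it compiles on other targets):

```c
/* Hypothetical helper: returns a + b; the output and the input `b` share a register. */
static int sum_matching(int a, int b)
{
    int sum;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("addl %1, %0"
             : "=r"(sum)          /* operand 0: output, ends up in `sum`            */
             : "r"(a), "0"(b)     /* operand 1: `a`; "0" preloads operand 0 with `b` */
             : "cc");             /* the add changes the condition codes             */
#else
    sum = a + b;                  /* plain C fallback for other targets (assumption) */
#endif
    return sum;
}
```

Here `b` itself is left untouched; only `sum` receives the result.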

Or, again an alternative:

asm("add %1, %0" : "+r"(bar) : "g"(foo));

which tells gcc:

1. _bla_ (yawn - same as before, `bar` both input/output)
1. the statement has an input operand, the variable `foo`, which the statement doesn't care whether it's in a register, in memory or a compile-time constant (that's the `"g"(...)` constraint)

The result is different from the former:

0000000000400530 <main>:
400530: 48 83 ec 08 sub $0x8,%rsp
400534: bf 4c 06 40 00 mov $0x40064c,%edi
400539: 31 c0 xor %eax,%eax
40053b: be e1 10 00 00 mov $0x10e1,%esi
400540: 81 c6 d2 04 00 00 add $0x4d2,%esi
400546: e8 e5 fe ff ff callq 400430 <printf@plt>
[ ... ]

because now GCC _has actually figured out_ that `foo` is a compile-time constant and simply embedded the value in the `add` instruction! Isn't that neat?

Admittedly, this is complex and takes getting used to. The advantage is that _letting the compiler choose_ which registers to use for what operands allows optimizing the code overall; if, for example, an inline assembly statement is used in a macro and/or a `static inline` function, the compiler can, depending on the calling context, choose different registers at different instantiations of the code. Or if a certain value is compile-time evaluatable / constant in one place but not in another, the compiler can tailor the created assembly for it.
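
A rough sketch of that `static inline` point, assuming x86 and hypothetical names (`add_to`, `add_constant`, `add_variable`). The same asm statement is instantiated in two callers; with optimization on, the compiler may pick an immediate operand in the first and a register or memory operand in the second, though what it actually emits depends on the compiler version and flags.

```c
/* Hypothetical wrapper: acc += amount, with the operand placement left to GCC. */
static inline int add_to(int acc, int amount)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("addl %1, %0"
             : "+r"(acc)      /* read and written, in any general register     */
             : "g"(amount)    /* register, memory or constant: compiler decides */
             : "cc");
#else
    acc += amount;            /* plain C fallback for other targets (assumption) */
#endif
    return acc;
}

/* `amount` is a compile-time constant at this call site. */
static int add_constant(int v)
{
    return add_to(v, 1234);
}

/* `amount` is only known at run time at this call site. */
static int add_variable(int v, int w)
{
    return add_to(v, w);
}
```

Comparing the disassembly of the two callers at `-O2` is a quick way to see the constraints at work.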

Think of GCC inline assembly constraints as kind of "extended function prototypes" - they tell the compiler what types and locations for arguments / return values are, plus a bit more. If you don't specify these constraints, your inline assembly is creating the analogue of functions that operate on global variables/state only - which, as we probably all agree, are rarely ever doing exactly what you intended.

#4
You clobber memory but don't tell GCC about it, so GCC can cache values in `buf` across assembly calls. If you want to use inputs and outputs, tell GCC about everything.

__asm__ (
"movq %1, 0(%0)\n\t"
"movq %2, 8(%0)"
: /* Outputs (none) */
: "r"(buf), "r"(rrax), "r"(rrbx) /* Inputs */
: "memory"); /* Clobbered */

You also generally want to let GCC handle most of the `mov`, register selection, etc -- even if you explicitly constrain the registers (`rrax` is still `%rax`), let the information flow through GCC or you will get unexpected results.
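
For reference, the same statement wrapped in a function that can be compiled and called on its own. This is only a sketch: the name `store_pair` is made up, ordinary parameters stand in for the register variables from the question, and the `#else` branch is a fallback for non-x86-64 targets.

```c
/* Hypothetical helper: writes `first` to dst[0] and `second` to dst[1]. */
static void store_pair(long *dst, long first, long second)
{
#if defined(__x86_64__)
    __asm__ ("movq %1, 0(%0)\n\t"
             "movq %2, 8(%0)"
             : /* Outputs (none) */
             : "r"(dst), "r"(first), "r"(second)   /* Inputs, any registers */
             : "memory");                          /* Clobbered: we write through dst */
#else
    dst[0] = first;               /* plain C fallback for other targets (assumption) */
    dst[1] = second;
#endif
}
```

Called as `store_pair(buf, 0x34, 0x39);` before the `printf` from the question, this should give the output the question was expecting.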

# `__volatile__` is wrong.

The reason `__volatile__` exists is so you can guarantee that the compiler places your code exactly where it is... which is a *completely unnecessary* guarantee for this code. It's necessary for implementing advanced features such as memory barriers, but almost completely worthless if you are only modifying memory and registers.

GCC already knows that it can't move this assembly after `printf` because the `printf` call accesses `buf`, and `buf` could be clobbered by the assembly. GCC already knows that it can't move the assembly before `rrax = 0x34;` because `rax` is an input to the assembly code. So what does `__volatile__` get you? Nothing.

If your code does not work without `__volatile__` then there is an error in the code which should be **fixed** instead of just adding `__volatile__` and hoping that makes everything better. The `__volatile__` keyword is not magic and should not be treated as such.
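
For contrast, the memory-barrier case mentioned above is one where `__volatile__` conventionally does appear. A minimal sketch with made-up names (`payload`, `ready`, `publish`): the asm string is empty, so no instruction is emitted, and the `"memory"` clobber tells the compiler not to move or cache memory accesses across that point. Note this only constrains the compiler; it is not a CPU fence.

```c
static int payload;   /* hypothetical data being published          */
static int ready;     /* hypothetical flag saying the data is there */

/* Hypothetical helper: store the data, then the flag, in that order
   as far as the compiler's own reordering is concerned. */
static void publish(int value)
{
    payload = value;
    __asm__ __volatile__ ("" : : : "memory");   /* compiler-level barrier only */
    ready = 1;
}
```

Strictly speaking an asm statement with no outputs is already treated as volatile by GCC, so the keyword here mostly documents intent.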

**Alternative fix:**

Is `__volatile__` necessary for your original code? No. Just mark the inputs and clobber values correctly.

/* The "S" constraint means %rsi, "b" means %rbx, and "a" means %rax
The inputs and clobbered values are specified. There is no output
so that section is blank. */
rrsi = (long) buf;
__asm__ ("movq %%rax, 0(%%rsi)" : : "a"(rrax), "S"(rrsi) : "memory");
__asm__ ("movq %%rbx, 8(%%rsi)" : : "b"(rrbx), "S"(rrsi) : "memory");

Why `__volatile__` doesn't help you here:

rrax = 0x34; /* Dead code */

GCC is well within its rights to completely delete the above line, since the code in the question above claims that it never uses `rrax`.

# A clearer example

long global;
void store_5(void)
{
register long rax asm ("rax");
rax = 5;
__asm__ __volatile__ ("movq %rax, (global)");
}

The disassembly is more or less as you expect it at `-O0`,

movl $5, %eax
movq %rax, (global)

But with optimization off, you can be fairly sloppy about assembly. Let's try `-O2`:

movq %rax, (global)

Whoops! Where did `rax = 5;` go? It's dead code, since `%rax` is never used in the function — at least as far as GCC knows. GCC doesn't peek inside assembly. What happens when we remove `__volatile__`?

; empty

Well, you might think `__volatile__` is doing you a service by keeping GCC from discarding your precious assembly, but it's just masking the fact that GCC thinks your assembly isn't *doing* anything. GCC thinks your assembly takes no inputs, produces no outputs, and clobbers no memory. You had better straighten it out:

long global;
void store_5(void)
{
register long rax asm ("rax");
rax = 5;
__asm__ __volatile__ ("movq %%rax, (global)" : : : "memory");
}

Now we get the following output:

movq %rax, (global)

Better. But if you tell GCC about the inputs, it will make sure that `%rax` is properly initialized first:

long global;
void store_5(void)
{
register long rax asm ("rax");
rax = 5;
__asm__ ("movq %%rax, (global)" : : "a"(rax) : "memory");
}

The output, with optimizations:

movl $5, %eax
movq %rax, (global)

Correct! And we don't even need to use `__volatile__`.
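
Going one step further than the answer does (so take this as a variation, not the author's fix): the destination can be described as an output operand as well, instead of naming the symbol inside the string and using a blanket `"memory"` clobber. A sketch with hypothetical names `global_value` and `store_5_via_operand`, assuming x86-64:

```c
static long global_value;   /* stands in for `global` from the example above */

/* Hypothetical variant of store_5: the compiler sees exactly which object is written. */
static void store_5_via_operand(void)
{
#if defined(__x86_64__)
    __asm__ ("movq %1, %0"
             : "=m"(global_value)   /* output: a memory operand the compiler addresses */
             : "r"(5L));            /* input: the value 5 in some register             */
#else
    global_value = 5;               /* plain C fallback for other targets (assumption) */
#endif
}
```

Because the compiler writes out the address itself, this also sidesteps the hard-coded absolute reference `(global)`, which may fail to link on toolchains that build position-independent executables by default.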

# Why does `__volatile__` exist?

The primary correct use for `__volatile__` is if your assembly code does something else besides input, output, or clobbering memory. Perhaps it messes with special registers which GCC doesn't know about, or affects IO. You see it a lot in the Linux kernel, but it's misused very often in user space.

The `__volatile__` keyword is very tempting because we C programmers often like to think we're *almost* programming in assembly language already. We're not. C compilers do a lot of data flow analysis — so you need to explain the data flow to the compiler for your assembly code. That way, the compiler can safely manipulate your chunk of assembly just like it manipulates the assembly that it generates.

If you find yourself using `__volatile__` a lot, as an alternative you could write an entire function or module in an assembly file.