Create an account

Very important

  • To access the important data of the forums, you must be active in each forum and especially in the leaks and database leaks section, send data and after sending the data and activity, data and important content will be opened and visible for you.
  • You will only see chat messages from people who are at or below your level.
  • More than 500,000 database leaks and millions of account leaks are waiting for you, so access and view with more activity.
  • Many important data are inactive and inaccessible for you, so open them with activity. (This will be done automatically)


Thread Rating:
  • 650 Vote(s) - 3.41 Average
  • 1
  • 2
  • 3
  • 4
  • 5
Fast Division on GCC/ARM

#1
As far as I know most compilers will do fast division by multiplying and then bit shifting to the right. For instance, if you check [this SO thread][1] it says that when you ask the Microsoft compiler to do division by 10 it will multiply the dividend by 0x1999999A (which is 2^32/10) and then divide the result by 2^32 (using 32 shifts to the right).

(Editor's note: that linked answer was wrong until. Compilers don't do that because it's not exact for all inputs. Compilers do multiply and shift, but with a more involved way of determining the magic constant and shift counts:

[To see links please register here]

)

---

So far so good.

Once I tested the same division by 10 on ARM using GCC, though, the compiler did something slightly different. First it multiplied the dividend by 0x66666667 (2^34/10), then it divided the result by 2^34. Thus far it's the same as Microsoft, except using a higher multiplier. After that, however, it subtracted (dividend/2^31) from the result.

**My question**: why on the ARM version there's that extra subtraction? Can you give me an numeric example where without that subtraction the result will be wrong?

If you want to check the generated code, it's below (with my comments):

ldr r2, [r7, #4] @--this loads the dividend from memory into r2
movw r3, #:lower16:1717986919 @--moves the lower 16 bits of the constant
movt r3, #:upper16:1717986919 @--moves the upper 16 bits of the constant
smull r1, r3, r3, r2 @--multiply long, put lower 32 bits in r1, higher 32 in r3
asr r1, r3, #2 @--r3>>2, then store in r1 (effectively >>34, since r3 was higher 32 bits of multiplication)
asr r3, r2, #31 @--dividend>>31, then store in r3
rsb r3, r3, r1 @--r1 - r3, store in r3
str r3, [r7, #0] @--this stores the result in memory (from r3)


[1]:

[To see links please register here]

Reply

#2
`x SAR 31` is `0xffffffff` (-1) for negative values of `x`, and `0x00000000` for positive values.

So the `rsb`is subtracting -1 from the result (which is the same as adding 1) if the dividend was negative.

Let's say your dividend is `-60`. With just the multiply and shift you'd get the result `-7`, so it subtracts -1 to get the expected result of `-6`.
Reply

#3
> After that, however, it subtracted (dividend/2^31) from the result.

Actually, it subtracts `dividend >> 31`, which is `-1` for negative `dividend`, and 0 for non-negative dividend, when right-shifting negative integers is arithmetic right-shift (and `int` is 32 bits wide).

0x6666667 = (2^34 + 6)/10

So for `x < 0`, we have, writing `x = 10*k + r` with `-10 < r <= 0`,

0x66666667 * (10*k+r) = (2^34+6)*k + (2^34 + 6)*r/10 = 2^34*k + 6*k + (2^34+6)*r/10

Now, arithmetic right shift of negative integers yields the floor of `v / 2^n`, so

(0x66666667 * x) >> 34

results in

k + floor((6*k + (2^34+6)*r/10) / 2^34)

So we need to see that

-2^34 < 6*k + (2^34+6)*r/10 < 0

The right inequality is easy, both `k` and `r` are non-positive, and not both are 0.

For the left inequality, a bit more analysis is needed.

r >= -9

so the absolute value of `(2^34+6)*r/10` is at most `2^34+6 - (2^34+6)/10`.

|k| <= 2^31/10,

so `|6*k| <= 3*2^31/5`.

And it remains to verify that

6 + 3*2^31/5 < (2^34+6)/10
1288490194 < 1717986919

Yup, true.
Reply



Forum Jump:


Users browsing this thread:
1 Guest(s)

©0Day  2016 - 2023 | All Rights Reserved.  Made with    for the community. Connected through