Wednesday, May 27, 2009

Code gems - Absolute value function

Most of the time, little bits of code are straight-forward. But every machine I've worked on has little bits of code sequences that are very clever. Sometimes the code has instructions that are rarely seen and other times the instructions may be in odd ways. The older the architecture or the more complex the instruction set, the more chance you have of finding these little gems.

This post is about the ABS function in the Intel x86 architecture.

One of the simpler functions to code would have to be the absolute value function. Given the value X, the absolute value would be defined as

int abs1( int x)
if( x < 0) x= -x;

Pretty simple. This would easily translate into x86 assembler like so:
(Assume register eax has the value to be operated on in these examples)
(I have left out the prologue and epilogue from the function which wouldn't exist if the function were in-lined)
(The numbers in parens are the number of code bytes)

TEST EAX,EAX ; (2) compare eax with zero
JGE SKIP1 ; (2) jump if eax >= zero
NEG EAX ; (2) negate eax

While this is not a lot of code, it has a jump instruction which on many processors can be expensive. Some recent processors can cut down on this expense if they can predict the jump, but it would be nice to find a solution with no jumps.

Let's look at a different way of doing this:

int abs2( int x )
int temp= (x>>31); // assumes a 32 bit integer
return (x ^ temp ) - temp;

It helps to remember that the expression "x>>31" does an arithmetic right shift of x by 31 bits and that "x ^ temp" means "x exclusive-or temp". The value of temp has either the value of 0 if x is positive or 0xFFFFFFFF (or -1) if x is negative.

At this point, you may want to remember that in the 2-complement notation that -x = ~x+1.

So, if x is positive, temp will be 0 and x^temp will just be x and the function returns x.

If x is negative, temp will be -1, x^temp would be the same at ~x so the expression is the same as ~x - (-1) or ~x+1 which is the same as -x.

The abs2 function won't generate great code, but it will avoid any jumps:

MOV ECX,EAX ; (2) copy x to ecx
SAR ECX,1FH ; (3) sign extend ecx thru all
; of ecx ->temp
XOR EAX,ECX ; (2) x ^ temp
SUB EAX,ECX ; (2) (x ^ temp) - temp

Not too bad, but even this can be improved upon:

CDQ ; (1) extend the sign of eax
; into edx (temp)
XOR EAX,EDX ; (2) exclusive-xor of temp into x
SUB EAX,EDX ; (2) subtract temp from (x^temp)

The CDQ instruction extends the sign of EAX into EDX. It is a 1 byte instruction that dates back to at least the 8086.

If your numbers are in signed-magnitude or 1-complement form, then the sign bit just needs to be turned off. Floating point formats typically have an explicit sign bit, so the absolute function is easier on these formats. But these formats have other corner cases to worry about like two forms of zero which makes checking for equality a bit tricky.

I found this implementation buried in ntdll:

MOV EAX,ECX ; (2) make copy of X into EAX
NEG EAX ; (2) negate X
TEST ECX,ECX ; (2) what sign does X have?
CMOVNS EAX,ECX ; (3) conditional move X to EAX if
; X is positive

This code also has the feature that there are no jumps, but it takes more code. And even though the CMOV instructions have been around for more than 10 years on Intel processors, there seems to be some question as to whether or not they improve performance. It would be interesting to know how this code was generated.

It is worth noting that there is one negative value that the absolute function with return a negative value for. This value is the most negative value e.g. 0x80000000. This could cause problems if the code assumes that values that have gone thru the absolute function are in the positive number range.


No comments:

Post a Comment