I've often noticed gcc converting multiplications into shifts in the executable. Something similar might happen when multiplying an `int` and a `float`. For example, `2 * f` might simply increment the exponent of `f` by 1, saving some cycles. Do compilers, perhaps if one requests them to do so (e.g. via `-ffast-math`), in general do it?

Are compilers generally smart enough to do this, or do I need to do it myself using the `scalb*()` or `ldexp()`/`frexp()` function family?

It's not about compilers or compiler writers not being smart. It's more about obeying the standard and producing all the necessary "side effects" such as Infs, NaNs, and denormals.

It can also be about *not* producing side effects that are not called for, such as reading memory. But I do recognize that it can be faster in some circumstances.

> For example, `2 * f` might simply increment the exponent of `f` by 1, saving some cycles.

This simply isn't true.

First, you have too many corner cases: zero, infinity, NaN, and denormals. Then you have the performance issue.

*The misunderstanding is the assumption that incrementing the exponent would be faster than doing a multiplication. It is not.*

If you look at the hardware instructions, there is no direct way to increment the exponent. So what you need to do instead is:

- Bitwise convert into integer.
- Increment the exponent.
- Bitwise convert back to floating-point.
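The three steps above can be sketched in C roughly as follows (assuming an IEEE 754 binary64 `double`; this deliberately ignores the zero/denormal/Inf/NaN corner cases already mentioned):

```c
#include <stdint.h>
#include <string.h>

/* Naive "double by incrementing the exponent".
   Only valid for values that are normal both before and after doubling. */
static double double_via_exponent(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* bitwise convert to integer */
    bits += (uint64_t)1 << 52;        /* +1 in the 11-bit exponent field */
    memcpy(&x, &bits, sizeof bits);   /* bitwise convert back */
    return x;
}
```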

There is generally a medium to large latency for moving data between the integer and floating-point execution units. So in the end, this "optimization" becomes much worse than a simple floating-point multiply.

So the reason why the compiler doesn't do this "optimization" is because it isn't any faster.

Common floating-point formats, particularly IEEE 754, do not store the exponent as a simple integer, and treating it as an integer will not produce correct results.

In 32-bit float or 64-bit double, the exponent field is 8 or 11 bits, respectively. The exponent codes 1 to 254 (in float) or 1 to 2046 (in double) do act like integers: If you add one to one of these values and the result is one of these values, then the represented value doubles. However, adding one fails in these situations:

- The initial value is 0 or subnormal. In this case, the exponent field starts at zero, and adding one to it adds 2^-126 (in float) or 2^-1022 (in double) to the number; it does not double the number.
- The initial value exceeds 2^127 (in float) or 2^1023 (in double). In this case, the exponent field starts at 254 or 2046, and adding one to it changes the number to a NaN; it does not double the number.
- The initial value is infinity or a NaN. In this case, the exponent field starts at 255 or 2047, and adding one to it wraps it to zero (and is likely to overflow into the sign bit). The result is zero or a subnormal but should be infinity or a NaN, respectively.

(The above is for positive signs. The situation is symmetric with negative signs.)

As others have noted, some processors do not have facilities for manipulating the bits of floating-point values quickly. Even on those that do, the exponent field is not isolated from the other bits, so you typically cannot add one to it without overflowing into the sign bit in the last case above.

Although some applications can tolerate shortcuts such as neglecting subnormals, NaNs, or even infinities, it is rare that an application can ignore zero. Since adding one to the exponent fails to handle zero properly, it is not usable.

On modern CPUs, multiplication typically has one-per-cycle throughput and low latency. If the value is already in a floating-point register, there's no way you'll beat that by juggling it around to do integer arithmetic on the representation. If it's in memory to begin with, and if you're assuming neither the current value nor the correct result would be zero, denormal, NaN, or infinity, then it *might* be faster to perform something like

```
addl $0x100000, 4(%eax) # x86 asm example
```

to multiply by two; the only time I could see this being beneficial is if you're operating on a whole array of floating-point data that's bounded away from zero and infinity, and scaling by a power of two is the only operation you'll be performing (so you don't have any existing reason to be loading the data into floating point registers).
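A C sketch of that array case might look like the following (assuming IEEE 754 binary32 floats and values that are normal both before and after doubling; the element names are for illustration only):

```c
#include <stdint.h>
#include <string.h>

/* Scale an array of floats by two with one integer add per element.
   Invalid for zeros, subnormals, infinities, NaNs, or values that
   overflow when doubled. */
static void double_array(float *a, size_t n) {
    for (size_t i = 0; i < n; i++) {
        uint32_t bits;
        memcpy(&bits, &a[i], sizeof bits);
        bits += UINT32_C(1) << 23;   /* +1 in the 8-bit exponent field */
        memcpy(&a[i], &bits, sizeof bits);
    }
}
```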

It may be useful for embedded-systems compilers to have a special scale-by-power-of-two pseudo-op that the code generator could translate in whatever fashion is optimal for the machine in question, since on some embedded processors working on the exponent may be an order of magnitude faster than a full power-of-two multiplication. But on the embedded micros where multiplication is slowest, a compiler could probably achieve a bigger performance boost by having the floating-point-multiply routine check its arguments at run time so as to skip over parts of the mantissa that are zero.

Actually, this is what happens in the hardware.

The `2` is also passed into the FPU as a floating-point number, with a significand of 1.0 and an exponent of 1 (i.e. the value 2^1). For the multiplication, the exponents are added and the significands multiplied.

Given that there is dedicated hardware to handle the complex case (multiplying with values that are not powers of two), and the special case is not handled any worse than it would be using dedicated hardware, there is no point in having additional circuitry and instructions.

There is a previous Stack Overflow question about multiplication by powers of 2. The consensus there, and the actual implementations, showed that unfortunately there is no current way to be more efficient than a standard multiplication.

If you think that multiplying by two means increasing the exponent by 1, think again. Here are the possible cases for IEEE 754 floating-point arithmetic:

Case 1: Infinity and NaN stay unchanged.

Case 2: Floating-point numbers with the largest possible finite exponent are changed to Infinity by increasing the exponent and setting the mantissa to zero; the sign bit is preserved.

Case 3: Normalised floating-point numbers with exponent less than the maximum possible exponent have their exponent increased by one. Yippee!!!

Case 4: Denormalised floating-point numbers with the highest mantissa bit set have their exponent increased by one, turning them into normalised numbers.

Case 5: Denormalised floating-point numbers with the highest mantissa bit cleared, including +0 and -0, have their mantissa shifted to the left by one bit position, leaving the exponent unchanged.

I very much doubt that a compiler producing integer code that handles all these cases correctly would be anywhere near as fast as the floating-point hardware built into the processor. And it's only suitable for multiplication by 2.0; for multiplication by 4.0 or 0.5, a whole new set of rules applies. For the case of multiplication by 2.0, you might also try to replace x * 2.0 with x + x, and many compilers do. They do it because a processor might, for example, be able to perform one addition and one multiplication at the same time, but not two operations of the same kind. So *sometimes* you would prefer x * 2.0, and sometimes x + x, depending on what other operations need doing at the same time.
