[M3devel] optimizing for size or speed?

Jay K jay.krell at cornell.edu
Fri Mar 5 02:00:37 CET 2010


I still find these decisions difficult,
especially when the choice is whether to call a function or not,
and especially for "builtin" stuff like 64-bit math and set operations.
 
 - Jay

________________________________
> From: jay.krell at cornell.edu
> To: hendrik at topoi.pooq.com; m3devel at elegosoft.com
> Date: Thu, 11 Feb 2010 12:41:12 +0000
> Subject: Re: [M3devel] optimizing for size or speed?
>
> cache -- good point I had forgotten, thanks.
>
> Still:
>
> use the divide instruction, which is smallest, or multiply by the reciprocal (which is generally a multiply and a shift)?
>
> (Given a 32x32=>64 multiply operation, which x86 has: the one-operand mul/imul forms leave the 64-bit product in edx:eax. The two-operand imul form gives the truncated 32x32=>32 result.)
>
> Any 32bit division by a constant can be optimized this way and every C compiler knows it.
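
A minimal C sketch of the reciprocal trick described above, for unsigned 32-bit division by 10. The function names are made up for illustration, and the constant is the commonly used one for a divisor of 10 (other divisors need their own constant and shift). This is not what m3back emits, just the arithmetic it would be choosing between:

```c
#include <stdint.h>

/* Plain divide: a compiler optimizing for size can emit a single div. */
uint32_t div10_divide(uint32_t x)
{
    return x / 10u;
}

/* Multiply by a scaled reciprocal instead.
 * 0xCCCCCCCD is ceil(2^35 / 10). The 32x32=>64 product is shifted
 * right by 35, which gives floor(x / 10) for every 32-bit x. */
uint32_t div10_reciprocal(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}
```

On 32-bit x86 the second form is roughly a mul plus a shift of edx, so it costs more bytes than a div but typically fewer cycles.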
>
> multiply by a constant using the multiply instruction, or decompose into some adds?
>
> The AMD64 optimization guide suggests speed optimizations: it gives an instruction sequence for multiplication by every constant up to 32. Some are just to use mul, but many are one or two other instructions.
>
>
>
> Multiply by 5 is:
>
> lea eax,[eax+eax*4]
>
>
>
> Multiply by 10 is:
>
> lea eax,[eax+eax*4]
>
> add eax,eax
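
For reference, the same two decompositions written out in C. This is only a sketch with hypothetical function names; each expression mirrors one of the instructions quoted above, and unsigned arithmetic wraps modulo 2^32 just as the register does:

```c
#include <stdint.h>

/* x*5 computed as x + x*4, the shape of: lea eax,[eax+eax*4] */
uint32_t mul5(uint32_t x)
{
    return x + (x << 2);
}

/* x*10 computed as (x*5) + (x*5): the lea, then add eax,eax */
uint32_t mul10(uint32_t x)
{
    uint32_t t = x + (x << 2);
    return t + t;
}
```

Whether a compiler picks these forms or a single imul is exactly the size-versus-speed decision being discussed.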
>
> The AMD64 manual even advises inlining 64-bit shifts by a non-constant amount.
>
> But I can't get Visual C++ to do that. It always calls a function.
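
To show what an inlined variable 64-bit shift has to compute on a 32-bit target, here is a rough C sketch of a left shift built from two 32-bit halves. The function name is hypothetical, masking the count to 0..63 is an assumption of this sketch, and this is not the instruction sequence from the AMD manual, only the logic such a sequence implements:

```c
#include <stdint.h>

/* Shift the 64-bit value hi:lo left by a run-time count,
 * using only 32-bit shifts on the halves. */
uint64_t shl64_from_halves(uint32_t lo, uint32_t hi, unsigned count)
{
    count &= 63;                     /* assumption: count is taken modulo 64 */
    if (count >= 32) {
        /* Everything in lo moves into hi; lo becomes zero. */
        hi = lo << (count - 32);
        lo = 0;
    } else if (count != 0) {
        /* Bits shifted out of lo are carried into hi. */
        hi = (hi << count) | (lo >> (32 - count));
        lo <<= count;
    }
    return ((uint64_t)hi << 32) | lo;
}
```

The branch on count >= 32 is what makes the inline version noticeably larger than a call to a helper function.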
>
> - Jay
>
>
>> Date: Thu, 11 Feb 2010 07:03:09 -0500
>> From: hendrik at topoi.pooq.com
>> To: m3devel at elegosoft.com
>> Subject: Re: [M3devel] optimizing for size or speed?
>>
>> On Thu, Feb 11, 2010 at 08:53:08AM +0000, Jay K wrote:
>>>
>>> There are all kinds of equivalent code sequences.
>>> For the maintainer of m3back to choose among.
>>
>> In case of doubt, go for size; size all by itself costs time in cache
>> misses, paging, etc.
>>
>> Besides, it's possible to measure space.
>>
>> -- hendrik

