<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 10pt;
font-family:Verdana
}
--></style>
</head>
<body class='hmmessage'>
Cache: good point, I had forgotten. Thanks.<BR>
<BR>
<BR>
Still:<BR>
Use the divide instruction, which is smallest, or multiply by the reciprocal (which is generally a multiply and a shift)?<BR>
(This needs a 32x32=>64 multiply. On x86 the one-operand mul and imul give the full 64-bit product in edx:eax; the two- and three-operand imul forms keep only the low 32 bits.)<BR>
Any 32-bit division by a constant can be optimized this way, and mainstream optimizing C compilers do it.<BR>
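<BR>
To make the reciprocal trick concrete, here is a minimal sketch in C for unsigned division by 10. The function name is made up for illustration, and the code assumes unsigned int is 32 bits and unsigned long long is 64 bits. The constant 0xCCCCCCCD is 2^35/10 rounded up, so it and the shift count of 35 belong to this one divisor only.<BR>
<pre>
```c
/* Sketch only: assumes 32-bit unsigned int and 64-bit unsigned long long. */
unsigned div10_by_reciprocal(unsigned x)
{
    /* Full 32x32=>64 product of x and ceil(2^35 / 10). */
    unsigned long long product = (unsigned long long)x * 0xCCCCCCCDull;

    /* Dropping the low 35 bits leaves x / 10 for every 32-bit x. */
    return (unsigned)(product >> 35);
}
```
</pre>
Other divisors need their own constant and shift count, and some need an extra correction step after the multiply.<BR>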
<BR>
<BR>
Multiply by a constant using the multiply instruction, or decompose it into a few adds?<BR>
The AMD64 optimization guide lists, as a speed optimization, an instruction sequence for multiplying by each constant up to 32. Some are just mul, but many are one or two other instructions.<BR>
<BR>
Multiply by 5 is:<BR>
lea eax,[eax+eax*4]<BR>
<BR>
Multiply by 10 is:<BR>
lea eax,[eax+eax*4]<BR>
add eax,eax<BR><BR>
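The same two sequences can be written out in C as a rough sketch. The function names are hypothetical, and unsigned int is assumed to be 32 bits, like eax. A compiler is free to turn either form into lea/add or back into a multiply, so this shows the arithmetic, not what will be emitted.<BR>
<pre>
```c
/* Sketch only: assumes 32-bit unsigned int, matching eax. */

/* x*5 computed as x + x*4, the value "lea eax,[eax+eax*4]" produces. */
unsigned mul5(unsigned x)
{
    return x + (x << 2);
}

/* x*10 computed as (x*5) + (x*5), the lea followed by "add eax,eax". */
unsigned mul10(unsigned x)
{
    unsigned t = x + (x << 2);
    return t + t;
}
```
</pre>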
<BR>
The AMD64 manual even advises inlining 64-bit shifts by a non-constant amount.<BR>
But I can't get Visual C++ to do that; it always calls a function.<BR>
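<BR>
For reference, here is a sketch in C of what an inlined variable left shift has to compute, assuming this is about 32-bit code where the 64-bit value lives in two 32-bit halves. The function name and argument layout are hypothetical, and this is not the manual's instruction sequence, only the logic it would implement. It assumes unsigned int is 32 bits and unsigned long long is 64 bits.<BR>
<pre>
```c
/* Sketch only: shift the 64-bit value hi:lo left by n, using 32-bit operations. */
unsigned long long shl64(unsigned hi, unsigned lo, unsigned n)
{
    n &= 63;                              /* keep the count in 0..63 */

    if (n >= 32) {
        /* The whole low half moves into the high half. */
        hi = lo << (n - 32);
        lo = 0;
    } else if (n != 0) {
        /* Bits shifted out of the low half carry into the high half. */
        hi = (hi << n) | (lo >> (32 - n));
        lo = lo << n;
    }

    /* Recombine the halves only to return the result as one value. */
    return ((unsigned long long)hi << 32) | lo;
}
```
</pre>
The n != 0 check matters because shifting a 32-bit value by 32 is undefined in C.<BR>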
<BR>
<BR> - Jay<BR><BR> <BR>> Date: Thu, 11 Feb 2010 07:03:09 -0500<BR>> From: hendrik@topoi.pooq.com<BR>> To: m3devel@elegosoft.com<BR>> Subject: Re: [M3devel] optimizing for size or speed?<BR>> <BR>> On Thu, Feb 11, 2010 at 08:53:08AM +0000, Jay K wrote:<BR>> > <BR>> > There are all kinds of equivalent code sequences.<BR>> > For the maintainer of m3back to chose among.<BR>> <BR>> In case of doubt, go for size; size all by itself costs time in cache <BR>> misses, paging, etc.<BR>> <BR>> Besides, it's possible to measure space.<BR>> <BR>> -- hendrik<BR> </body>
</html>