[M3commit] CVS Update: cm3
Jay K
jay.krell at cornell.edu
Tue Mar 2 14:18:52 CET 2010
The change is roughly the attached, though I know the case/whitespace diff is in there too.
It's pretty hard to find this stuff via cvsweb, the changelog, etc.
- Jay
> Date: Tue, 2 Mar 2010 13:52:29 +0000
> To: m3commit at elegosoft.com
> From: jkrell at elego.de
> Subject: [M3commit] CVS Update: cm3
>
> CVSROOT: /usr/cvs
> Changes by: jkrell at birch. 10/03/02 13:52:29
>
> Modified files:
> cm3/m3-sys/m3back/src/: Codex86.i3 Codex86.m3 M3x86.m3
> Stackx86.i3 Stackx86.m3
>
> Log message:
> somewhat consolidate shifting code
>
> no more "binOpWithShiftCount"
>
> inline all 64bit shifts, even when no constants involved
> (constants get inlined better)
>
> worst cases:
>
> shift right 64 (from the AMD manual):
> 00002217: 0F AD D3 shrd ebx,edx,cl
> 0000221A: D3 EA shr edx,cl
> 0000221C: F6 C1 20 test cl,20h
> 0000221F: 74 04 je 00002225
> 00002221: 8B DA mov ebx,edx
> 00002223: 33 D2 xor edx,edx
> 00002225:
>
> shift left 64 (from the AMD manual):
> 0000244B: 0F A5 DA shld edx,ebx,cl
> 0000244E: D3 E3 shl ebx,cl
> 00002450: F6 C1 20 test cl,20h
> 00002453: 74 04 je 00002459
> 00002455: 8B D3 mov edx,ebx
> 00002457: 33 DB xor ebx,ebx
> 00002459:
>
> Those sequences are quite subtle.
> I'm not sure I understand.
> Shift by between 32 and 63:
> The "wrong" register is shifted, but it is done modulo-32,
> so the correct result is had, in the "wrong" register
> then the register is moved to its correct place.
> That is:
> edx:eax << 33
> is straightforwardly, after testing if the shift count is >= 32:
> edx = eax
> edx <<= 1 (shift count - 32)
> eax = 0
> however the above does the shift before the move,
> since the modulo makes it correct:
> eax <<= (33 % 32) which is eax <<= 1
> edx = eax
> eax = 0
>
> The way it detects >=32 is very subtle to me.
> It checks if bit 5 of the count (value 32, the 20h in the test) is set.
> If it is not set, the work is deemed done.
> Any value in 32-63 inclusive has it set, and gets shifted an extra 32,
> via mov and xor.
> Any value in 0-31 has it clear, done.
> In the case of exactly 32, the first two instructions shift by 0 (32 modulo 32),
> and the mov and xor do the whole shift.
>
> I guess.
> I didn't come up with this, it is in the AMD optimization manual.
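
The reasoning above can be checked with a small model of the two sequences. This is only a sketch in Python, not the m3back code: the function names are made up for illustration, and it assumes the documented x86 behaviour that 32-bit shld/shrd/shl/shr use only the low 5 bits of cl.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def shl64_x86(hi, lo, count):
    """Hypothetical model of: shld hi,lo,cl / shl lo,cl / test cl,20h / je / mov hi,lo / xor lo,lo."""
    cl = count & 0xFF
    n = cl & 31                                       # hardware masks the count to 5 bits
    if n:
        hi = ((hi << n) | (lo >> (32 - n))) & MASK32  # shld hi,lo,cl
        lo = (lo << n) & MASK32                       # shl lo,cl
    if cl & 0x20:                                     # test cl,20h
        hi, lo = lo, 0                                # mov hi,lo ; xor lo,lo
    return hi, lo

def shr64_x86(hi, lo, count):
    """Hypothetical model of: shrd lo,hi,cl / shr hi,cl / test cl,20h / je / mov lo,hi / xor hi,hi."""
    cl = count & 0xFF
    n = cl & 31
    if n:
        lo = ((lo >> n) | (hi << (32 - n))) & MASK32  # shrd lo,hi,cl
        hi = hi >> n                                  # shr hi,cl
    if cl & 0x20:
        lo, hi = hi, 0
    return hi, lo

def check(value):
    """Compare both models against plain 64-bit shifts for every count in 0..63."""
    hi, lo = value >> 32, value & MASK32
    for count in range(64):
        h, l = shl64_x86(hi, lo, count)
        assert (h << 32) | l == (value << count) & MASK64
        h, l = shr64_x86(hi, lo, count)
        assert (h << 32) | l == value >> count
    return True
```

Under these assumptions the model agrees with a plain 64-bit shift for every count from 0 to 63. A count of 64 has bit 5 clear and is 0 modulo 32, so the sequence leaves the value unchanged; that case has to be handled by the caller.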
>
> weird shift 32: (no change)
> 00006CD4: 83 F9 00 cmp ecx,0
> 00006CD7: 7D 0F jge 00006CE8
> 00006CD9: F7 D9 neg ecx
> 00006CDB: 83 F9 20 cmp ecx,20h
> 00006CDE: 7D 04 jge 00006CE4
> 00006CE0: D3 EB shr ebx,cl
> 00006CE2: EB 0B jmp 00006CEF
> 00006CE4: 33 DB xor ebx,ebx
> 00006CE6: EB 07 jmp 00006CEF
> 00006CE8: 83 F9 20 cmp ecx,20h
> 00006CEB: 7D F7 jge 00006CE4
> 00006CED: D3 E3 shl ebx,cl
> 00006CEF:
>
> weird shift 64:
> 000071E9: 83 F9 00 cmp ecx,0
> 000071EC: 7D 1D jge 0000720B
> 000071EE: F7 D9 neg ecx
> 000071F0: 83 F9 40 cmp ecx,40h
> 000071F3: 7D 10 jge 00007205
> 000071F5: 0F AD F3 shrd ebx,esi,cl
> 000071F8: D3 EE shr esi,cl
> 000071FA: F6 C1 20 test cl,20h
> 000071FD: 74 04 je 00007203
> 000071FF: 8B DE mov ebx,esi
> 00007201: 33 F6 xor esi,esi
> 00007203: EB 19 jmp 0000721E
> 00007205: 33 DB xor ebx,ebx
> 00007207: 33 F6 xor esi,esi
> 00007209: EB 13 jmp 0000721E
> 0000720B: 83 F9 40 cmp ecx,40h
> 0000720E: 7D F5 jge 00007205
> 00007210: 0F A5 DE shld esi,ebx,cl
> 00007213: D3 E3 shl ebx,cl
> 00007215: F6 C1 20 test cl,20h
> 00007218: 74 04 je 0000721E
> 0000721A: 8B F3 mov esi,ebx
> 0000721C: 33 DB xor ebx,ebx
> 0000721E:
>
> As to what shift by FIRST(INTEGER) does, I need to check (but notice that weird shift 32 isn't changed).
> We might as well compare to -64 first, before the negate: same instruction count, more
> obviously correct.
> There are a few extra instructions in weird shift 64: the parts that
> shift by 32 or more, or by 64 or more, have common pieces that
> can be shared ("tail merged"), and there is a branch to a jmp.
>
> We should see about optimizing the weird shift 64 case.
> A few instructions can easily be saved.
>
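
The FIRST(INTEGER) question can be explored with a rough model of the weird shift 64 control flow. This is a sketch under assumptions, not the generated code: the names are hypothetical, the inline shift is reduced to a plain shift by the low 6 bits of cl, and the `compare_before_negate` variant is one guess at what "compare to -64 before the negate" would look like.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    """Interpret the low 32 bits of x as a signed 32-bit register value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def shift_core(value, cl, left):
    """Stand-in for the inline sequence, which effectively uses bits 0-5 of cl."""
    n = cl & 63
    return (value << n) & MASK64 if left else value >> n

def weird_shift64(value, count, compare_before_negate=False):
    """Hypothetical model: positive count shifts left, negative shifts right."""
    ecx = to_signed32(count)
    if ecx >= 0:                        # cmp ecx,0 / jge
        if ecx >= 64:                   # cmp ecx,40h / jge
            return 0
        return shift_core(value, ecx & 0xFF, left=True)
    if compare_before_negate:
        if ecx <= -64:                  # assumed variant: cmp ecx,-64 / jle
            return 0
        ecx = -ecx                      # neg ecx, now known to be in 1..63
    else:
        ecx = to_signed32(-ecx)         # neg ecx wraps: FIRST(INTEGER) stays negative
        if ecx >= 64:                   # cmp ecx,40h / jge (signed compare)
            return 0
    return shift_core(value, ecx & 0xFF, left=False)
```

In this model, negating FIRST(INTEGER) leaves it negative, so the signed compare against 40h does not branch and the value falls through to a shift by cl = 0, coming back unchanged instead of zero. Comparing to -64 before the negate gives zero for that input.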
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 2.txt
URL: <http://m3lists.elegosoft.com/pipermail/m3commit/attachments/20100302/a76d3e91/attachment-0002.txt>