[M3commit] CVS Update: cm3

Jay Krell jkrell at elego.de
Wed Feb 24 13:28:00 CET 2010


CVSROOT:	/usr/cvs
Changes by:	jkrell at birch.	10/02/24 13:28:00

Modified files:
	cm3/m3-sys/m3back/src/: Codex86.i3 Codex86.m3 M3x86.m3 

Log message:
	another little helper function bites the dust, at least for NT386
	
	replace
	
	size_t __stdcall set_member(size_t elt, size_t* set)
	{
	register size_t word = elt / SET_GRAIN;
	register size_t bit  = elt % SET_GRAIN;
	return (set[word] & (((size_t)1) << bit)) != 0;
	}
	
	with the bt instruction, which does it all and leaves the result in
	the carry flag (it then takes some gymnastics to get the result out
	of the carry flag)
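
A rough C model of why bt can stand in for the helper. This is only a sketch: it assumes 32-bit set words (SET_GRAIN = 32, as on NT386) and a non-negative bit offset, and the names set_member_ref and bt_carry_model are made up for illustration; they are not in the m3back sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 32-bit set words, as on NT386 where size_t is 32 bits. */
#define SET_GRAIN 32u

/* The helper being replaced, restated with fixed-width types. */
static uint32_t set_member_ref(uint32_t elt, const uint32_t *set)
{
    assert(set != NULL);
    uint32_t word = elt / SET_GRAIN;
    uint32_t bit  = elt % SET_GRAIN;
    return (set[word] & ((uint32_t)1 << bit)) != 0;
}

/* Model of the carry flag after "bt dword ptr [set], elt" with a
   non-negative bit offset in a register: the upper bits of the offset
   select the dword, the low 5 bits select the bit inside it. */
static uint32_t bt_carry_model(const uint32_t *set, uint32_t elt)
{
    assert(set != NULL);
    return (set[elt >> 5] >> (elt & 31u)) & 1u;
}
```

Under those assumptions the two functions agree for every element, which is the whole point of the replacement: one instruction does the divide, the modulo, the shift and the test.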
	
	before:
	0000003C: 56                 push        esi
	0000003D: 53                 push        ebx
	0000003E: FF 15 30 00 00 00  call        dword ptr [T$111+30h]   Shouldn't this be a direct call, save a byte?
	00000044: 89 45 F4           mov         dword ptr [ebp-0Ch],eax
	11 bytes, 4 instructions (plus the function!)
	
	after, attempt #1
	0000003C: 0F A3 1E           bt          dword ptr [esi],ebx
	0000003F: 0F 92 45 F0        setb        byte ptr [ebp-10h]
	00000043: 33 D2              xor         edx,edx
	00000045: 8A 55 F0           mov         dl,byte ptr [ebp-10h]
	00000048: 89 55 F4           mov         dword ptr [ebp-0Ch],edx
	15 bytes, 5 instructions
	so many to extract the carry!
	
	Probably a win, but larger.
	
	Attempt #2:
	
	Let's try a different approach to capturing the result:
	
	000000D1: 0F A3 1E           bt          dword ptr [esi],ebx
	000000D4: 1B D2              sbb         edx,edx
	000000D6: F7 DA              neg         edx
	000000D8: 89 55 F8           mov         dword ptr [ebp-8],edx
	
	10 bytes, 4 instructions
	(though I think the old approach could get by with 10, using a direct call)
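
The sbb/neg pair can be modelled in C roughly as follows. This is a sketch of the arithmetic only, not of the code generator; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* "sbb r, r" computes r - r - CF, i.e. 0 - CF, whatever r held before.
   The result is 0x00000000 when CF = 0 and 0xFFFFFFFF when CF = 1. */
static uint32_t sbb_same_reg(uint32_t r, uint32_t cf)
{
    assert(cf <= 1u);
    return r - r - cf;
}

/* "neg r" computes 0 - r, turning 0xFFFFFFFF into 1 and leaving 0 as 0. */
static uint32_t neg32(uint32_t x)
{
    return 0u - x;
}

/* sbb edx,edx ; neg edx: the carry flag ends up in edx as 0 or 1. */
static uint32_t carry_to_bool(uint32_t old_edx, uint32_t cf)
{
    return neg32(sbb_same_reg(old_edx, cf));
}
```

Note that the old contents of the register do not matter, so nothing has to be cleared before the bt.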
	
	We can probably replace setcc other places similarly (see below).
	
	I had tried:
	xor eax, eax
	adc eax, 0
	
	That didn't work. xor clears the carry flag, so the adc always adds 0.
	We could make that work by reserving and clearing the register before the bt.
	However, that comes to 11 bytes instead of 10,
	and sbb, neg is what Visual C++ generates for:
	int F(unsigned a, unsigned b) { return a < b; }
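
A small C model of the xor/adc attempt, to show where the carry goes. It is a sketch that tracks only eax and CF; the Regs type and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Just enough machine state for this example. */
typedef struct { uint32_t eax; uint32_t cf; } Regs;

/* xor eax,eax: zeroes eax and, like every xor, clears CF. */
static Regs op_xor_eax_eax(Regs r)
{
    r.eax = 0u;
    r.cf = 0u;
    return r;
}

/* adc eax,0: eax = eax + 0 + CF, with CF set on unsigned overflow. */
static Regs op_adc_eax_0(Regs r)
{
    uint32_t sum = r.eax + r.cf;
    r.cf = (sum < r.eax);
    r.eax = sum;
    return r;
}

/* bt ; xor eax,eax ; adc eax,0 -- the xor wipes out the bt result. */
static uint32_t capture_xor_after_bt(uint32_t cf_from_bt)
{
    assert(cf_from_bt <= 1u);
    Regs r = { 0xCCCCCCCCu, cf_from_bt };
    r = op_xor_eax_eax(r);
    r = op_adc_eax_0(r);
    return r.eax;
}

/* xor eax,eax ; bt ; adc eax,0 -- clearing the register first works. */
static uint32_t capture_xor_before_bt(uint32_t cf_from_bt)
{
    assert(cf_from_bt <= 1u);
    Regs r = { 0xCCCCCCCCu, 0u };
    r = op_xor_eax_eax(r);
    r.cf = cf_from_bt;          /* the bt executes here */
    r = op_adc_eax_0(r);
    return r.eax;
}
```

The first ordering always yields 0; the second yields the carry, but only if the register can be reserved ahead of the bt.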
	
	(further note: > is < with the operands reversed, <= is > with inc
	instead of neg, and, importantly, == and != are xor, op, sete/setne regL,
	which should be a nice win over our current strategy if we can
	reserve/xor the register ahead of op)
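
The unsigned comparisons in the note above can be sketched in C like this. It is my reading of the note, taking <= as the reversed compare followed by inc; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* CF after "cmp x, y" for unsigned operands: set when x < y (borrow). */
static uint32_t borrow(uint32_t x, uint32_t y)
{
    return x < y;
}

/* sbb r,r after the compare: 0 or 0xFFFFFFFF. */
static uint32_t sbb_mask(uint32_t cf)
{
    assert(cf <= 1u);
    return 0u - cf;
}

/* a < b:  cmp a,b ; sbb r,r ; neg r */
static uint32_t lt_u(uint32_t a, uint32_t b)
{
    return 0u - sbb_mask(borrow(a, b));
}

/* a > b:  same sequence with the operands swapped. */
static uint32_t gt_u(uint32_t a, uint32_t b)
{
    return lt_u(b, a);
}

/* a <= b: cmp b,a ; sbb r,r ; inc r  (0xFFFFFFFF + 1 wraps to 0) */
static uint32_t le_u(uint32_t a, uint32_t b)
{
    return sbb_mask(borrow(b, a)) + 1u;
}
```

Each result is exactly 0 or 1, so it can be stored as a Modula-3 BOOLEAN without a setcc.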
	
	Note that bt now gives us the various addressing modes (where
	set_singleton did not). However, I couldn't get them to work, probably
	because I am not encoding them with the right amount of indirection,
	so I force reg/reg addressing.
	Not ideal, but probably still much better.
	
	Note that set_member is pretty heavily used, though none of these changes
	affects sets that fit in 32 bits. We really should try to improve the gcc backend?



