[M3devel] interface long, inlining large integer code, etc.

Tony Hosking hosking at cs.purdue.edu
Thu Mar 11 19:03:43 CET 2010


I'm going to resist this one.  It's not clear that it matters terribly much whether these utility routines are inlined or not.  If it is shown to be a performance bottleneck then perhaps.  But I suspect that there are very few programs that pass the procedures of the Word/Long interface around by value.

On 11 Mar 2010, at 09:45, Jay K wrote:

> I was thinking a bit more about this.
> Basically it can already be done -- we know the name of the function we are compiling.
> If the function is Long__Extract, Long__Times, etc., we can inline.
> If it isn't, we can assume the existence and interface of such functions, and call them.
>   Instead of calling anything in hand.c.
>   Maybe even give them custom calling conventions.
>   The front end knows stuff about interface Word/Long. It seems reasonable for m3back to also.
>  
>  
> The open question however is that there is no place to "hang" inlined versions of signed operations, what you might call Integer__Extract, Integer__Divide, LongInt__Div, LongInt__Times, LongInt__Mod, etc.
>  
>  
> Maybe they should be in C:\dev2\cm3.2\m3-libs\m3core\src\types?
>  
>  
> Similarly open question would be the set functions.
> RTHooks.SetLT?
> RTSet.LT?
> C:\dev2\cm3.2\m3-libs\m3core\src\types\set?
> C:\dev2\cm3.2\m3-libs\m3core\src\types\bigset?
>  (as opposed to word-sized set, besides that eq/ne are inlined by the frontend for "medium" sizes)
>  
>  
> Is it clear what my questions are?
>  
>  
> Let me try again.
> Consider Long.Rotate. Assume it takes a bunch of code.
> m3back could notice it is compiling Long__Rotate, and generate the code inline.
> Otherwise it can "know" such a function exists and generate a call to it.
> This is a reasonable seeming model that I just came up with.
>  
>  
> That does double the paths in m3back, granted.
>  
> For several such m3cg operations, there is already a natural function to treat this way -- stuff in interface Long.
>  
>  
> However, for a few operations, there is not already a place -- signed operations and operations on large sets.
>  
>  
> As well, there is a presumed question/answer/non-answer: everything works today.
> But should it be changed?
> Is it better to reduce/eliminate hand.c and instead teach the Modula-3 code in m3back how to generate assembly?
> "Better" how?
> More efficient? Maybe. The factor here is that hand.c may or may not be compiled with optimization, whereas m3back always optimizes to some extent. It generally generates code that is significantly better than unoptimized C and significantly worse than optimized C. It does pretty well, but it never inlines and only does a certain small class of smart things: it keeps values in registers somewhat, but it doesn't eliminate dead stores and doesn't unroll loops.
>  
>  
> Another factor is the principle of having C code or not.
> We vehemently agree that C code is preferred in m3core/unix.
> But otherwise many people here kind of dislike or discourage C on principle/philosophical grounds.
>  I'm not sure I agree, however by agreeing, I introduce an interesting technical challenge. :)
>  
>  
> Again, must we really call it interface long?
> Does that really "sound like":
>   INTEGER is to Word as LONGINT is to x?
>  
> LongWord maybe?
>  
>  
>  - Jay
> 
>  
> > From: jay.krell at cornell.edu
> > To: hosking at cs.purdue.edu
> > CC: m3commit at elegosoft.com
> > Subject: RE: [M3commit] CVS Update: cm3
> > Date: Thu, 11 Mar 2010 04:10:40 +0000
> > 
> > 
> > Long.LeftShift is not a great example. Sorry.
> > > Long.Shift, Long.Rotate, Long.Times, Long.Divide.
> > > Those are or "could be" large. Currently the latter
> > three all call functions and are therefore short.
> > 
> > 
> > > Integer.Multiply, Divide, Remainder, similarly uncertain.
> > They call functions, the functions aren't short.
> > 
> > 
> > I was also thinking, some of these functions, esp.
> > 64bit shift/rotate could be shorter as loops.
> > But I think not worthwhile.
> > 
> > 
> > To be clear, even though "parse.c" "never" generates
> > function calls (except for set operations), the gcc
> > backend itself often does. The functions are in libgcc.
> > 
> > 
> > You know..I find it hard to decide about compiler generated
> > function calls. Definitely for size optimization they
> > can be a win. But I just somehow vaguely don't like them.
> > In many cases, the decision is not difficult. set_singleton
> > and set_member are good examples. They could be inlined
> > and be very small. But Shift, Rotate, Multiply, Divide,
> > it gets less clear.
> > 
> > 
> > But no matter how large an inline form, I think
> > when interface word/long maps directly to these operations,
> > they should be inlined.
> > 
> > 
> > And then if there are places that don't inline, they should
> > call the "instantiations" in interface word/long.
> > 
> > 
> > And then, furthermore, we should consider an interface
> > > for "Integer.Divide/Multiply/Remainder"?
> > A place to park the possibly large code?
> > 
> > 
> > - Jay
> > 
> > 
> > ----------------------------------------
> > > From: hosking at cs.purdue.edu
> > > Date: Wed, 10 Mar 2010 21:01:54 -0500
> > > To: jay.krell at cornell.edu
> > > CC: m3commit at elegosoft.com
> > > Subject: Re: [M3commit] CVS Update: cm3
> > >
> > > I am confused. The intention is that Long.LeftShift is always inlined wherever it is called. Is the code for it really that hairy?
> > >
> > > On 10 Mar 2010, at 20:30, Jay K wrote:
> > >
> > >>
> > >> If there exists an inline expansion, and it is large, but there is a function whose job it is to hold the expansion, whether it is inline or not, and the backend was informed of this function's name, and whether or not it was currently producing it, the backend could generate the inline expansion for Long__LeftShift, but otherwise generate calls to it.
> > >>
> > >>
> > >> That is, e.g. m3cg.left_shift(type := int64|word64) could choose to either call Long__LeftShift or generate the body of Long__LeftShift. Not based on some "is inlining profitable" heuristic, but specifically on whether it is told it is generating the function Long__LeftShift or not.
> > >>
> > >>
> > >> That is, there's no point in having a C function m3_left_shift64 or somesuch, and having Long__LeftShift call it. Instead a backend should generate the body of Long__LeftShift inline when told it is generating that function, vs. generating a call to that function when it is otherwise asked to do a 64bit leftshift if it decides that inlining it every time is too large.
> > >>
> > >> Granted at least two things:
> > >> Given 64bit target, Long__LeftShift vs. Word__LeftShift is ambiguous.
> > >> And I'm inclined to just always inline anyway.
> > >>
> > >>
> > >> (I still don't like the term "Long" here. It doesn't convey unsigned.)
> > >>
> > >>
> > >> - Jay
> > >>
> > >>
> > >>
> > >> ________________________________
> > >>> Subject: Re: [M3commit] CVS Update: cm3
> > >>> From: hosking at cs.purdue.edu
> > >>> Date: Wed, 10 Mar 2010 19:57:17 -0500
> > >>> CC: m3commit at elegosoft.com
> > >>> To: jay.krell at cornell.edu
> > >>>
> > >>>
> > >>>
> > >>> Yes, someone can pass the function as a parameter.
> > >>>
> > >>> I don't understand the rest of what you are saying.
> > >>>
> > >>>
> > >>> Antony Hosking | Associate Professor | Computer Science | Purdue University
> > >>> 305 N. University Street | West Lafayette | IN 47907 | USA
> > >>> Office +1 765 494 6001 | Mobile +1 765 427 5484
> > >>>
> > >>> On 10 Mar 2010, at 17:27, Jay K wrote:
> > >>>
> > >>>
> > >>> Hey, should we maybe have a flag that indicates we are producing a function that is meant to provide and only provide exactly the functionality of the "one" m3cg call?
> > >>> You know, that'd be a great hint to:
> > >>>
> > >>> - definitely definitely definitely definitely inline
> > >>> - if we know this thing existed, we could use it instead of a .c file.
> > >>> - along with the function's name for the invocations that aren't the function?
> > >>>
> > >>>
> > >>> Why provide the function anyway? In case someone takes the address?
> > >>>
> > >>>
> > >>>
> > >>> - Jay
> > >>>
> > >>>
> > >>> ________________________________
> > >>> From: jay.krell at cornell.edu
> > >>> To: hosking at cs.purdue.edu
> > >>> Date: Mon, 8 Mar 2010 16:04:45 +0000
> > >>> CC: m3commit at elegosoft.com
> > >>> Subject: Re: [M3commit] CVS Update: cm3
> > >>>
> > >>>
> > >>> Fair enough.
> > >>>
> > >>>
> > >>>
> > >>> - Jay
> > >>>
> > >>>
> > >>>
> > >>> ________________________________
> > >>>
> > >>> From: hosking at cs.purdue.edu
> > >>> Date: Mon, 8 Mar 2010 10:29:00 -0500
> > >>> To: jay.krell at cornell.edu
> > >>> CC: m3commit at elegosoft.com
> > >>> Subject: Re: [M3commit] CVS Update: cm3
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> You can't call the support function Long__Extract because we already have a Long__Extract in m3core/src/word. It is the library representative of the front-end compiler-generated (inlined) Long.Extract.
> > >>>
> > >>>
> > >>>
> > >>> On 8 Mar 2010, at 07:30, Jay K wrote:
> > >>>
> > >>>
> > >>>
> > >>> diff attached
> > >>>
> > >>>
> > >>> If folks really want to use tables or function calls here, let me know.
> > >>>
> > >>>
> > >>> The data was historically problematic, though we worked out the problems.
> > >>> - It was initialized at runtime which has a race condition, fixed years ago.
> > >>> - It can't be "exported" on NT386 (which is the only platform that uses it) and must be statically linked.
> > >>>
> > >>> The data is still used by Word.Insert and Long.Insert is still a function, but those
> > >>> are my next targets.
> > >>>
> > >>>
> > >>> This is a case where the user does write a function call. Word.Extract or Long.Extract.
> > >>> (So the function should have been called Long__Extract.)
> > >>>
> > >>>
> > >>> - Jay
> > >>>
> > >>> Date: Mon, 8 Mar 2010 13:26:36 +0000
> > >>> To: m3commit at elegosoft.com
> > >>> From: jkrell at elego.de
> > >>> Subject: [M3commit] CVS Update: cm3
> > >>>
> > >>> CVSROOT: /usr/cvs
> > >>> Changes by: jkrell at birch. 10/03/08 13:26:36
> > >>>
> > >>> Modified files:
> > >>> cm3/m3-sys/m3back/src/: M3x86.m3 Stackx86.i3 Stackx86.m3
> > >>>
> > >>> Log message:
> > >>> "rewrite" extract to not use tables and always be inlined for 64bit
> > >>>
> > >>> equivalent C code:
> > >>>
> > >>> UT __stdcall extract(UT x, uint32 offset, uint32 count)
> > >>> {
> > >>>     x >>= offset;
> > >>>     x &= ~((~(UT)0) << count);
> > >>>     return x;
> > >>> }
> > >>>
> > >>> extract32
> > >>> 00000729: 8B4DEC MOV ECX tv.126[_m]
> > >>> 0000072C: 8B5DF4 MOV EBX tv.124[_a32]
> > >>> 0000072F: D3EB SHR EBX
> > >>> 00000731: 8BD1 MOV EDX ECX This is pointless and I'll try to remove.
> > >>> 00000733: 8B4DE4 MOV ECX tv.128[_n]
> > >>> 00000736: BEFFFFFFFF MOV ESI $4294967295 I'll look for smaller form of this. or esi -1?
> > >>> 0000073B: D3E6 SHL ESI
> > >>> 0000073D: F7D6 NOT ESI
> > >>> 0000073F: 23DE AND EBX ESI
> > >>>
> > >>> extract64
> > >>> 000008E4: 8B4DD8 MOV ECX tv.134[_m]
> > >>> 000008E7: 8B5DE8 MOV EBX tv.131[_a64]
> > >>> 000008EA: 8B55EC MOV EDX tv.131[_a64]+4
> > >>> 000008ED: 0FADD3 SHRD EBX EDX ECX
> > >>> 000008F0: D3EA SHR EDX
> > >>> 000008F2: F6C120 TEST ECX $32
> > >>> 000008F5: 7400 JE rel8 L.107
> > >>> 000008F7: 8BDA MOV EBX EDX
> > >>> 000008F9: 33D2 XOR EDX EDX
> > >>> set_label L.107
> > >>> 000008FB: 8BF1 MOV ESI ECX This is pointless and I'll try to remove.
> > >>> 000008FD: 8B4DD0 MOV ECX tv.136[_n]
> > >>> 00000900: BFFFFFFFFF MOV EDI $4294967295 I'll look for smaller form of this. or edi -1?
> > >>> 00000905: B8FFFFFFFF MOV EAX $4294967295 I'll look for smaller form of this. (heck, at least mov edi, eax)
> > >>> 0000090A: 0FA5F8 SHLD EAX EDI ECX
> > >>> 0000090D: D3E7 SHL EDI
> > >>> 0000090F: F6C120 TEST ECX $32
> > >>> 00000912: 7400 JE rel8 L.108
> > >>> 00000914: 8BC7 MOV EAX EDI
> > >>> 00000916: 33FF XOR EDI EDI
> > >>> set_label L.108
> > >>> 00000918: F7D7 NOT EDI
> > >>> 0000091A: F7D0 NOT EAX
> > >>> 0000091C: 23DF AND EBX EDI
> > >>> 0000091E: 23D0 AND EDX EAX
> > >>>
> > >>> having n or m and n (or just m? I think so.) be constant leads to better code
> > >>>
> > >>> some other small cleanup, like avoiding calling find twice,
> > >>> I don't see why it was that way
> > >>>
> > >>> <1.txt>
> > >>>
> > >>>
> > > 
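
The "equivalent C code" in the commit log above leaves UT undefined, and in ISO C the expression `(~(UT)0) << count` has undefined behavior when count equals the bit width of UT. A self-contained variant might look like the sketch below. It assumes UT is a 64-bit unsigned type and that callers guarantee offset + count <= 64; both are assumptions for illustration, not something the commit states. The second function, `shr64_by_halves` (a hypothetical name), shows the general technique visible in the extract64 listing: a 64-bit right shift built from two 32-bit halves, with a fix-up when bit 5 of the shift count is set. It is a sketch of that idea, not a transcription of what m3back emits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: UT in the commit log is an unsigned integer type;
   here it is fixed to 64 bits. */
typedef uint64_t UT;

/* Return `count` bits of x, starting at bit `offset` (bit 0 is the least
   significant bit). Assumed precondition: offset + count <= 64. */
UT extract64(UT x, uint32_t offset, uint32_t count)
{
    assert(offset <= 64 && count <= 64 - offset);
    if (count == 0)
        return 0;                     /* also covers offset == 64 */
    x >>= offset;                     /* offset <= 63 here, so this is defined */
    if (count < 64)
        x &= ~(~(UT)0 << count);      /* keep only the low `count` bits */
    return x;                         /* count == 64: nothing to mask */
}

/* 64-bit logical right shift composed from 32-bit halves, in the shape of
   a SHRD / SHR / TEST 32 sequence. n must be in 0..63. */
uint64_t shr64_by_halves(uint32_t lo, uint32_t hi, uint32_t n)
{
    uint32_t s = n & 31;              /* 32-bit x86 shifts use the count mod 32 */
    assert(n < 64);
    if (s != 0) {
        lo = (lo >> s) | (hi << (32 - s));   /* like SHRD lo, hi, cl */
        hi >>= s;                            /* like SHR hi, cl */
    }
    if (n & 32) {                     /* like TEST cl, 32 */
        lo = hi;                      /* shift by a further 32 bits */
        hi = 0;
    }
    return ((uint64_t)hi << 32) | lo;
}
```

Handling count == 0 and count == 64 as explicit branches costs two compares; a backend that knows the counts are constants can drop them, which is consistent with the log's remark that constant operands lead to better code.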

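The model Jay describes in the 11 Mar message quoted above (generate the body inline when the backend is compiling Long__Rotate itself, otherwise generate a call to it) could be sketched roughly as follows. This is an illustration in C, not m3back code: m3back is written in Modula-3, and the names `Emitter` and `emit_rotate64` and the strings standing in for generated code are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for backend state: the name of the procedure
   whose body is currently being generated. */
typedef struct {
    const char *current_proc;
} Emitter;

/* Hypothetical lowering of a 64-bit rotate operation.
   Returns 1 if the body was expanded inline, 0 if a call was emitted.
   A textual description of the "generated code" is written to out. */
int emit_rotate64(const Emitter *e, char *out, size_t out_size)
{
    static const char holder[] = "Long__Rotate";

    assert(e != NULL && e->current_proc != NULL && out != NULL);
    if (strcmp(e->current_proc, holder) == 0) {
        /* We are compiling the holder function itself, so the expansion
           has to go here; emitting a call would make it call itself. */
        snprintf(out, out_size, "inline rotate64 body");
        return 1;
    }
    /* Anywhere else, rely on the holder existing with a known interface. */
    snprintf(out, out_size, "call %s", holder);
    return 0;
}
```

The decision depends only on the name of the procedure being compiled, not on a profitability heuristic, which matches how the proposal is phrased. It also makes the cost Jay concedes visible: each such operation needs both an inline path and a call path in the backend.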