[M3commit] CVS Update: cm3

Tony Hosking hosking at cs.purdue.edu
Thu Jan 6 15:22:09 CET 2011


I am OK with what you have currently:

At each TRY:

1. Check whether the alloca block for this TRY has already been allocated, by testing whether its local variable is NIL.
2. If it is NIL (nothing allocated yet), call alloca and save the resulting pointer in that local variable.
3. Execute the try block.

As you say, alloca should turn into an inline operation using the compiler's builtin implementation of alloca.
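
The three steps above can be sketched in C roughly as follows. This is only an illustration, not the code cm3 generates: `jumpbuf_size` stands in for `Csetjmp__Jumpbuf_size`, and `current_handler`, `raise_exception` and `run_try_in_loop` are hypothetical names (the real runtime keeps a per-thread chain of exception frames). It assumes a platform that provides `<alloca.h>`.

```c
#include <alloca.h>   /* non-standard; assumed available (glibc, BSD, macOS) */
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Stand-in for the runtime's Csetjmp__Jumpbuf_size: only the C side
   knows sizeof(jmp_buf); the frontend just loads this value. */
static const size_t jumpbuf_size = sizeof(jmp_buf);

/* Hypothetical "innermost handler" pointer, for illustration only. */
static jmp_buf *current_handler;

static void raise_exception(void)
{
    longjmp(*current_handler, 1);
}

/* Executes one TRY block `iterations` times and raises inside iteration
   `raise_on` (pass -1 to never raise). Returns how many times alloca ran
   and stores the number of caught exceptions in *caught_out. */
static int run_try_in_loop(int iterations, int raise_on, int *caught_out)
{
    jmp_buf *volatile frame = NULL;   /* one pointer local per TRY */
    volatile int allocations = 0;
    volatile int caught = 0;
    volatile int i;

    for (i = 0; i < iterations; i++) {
        /* Steps 1 and 2: allocate only if the local is still NULL (NIL). */
        if (frame == NULL) {
            frame = alloca(jumpbuf_size);
            allocations++;
        }
        assert(frame != NULL);

        /* Step 3: execute the try block under this frame. */
        current_handler = frame;
        if (setjmp(*frame) == 0) {
            if (i == raise_on)
                raise_exception();
        } else {
            caught++;                 /* the EXCEPT/FINALLY arm */
        }
    }

    *caught_out = caught;
    return allocations;
}
```

Without the NULL check, a TRY inside a loop would call alloca on every iteration and the stack would keep growing until the function returns; with it, calling `run_try_in_loop(5, 2, &n)` allocates once however many times the TRY is entered.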

On Jan 6, 2011, at 1:02 AM, Jay K wrote:

>  > Code size will suffer.
> 
> 
> Indeed. Unoptimized code size does suffer a lot, in functions that use try.
> Calling alloca, unoptimized, isn't small, and this adds n calls for n trys.
>   I thought it'd only be one call. I didn't realize our implementation is as poor as it is, since a better but still
>   portable implementation doesn't seem too difficult.
> 
> 
> Can we maybe do the optimizations I indicate -- no more than one setjmp/alloca/pushframe per function?
> Using a local integer to record the position within the function?
> 
> 
> Or just give me a week or a few to get stack walking working and then live with the regression on other targets?
> (NT386 isn't likely to get stack walking, though it *is* certainly possible; NT does have a decent runtime here..)
> 
> 
> It *is* nice to not have the frontend know about jmpbuf size.
> 
> 
> I looked into the "builtin_setjmp" stuff, but it can't be used so easily.
> It doesn't work for intra-function jumps, only inter-function.
> 
> 
>  - Jay
> 
> 
> From: jay.krell at cornell.edu
> To: hosking at cs.purdue.edu
> CC: m3commit at elegosoft.com
> Subject: RE: [M3commit] CVS Update: cm3
> Date: Thu, 6 Jan 2011 04:52:33 +0000
> 
> Ah..I'm doing more comparisons of release vs. head...but..I guess your point is, you'd rather have n locals, which the backend automatically merges, than n calls to alloca?
> It's not a huge difference -- there are still going to be n calls to setjmp and n calls to pthread_getspecific.
> The alloca calls will be dwarfed.
> Code size will suffer.
> 
> 
> And, even so, there are plenty of optimizations to be had, even if setjmp/pthread_getspecific is used.
> 
> 
>  - It could make a maximum of one call to setjmp/pthread_getspecific per function
>  - The calls to alloca could be merged. The frontend could keep track of how many calls it makes per function,
>    issue a multiplication, and offset each jmpbuf. It is a tradeoff.
> 
> 
> So, yes, given my current understanding, it is progress.
> The target-dependence is not worth it, imho.
> I'll still do some comparisons to release.
> 
> 
> I'll still be looking into using the gcc unwinder relatively soon.
> 
> 
>  - Jay
> 
> 
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 21:14:17 -0500
> CC: m3commit at elegosoft.com
> To: jay.krell at cornell.edu
> 
> On Jan 5, 2011, at 9:08 PM, Jay K wrote:
> 
> Tony, um..well, um.. first, isn't that how it already worked maybe? Declaring a new local EF1 for each TRY? It looks like it.
> I'll do more testing.
> 
> Yes, it did.  I assume you simply have a local variable for each TRY block that is a pointer now instead of a jmp_buf.  Should be OK.
> 
> 
> So the additional inefficiency is multiplied the same as the rest of the preexisting inefficiency.
> And the preexisting inefficiency is way more than the increase.
> 
> And second, either way, it could be better.
> 
> Basically, the model should be, that if a function has any try or lock, it calls setjmp once.
> And then, it should have one volatile integer, that in a sense represents the line number.
>   But not really. It's like, every time you cross a TRY, the integer is incremented, every time you
>   cross a finally or unlock, the integer is decremented. Or rather, the value can be stored.
>   And then there is a maximum of one handler per function; it switches on the integer
>    to decide where it got into the function and what it should do.
> 
> This is how other compilers work and it is a fairly simple sensible approach.
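
A minimal C sketch of the model Jay describes here: one setjmp per function plus a volatile integer that records which TRY region is active, with a single handler that switches on it. The names `current_handler`, `raise_exception` and `two_tries` are hypothetical, and the sketch covers only two sequential TRY blocks; it does not address nested TRYs, which is the complication raised elsewhere in this thread.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical "innermost handler" pointer, for illustration only. */
static jmp_buf *current_handler;

static void raise_exception(void)
{
    longjmp(*current_handler, 1);
}

/* Two sequential TRY regions sharing one jmp_buf and one setjmp call.
   raise_in selects the region that raises (0 = none).
   Returns the number of the handler that ran, or 0 if none did. */
static int two_tries(int raise_in)
{
    jmp_buf frame;
    jmp_buf *saved = current_handler;  /* restored before returning */
    volatile int region = 0;           /* 0 = not inside any TRY */
    volatile int handled = 0;

    current_handler = &frame;
    if (setjmp(frame) != 0) {
        /* The single per-function handler: dispatch on the region. */
        switch (region) {
        case 1:  handled = 1; region = 0; goto after_first;
        case 2:  handled = 2; region = 0; goto after_second;
        default: handled = -1; goto after_second;  /* not expected here */
        }
    }

    region = 1;                        /* entering the first TRY */
    if (raise_in == 1)
        raise_exception();
    region = 0;                        /* leaving the first TRY */
after_first:

    region = 2;                        /* entering the second TRY */
    if (raise_in == 2)
        raise_exception();
    region = 0;                        /* leaving the second TRY */
after_second:

    current_handler = saved;
    return handled;
}
```

The region variable is volatile because it is modified between the setjmp call and a later longjmp; a non-volatile local would have an indeterminate value in the handler.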
> 
>  - Jay
> 
> 
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 20:49:24 -0500
> CC: m3commit at elegosoft.com
> To: jay.krell at cornell.edu
> 
> Note that you need a different jmpbuf for each nested TRY!
> 
> Antony Hosking | Associate Professor | Computer Science | Purdue University
> 305 N. University Street | West Lafayette | IN 47907 | USA
> Office +1 765 494 6001 | Mobile +1 765 427 5484
> 
> On Jan 5, 2011, at 8:33 PM, Jay K wrote:
> 
> oops, that's not how I thought it worked. I'll do more testing and fix it -- check for NIL.
> 
>  - Jay
> 
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 20:23:09 -0500
> CC: m3commit at elegosoft.com
> To: jay.krell at cornell.edu
> 
> Ah, yes, I guess you need a different jmpbuf for each TRY.  But now you are allocating on every TRY where previously the storage was statically allocated.  Do you really think this is progress?
> 
> On Jan 5, 2011, at 5:40 PM, Jay K wrote:
> 
> I'm back at a full keyboard if more explanation is needed. The diff is actually fairly small to read.
> I understand it is definitely less efficient, a few more instructions for every try/lock.
> No extra function call, at least with gcc backend.
> I haven't tested NT386 yet. Odds are so-so that it works -- the change is written so that it should work,
> but I have to test it to be sure, and will do so roughly tonight. And there probably is a function call there.
> 
>  - Jay
> 
> From: jay.krell at cornell.edu
> To: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 20:44:08 +0000
> CC: m3commit at elegosoft.com
> Subject: Re: [M3commit] CVS Update: cm3
> 
> I only have my phone right now. I think it is fairly clear: the jumpbuf in EF1 is now allocated with alloca, and a pointer to it is stored. It is definitely a bit less efficient, but the significant advantage is that the frontend no longer needs to know the size or alignment of a jumpbuf.
> 
> 
> As well, there is no longer the problem regarding jumpbuf aligned to more than 64 bits. I at least checked on Linux/PowerPC and alloca seems to align to 16 bytes. I don't have an HPUX machine currently to see if the problem is addressed there.
> 
> 
> The inefficiency of course can be dramatically mitigated via a stack walker. I wanted to do this first though, while more targets are using setjmp.
> 
> - Jay/phone
> 
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 13:35:59 -0500
> CC: jkrell at elego.de; m3commit at elegosoft.com
> To: jay.krell at cornell.edu
> 
> Can you provide a more descriptive checkin comment?  I don't know what has been done here without diving into the diff.
> 
> On Jan 5, 2011, at 9:37 AM, Jay K wrote:
> 
> diff attached
> 
> > Date: Wed, 5 Jan 2011 15:34:55 +0000
> > To: m3commit at elegosoft.com
> > From: jkrell at elego.de
> > Subject: [M3commit] CVS Update: cm3
> > 
> > CVSROOT:	/usr/cvs
> > Changes by:	jkrell at birch.	11/01/05 15:34:55
> > 
> > Modified files:
> > cm3/m3-libs/m3core/src/C/Common/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/C/I386_CYGWIN/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/C/I386_MINGW/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/C/I386_NT/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/C/NT386/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/runtime/ex_frame/: RTExFrame.m3 
> > cm3/m3-libs/m3core/src/unix/Common/: Uconstants.c 
> > cm3/m3-sys/m3cc/gcc/gcc/m3cg/: parse.c 
> > cm3/m3-sys/m3front/src/misc/: Marker.m3 
> > cm3/m3-sys/m3front/src/stmts/: TryFinStmt.m3 TryStmt.m3 
> > cm3/m3-sys/m3middle/src/: M3RT.i3 M3RT.m3 Target.i3 Target.m3 
> > 
> > Log message:
> > use: extern INTEGER Csetjmp__Jumpbuf_size /* = sizeof(jmp_buf) */;
> > alloca(Csetjmp__Jumpbuf_size)
> > 
> > to allocate jmp_buf
> > 
> > - eliminates a large swath of target-dependent code
> > - allows for covering up the inability to declare
> > types with alignment > 64 bits
> > 
> > It is, granted, a little bit slower, in an already pretty slow path.
> > Note that alloca isn't actually a function call, at least with gcc backend.
> > 
> <jmpbuf_alloca.txt>
> 
