[M3commit] CVS Update: cm3

Thu Jan 6 19:52:42 CET 2011

You can't have one jmpbuf per procedure.  You need one per TRY scope, since they can be nested.
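
To make the nesting point concrete, here is a minimal C sketch, not the m3core runtime. The `Frame` type, the `top` stack and the `raise_exc` helper are hypothetical names invented for illustration. Each TRY scope gets its own jmp_buf, and a raise unlinks the innermost frame before jumping to it, so a RAISE inside a handler lands in the next TRY out.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical exception frame: one per TRY scope, linked innermost-first. */
typedef struct Frame {
    struct Frame *next;
    jmp_buf jb;
} Frame;

static Frame *top = NULL;  /* a real runtime would keep this per thread */
static int order = 0;      /* digits record the order in which handlers ran */

static void push_frame(Frame *f) { f->next = top; top = f; }

/* RAISE: unlink the innermost frame first, then jump to its handler.
   Unlinking first means a RAISE inside that handler reaches the next TRY out. */
static void raise_exc(int code) {
    Frame *f = top;
    assert(f != NULL);      /* this sketch has no uncaught-exception path */
    top = f->next;
    longjmp(f->jb, code);
}

/* Same shape as three nested TRY ... EXCEPT ELSE blocks.
   Every path here raises, so no frame needs popping on normal exit. */
int nested_try(void) {
    Frame f1, f2, f3;       /* one jmp_buf per TRY scope */
    order = 0;
    push_frame(&f1);
    if (setjmp(f1.jb) == 0) {
        push_frame(&f2);
        if (setjmp(f2.jb) == 0) {
            push_frame(&f3);
            if (setjmp(f3.jb) == 0) {
                raise_exc(1);               /* RAISE E1 */
            } else {
                order = order * 10 + 3;     /* innermost handler */
                raise_exc(2);               /* RAISE E2 */
            }
        } else {
            order = order * 10 + 2;         /* middle handler */
            raise_exc(3);                   /* RAISE E3 */
        }
    } else {
        order = order * 10 + 1;             /* outermost handler */
    }
    return order;
}
```

Calling `nested_try()` runs the handlers innermost to outermost and leaves `top` empty. With a single shared jmp_buf and nothing else, each later setjmp would overwrite the earlier context and every raise would land at the most recent one.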

On Jan 6, 2011, at 11:35 AM, Jay K wrote:

> Hm. How do I single instance the "EF1"? The current code allocates a local "EF1" for each try.
> I guess, really, it is EF1, EF2, etc.
> So there should be a separate local for the jmpbuf pointer, and store it in each EF* block?
> How do I make just one jmpbuf pointer? I couldn't easily figure out how to in the front end, I need to read it more.
> 
> something like:
> 
> PROCEDURE F1() = BEGIN TRY1 do stuff1 TRY2 do stuff 2 TRY3 do stuff 3 END END END END F1;
> =>
> 
> void F1()
> {
>  jmp_buf* jb = 0;
>  EF1 a,b,c;
>  setjmp(*(a.jmpbuf = jb ? jb : (jb = alloca(sizeof(jmp_buf))))); // TRY1
>  do stuff 1...
>  setjmp(*(b.jmpbuf = jb ? jb : (jb = alloca(sizeof(jmp_buf))))); // TRY2
>  do stuff 2...
>  setjmp(*(c.jmpbuf = jb ? jb : (jb = alloca(sizeof(jmp_buf))))); // TRY3
>  do stuff 3...
> }
> 
> (Never mind the actual syntactic and semantic correctness of this code. It relies on the existence of the ternary operator, on it evaluating only one side or the other, and on assignment being an expression. I quite like those features.)
> 
> 
> Still, something I can't pin down strikes me as too simple here.
> 
> 
> If there is just one setjmp, and no integer(s) to keep track of additional progress, you only ever know the last place you were in a function.
> That doesn't seem adequate.
> 
> 
> What if a function raises an exception, catches it within itself, and then raises something else, and then wants to catch that?
> It won't know where to resume, right? It'll just keep longjmping to the same place.
> 
> 
> In the Visual C++ runtime, there is "local unwind" and "global unwind".
> "local unwind" is, roughly, "within the same function"; "global unwind" is across functions.
> I think somehow that is related here.
> 
> 
> e.g. how would you ensure forward progress in this:
> 
> 
> EXCEPTION E1;
> EXCEPTION E2;
> EXCEPTION E3;
> 
> 
> PROCEDURE F4() RAISES ANY =
> CONST Function = "F4 ";
> BEGIN
>   Put(Function & Int(Line())); NL();
>   TRY
>     Put(Function & Int(Line())); NL();
>     TRY
>       Put(Function & Int(Line())); NL();
>       TRY
>         Put(Function & Int(Line())); NL();
>         RAISE E1;
>       EXCEPT ELSE
>         RAISE E2;
>       END;
>     EXCEPT ELSE
>       RAISE E3;
>     END;
>   EXCEPT ELSE
>   END;
> END F4;
> 
> 
> Oddly in my test p251, the stack depth is not increased by TRY.
> 
> 
>  - Jay
> 
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Thu, 6 Jan 2011 09:22:09 -0500
> CC: m3commit at elegosoft.com
> To: jay.krell at cornell.edu
> 
> I am OK with what you have currently:
> 
> At each TRY:
> 
> 1. Check if a corresponding alloca block has been allocated by checking if the corresponding local variable is NIL.
> 2. If not, then alloca and save its pointer in the local variable
> 3. Execute the try block.
> 
> As you say, alloca should turn into an inline operation using the compiler's builtin implementation of alloca.
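
A minimal C sketch of those three steps, assuming glibc-style `<alloca.h>` (alloca is not standard C). The names `jumpbuf_size`, `alloca_calls` and `try_in_loop` are made up for illustration; `jumpbuf_size` stands in for the runtime-provided Csetjmp__Jumpbuf_size. Putting the TRY in a loop shows what the NIL check buys: alloca storage is not released until the function returns, so without the check every iteration would grow the stack.

```c
#include <alloca.h>   /* non-standard; assumed available (glibc, most Unix systems) */
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static size_t jumpbuf_size = sizeof(jmp_buf);  /* stand-in for Csetjmp__Jumpbuf_size */
static int alloca_calls = 0;                   /* counts allocations, for checking */

/* Runs a TRY inside a loop; odd iterations raise. Returns how many were caught. */
int try_in_loop(int iterations) {
    jmp_buf *volatile jb = NULL;    /* the per-TRY local, initially NIL */
    volatile int caught = 0;
    assert(iterations >= 0);
    for (volatile int i = 0; i < iterations; i++) {
        if (jb == NULL) {                     /* step 1: allocated yet? */
            jb = alloca(jumpbuf_size);        /* step 2: allocate once, keep the pointer */
            alloca_calls++;
        }
        if (setjmp(*jb) == 0) {               /* step 3: execute the try block */
            if (i % 2 == 1)
                longjmp(*jb, 1);              /* stands in for RAISE */
        } else {
            caught++;                         /* stands in for the EXCEPT arm */
        }
    }
    return caught;
}
```

With four iterations, two raises are caught and alloca runs once. The locals are volatile so their values are well defined after longjmp returns control to the setjmp.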
> 
> On Jan 6, 2011, at 1:02 AM, Jay K wrote:
> 
>  > Code size will suffer.
> 
> 
> Indeed. Unoptimized code size does suffer a lot, in functions that use try.
> Calling alloca, unoptimized, isn't small, and this adds n calls for n trys.
>   I thought it'd only be one call. I didn't realize our implementation is as poor as it is, since a better but still
>   portable implementation doesn't seem too too difficult.
> 
> 
> Can we maybe do the optimizations I indicate -- no more than one setjmp/alloca/pushframe per function?
> Using a local integer to record the position within the function?
> 
> 
> Or just give me a week or few to get stack walking working and then live with the regression on other targets?
> (NT386 isn't likely to get stack walking, though it *is* certainly possible; NT does have a decent runtime here..)
> 
> 
> It *is* nice to not have the frontend know about jmpbuf size.
> 
> 
> I looked into the "builtin_setjmp" stuff, but it can't be used so easily.
> It doesn't work for intra-function jumps, only inter-function.
> 
> 
>  - Jay
> 
> 
> From: jay.krell at cornell.edu
> To: hosking at cs.purdue.edu
> CC: m3commit at elegosoft.com
> Subject: RE: [M3commit] CVS Update: cm3
> Date: Thu, 6 Jan 2011 04:52:33 +0000
> 
> Ah..I'm doing more comparisons of release vs. head...but..I guess your point is, you'd rather have n locals, which the backend automatically merges, than n calls to alloca?
> It's not a huge difference -- there are still going to be n calls to setjmp and n calls to pthread_getspecific.
> The alloca calls will be dwarfed.
> Code size will suffer.
> 
> 
> And, even so, there are plenty of optimizations to be had, even if setjmp/pthread_getspecific is used.
> 
> 
>  - It could make a maximum of one call to setjmp/pthread_getspecific per function
>  - The calls to alloca could be merged. The frontend could keep track of how many calls it makes per function,
>    issue a multiplication, and offset each jmpbuf. It is a tradeoff.
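
A rough C sketch of the merged-alloca idea, again assuming `<alloca.h>` is available. `TRY_COUNT`, `slot` and `jumpbuf_size` are hypothetical names; the count is what the frontend would have to track per function, and `jumpbuf_size` stands in for the size the runtime exports. One allocation covers every TRY, and each TRY finds its buffer by multiplying its index by the size.

```c
#include <alloca.h>   /* non-standard; assumed available (glibc, most Unix systems) */
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

enum { TRY_COUNT = 2 };                         /* TRYs in this function */
static size_t jumpbuf_size = sizeof(jmp_buf);   /* stand-in for the runtime-provided size */

/* Address of the k-th jmp_buf inside the single block: base + k * size. */
static jmp_buf *slot(unsigned char *base, size_t k) {
    assert(k < TRY_COUNT);
    return (jmp_buf *)(void *)(base + k * jumpbuf_size);
}

int merged_alloca_demo(void) {
    /* One alloca for the whole function instead of one per TRY. */
    unsigned char *base = alloca(TRY_COUNT * jumpbuf_size);
    volatile int handled = 0;

    if (setjmp(*slot(base, 0)) == 0) {          /* outer TRY uses slot 0 */
        if (setjmp(*slot(base, 1)) == 0) {      /* inner TRY uses slot 1 */
            longjmp(*slot(base, 1), 1);         /* raise inside the inner TRY */
        } else {
            handled += 1;                       /* inner handler ... */
            longjmp(*slot(base, 0), 1);         /* ... raises to the outer TRY */
        }
    } else {
        handled += 10;                          /* outer handler */
    }
    return handled;
}
```

The tradeoff is that the block is sized for every TRY in the function even when only some of them execute, in exchange for a single allocation.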
> 
> 
> So, yes, given my current understanding, it is progress.
> The target-dependence is not worth it, imho.
> I'll still do some comparisons to release.
> 
> 
> I'll still be looking into using the gcc unwinder relatively soon.
> 
> 
>  - Jay
> 
> 
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 21:14:17 -0500
> CC: m3commit at elegosoft.com
> To: jay.krell at cornell.edu
> 
> On Jan 5, 2011, at 9:08 PM, Jay K wrote:
> 
> Tony, um..well, um.. first, isn't that how it already worked maybe? Declaring a new local EF1 for each TRY? It looks like it.
> I'll do more testing.
> 
> Yes, it did.  I assume you simply have a local variable for each TRY block that is a pointer now instead of a jmp_buf.  Should be OK.
> 
> 
> So the additional inefficiency is multiplied the same as the rest of the preexisting inefficiency.
> And the preexisting inefficiency is way more than the increase.
> 
> And second, either way, it could be better.
> 
> Basically, the model should be, that if a function has any try or lock, it calls setjmp once.
> And then, it should have one volatile integer, that in a sense represents the line number.
>   But not really. It's like, every time you cross a TRY, the integer is incremented, every time you
>   cross a finally or unlock, the integer is decremented. Or rather, the value can be stored.
>   And then there is a maximum of one handler per function, it switches on the integer
>    to decide where it got into the function and what it should do.
> 
> This is how other compilers work and it is a fairly simple sensible approach.
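
One way to picture that model in C. This is a sketch of the idea as described, not what cm3 or any particular compiler generates, and `single_setjmp`, `scope` and `order` are invented names. The function calls setjmp once, a volatile integer records which TRY is active, and the single handler switches on that integer. Lowering the integer before a handler raises again is what gives forward progress.

```c
#include <assert.h>
#include <setjmp.h>

/* Three nested TRYs whose handlers each raise again, using one jmp_buf. */
int single_setjmp(void) {
    jmp_buf jb;                 /* one buffer for the whole function */
    volatile int scope = 0;     /* innermost active TRY; 0 means none */
    volatile int order = 0;     /* digits record the order handlers ran */

    if (setjmp(jb) != 0) {
        /* The single landing point: dispatch on the scope active at the raise. */
        switch (scope) {
        case 3:
            scope = 2;                  /* leave TRY3 before its handler runs */
            order = order * 10 + 3;
            longjmp(jb, 2);             /* handler of TRY3 raises E2 */
        case 2:
            scope = 1;
            order = order * 10 + 2;
            longjmp(jb, 3);             /* handler of TRY2 raises E3 */
        case 1:
            scope = 0;
            order = order * 10 + 1;     /* outermost handler swallows E3 */
            break;
        default:
            assert(!"raise with no active TRY");
        }
        return order;
    }

    scope = 1;          /* enter TRY1 */
    scope = 2;          /* enter TRY2 */
    scope = 3;          /* enter TRY3 */
    longjmp(jb, 1);     /* RAISE E1 */
    return -1;          /* not reached */
}
```

Every raise lands at the same setjmp, but the handlers still run innermost to outermost because the integer, not the jump target, says where the function was. It has to be volatile so that its value survives the longjmp.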
> 
>  - Jay
> 
> 
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 20:49:24 -0500
> CC: m3commit at elegosoft.com
> To: jay.krell at cornell.edu
> 
> Note that you need a different jmpbuf for each nested TRY!
> 
> Antony Hosking | Associate Professor | Computer Science | Purdue University
> 305 N. University Street | West Lafayette | IN 47907 | USA
> Office +1 765 494 6001 | Mobile +1 765 427 5484
> 
> 
> 
> 
> On Jan 5, 2011, at 8:33 PM, Jay K wrote:
> 
> oops, that's not how I thought it worked. I'll do more testing and fix it -- check for NIL.
> 
>  - Jay
> 
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 20:23:09 -0500
> CC: m3commit at elegosoft.com
> To: jay.krell at cornell.edu
> 
> Ah, yes, I guess you need a different jmpbuf for each TRY.  But now you are allocating on every TRY where previously the storage was statically allocated.  Do you really think this is progress?
> 
> On Jan 5, 2011, at 5:40 PM, Jay K wrote:
> 
> I'm back with a full keyboard if more explanation is needed. The diff is actually fairly small to read.
> I understand it is definitely less efficient, a few more instructions for every try/lock.
> No extra function call, at least with gcc backend.
> I haven't tested NT386 yet. Odds are so/so that it works -- the change is written so that it should work
> but I have to test it to be sure, will do so roughly tonight. And there probably is a function call there.
> 
>  - Jay
> 
> From: jay.krell at cornell.edu
> To: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 20:44:08 +0000
> CC: m3commit at elegosoft.com
> Subject: Re: [M3commit] CVS Update: cm3
> 
> I only have phone right now. I think it is fairly clear: the jumpbuf in EF1 is now allocated with alloca, and a pointer stored. It is definitely a bit less efficient, but the significant advantage is frontend no longer needs to know the size or alignment of a jumpbuf.
> 
> 
> As well, there is no longer the problem regarding jumpbuf aligned to more than 64 bits. I at least checked on Linux/PowerPC and alloca seems to align to 16 bytes. I don't have an HPUX machine currently to see if the problem is addressed there.
> 
> 
> The inefficiency of course can be dramatically mitigated via a stack walker. I wanted to do this first though, while more targets are using setjmp.
> 
> - Jay/phone
> 
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Wed, 5 Jan 2011 13:35:59 -0500
> CC: jkrell at elego.de; m3commit at elegosoft.com
> To: jay.krell at cornell.edu
> 
> Can you provide a more descriptive checkin comment?  I don't know what has been done here without diving into the diff.
> 
> 
> On Jan 5, 2011, at 9:37 AM, Jay K wrote:
> 
> diff attached
> 
> > Date: Wed, 5 Jan 2011 15:34:55 +0000
> > To: m3commit at elegosoft.com
> > From: jkrell at elego.de
> > Subject: [M3commit] CVS Update: cm3
> > 
> > CVSROOT:	/usr/cvs
> > Changes by:	jkrell at birch.	11/01/05 15:34:55
> > 
> > Modified files:
> > cm3/m3-libs/m3core/src/C/Common/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/C/I386_CYGWIN/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/C/I386_MINGW/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/C/I386_NT/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/C/NT386/: Csetjmp.i3 
> > cm3/m3-libs/m3core/src/runtime/ex_frame/: RTExFrame.m3 
> > cm3/m3-libs/m3core/src/unix/Common/: Uconstants.c 
> > cm3/m3-sys/m3cc/gcc/gcc/m3cg/: parse.c 
> > cm3/m3-sys/m3front/src/misc/: Marker.m3 
> > cm3/m3-sys/m3front/src/stmts/: TryFinStmt.m3 TryStmt.m3 
> > cm3/m3-sys/m3middle/src/: M3RT.i3 M3RT.m3 Target.i3 Target.m3 
> > 
> > Log message:
> > use: extern INTEGER Csetjmp__Jumpbuf_size /* = sizeof(jmp_buf) */;
> > alloca(Csetjmp__Jumpbuf_size)
> > 
> > to allocate jmp_buf
> > 
> > - eliminates a large swath of target-dependent code
> > - allows for covering up the inability to declare
> > types with alignment > 64 bits
> > 
> > It is, granted, a little bit slower, in an already pretty slow path.
> > Note that alloca isn't actually a function call, at least with gcc backend.
> > 
> <jmpbuf_alloca.txt>