[M3commit] CVS Update: cm3

Tony Hosking hosking at cs.purdue.edu
Thu Jan 6 20:11:04 CET 2011


At this point, we are trying to move away from the setjmp implementation to one that relies on unwind support, so I don't think the effort here is worthwhile.

Antony Hosking | Associate Professor | Computer Science | Purdue University
305 N. University Street | West Lafayette | IN 47907 | USA
Office +1 765 494 6001 | Mobile +1 765 427 5484




On Jan 6, 2011, at 2:00 PM, Jay K wrote:

> 
> I believe you can, but it'd take significant work in the frontend.
> The jmpbuf should identify merely which procedure/frame to return to.
> There would also be a volatile local integer, that gets altered at certain points through the function.
> When setjmp returns exceptionally, you'd switch on that integer to determine where to "really" go.
> This is analogous to how other systems work -- NT/x86 has a highly optimized frame based exception
> handling. Instead of a generic thread local, FS:0 is reserved to be the head of the linked list of frames.
> Instead of setjmp, the compiler pessimizes appropriately.
> 
> 
> So the result is that a function with one or more tries, or one or more locals with destructors,
> puts one node on the FS:0 list, and then mucks with the volatile local integer to indicate
> where in the function it is.
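The scheme described above (one setjmp per function, plus a volatile integer that records the position) can be sketched in C roughly as follows. This is a minimal illustration, not what the cm3 front end emits: `current_frame`, `raise_exception` and `nested_try_demo` are hypothetical names, and a single global pointer stands in for the per-thread handler chain.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdlib.h>

/* Hypothetical stand-in for the head of the per-thread handler chain. */
static jmp_buf *current_frame;

/* Hypothetical RAISE: jump to whichever frame is currently registered. */
static void raise_exception(void)
{
    longjmp(*current_frame, 1);
}

/* Models three nested TRYs where each inner handler raises again.
   Returns the number of handlers that ran. */
int nested_try_demo(void)
{
    jmp_buf frame;                       /* the only jmp_buf in this function */
    jmp_buf *const saved = current_frame;
    volatile int state = 0;              /* 0 = outside any TRY, 1..3 = innermost active TRY */
    volatile int handlers_run = 0;

    current_frame = &frame;
    if (setjmp(frame) != 0) {
        /* Exceptional return: switch on the integer to find the real target. */
        assert(state >= 0 && state <= 3);
        switch (state) {
        case 3:                          /* innermost handler raises again */
            state = 2; handlers_run++; raise_exception(); break;
        case 2:                          /* middle handler raises again */
            state = 1; handlers_run++; raise_exception(); break;
        case 1:                          /* outermost handler swallows the exception */
            state = 0; handlers_run++; break;
        default:                         /* not inside a TRY here: pass to the caller's frame */
            current_frame = saved;
            if (saved == NULL) abort();
            longjmp(*saved, 1);
        }
        current_frame = saved;
        return handlers_run;
    }

    state = 1;                           /* enter outer TRY */
    state = 2;                           /* enter middle TRY */
    state = 3;                           /* enter inner TRY */
    raise_exception();                   /* first RAISE */

    current_frame = saved;               /* normal exit path, not reached in this demo */
    return handlers_run;
}
```

Because `state` is lowered before each handler raises again, every longjmp lands on the same setjmp but dispatches to a different handler, so the function makes forward progress.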
> 
> 
> If NT/x86 were inefficient in a way more analogous to current Modula-3, it'd link/unlink in FS:0 more often.
> 
> 
> It is more work though, granted, I can understand that.
> And given that we have a much better option for many platforms, the payoff would be reduced.
> 
> 
> Anyway, I'm trying what you say, like for TRY within a loop.
> 
> 
> I should point out that alloca has an extra inefficiency vs. the previous approach.
> It aligns more. So it is using more stack than the other way.
> And it might pessimize codegen in other ways.
> 
> 
> The gcc code appears somewhat similar. I think the tables merely describe, again, which
> function/frame to return to, and that within the frame there is a local integer to determine
> more precisely what to do. I'm not sure. I saw mention of a switch.
> 
> 
>  - Jay
> 
> ________________________________
>> Subject: Re: [M3commit] CVS Update: cm3
>> From: hosking at cs.purdue.edu
>> Date: Thu, 6 Jan 2011 13:52:42 -0500
>> CC: m3commit at elegosoft.com
>> To: jay.krell at cornell.edu
>> 
>> You can't have one jmpbuf per procedure. You need one per TRY scope,
>> since they can be nested.
>> 
>> 
>> 
>> On Jan 6, 2011, at 11:35 AM, Jay K wrote:
>> 
>> Hm. How do I single instance the "EF1"? The current code allocates a
>> local "EF1" for each try.
>> I guess, really, it is EF1, EF2, etc.
>> So there should be a separate local for the jmpbuf pointer, and store
>> it in each EF* block?
>> How do I make just one jmpbuf pointer? I couldn't easily figure out how
>> to in the front end, I need to read it more.
>> 
>> something like:
>> 
>> PROCEDURE F1() = BEGIN TRY1 do stuff1 TRY2 do stuff 2 TRY3 do stuff 3
>> END END END END F1;
>> =>
>> 
>> void F1()
>> {
>>   jmp_buf* jb = 0; /* the single shared jmpbuf pointer */
>>   EF1 a,b,c;
>>   setjmp(*(a.jmpbuf = jb ? jb : (jb = alloca(sizeof(jmp_buf))))); // TRY1
>>   /* do stuff 1... */
>>   setjmp(*(b.jmpbuf = jb ? jb : (jb = alloca(sizeof(jmp_buf))))); // TRY2
>>   /* do stuff 2... */
>>   setjmp(*(c.jmpbuf = jb ? jb : (jb = alloca(sizeof(jmp_buf))))); // TRY3
>>   /* do stuff 3... */
>> }
>> 
>> (The actual syntactic and semantic correctness of this code -- the
>> existence of the ternary operator, and that it only evaluates one side
>> or the other, and that assignment is an expression. I quite like those
>> features...)
>> 
>> 
>> Still, something I can't pin down strikes me as too simple here.
>> 
>> 
>> If there is just one setjmp, and no integer(s) to keep track of
>> additional progress, you only ever know the last place you were in a
>> function.
>> That doesn't seem adequate.
>> 
>> 
>> What if a function raises an exception, catches it within itself, and
>> then raises something else, and then wants to catch that?
>> It won't know where to resume, right? It'd just keep longjmping to the
>> same place.
>> 
>> 
>> In the Visual C++ runtime, there is "local unwind" and "global unwind".
>> "local unwind" is like, "within the same function", "global unwind" is
>> across functions.
>> I think somehow that is related here.
>> 
>> 
>> e.g. how would you ensure forward progress in this:
>> 
>> 
>> EXCEPTION E1;
>> EXCEPTION E2;
>> EXCEPTION E3;
>> 
>> 
>> PROCEDURE F4() RAISES ANY =
>> CONST Function = "F4 ";
>> BEGIN
>> Put(Function & Int(Line())); NL();
>> TRY
>> Put(Function & Int(Line())); NL();
>> TRY
>> Put(Function & Int(Line())); NL();
>> TRY
>> Put(Function & Int(Line())); NL();
>> RAISE E1;
>> EXCEPT ELSE
>> RAISE E2;
>> END;
>> EXCEPT ELSE
>> RAISE E3;
>> END;
>> EXCEPT ELSE
>> END;
>> END F4;
>> 
>> 
>> Oddly in my test p251, the stack depth is not increased by TRY.
>> 
>> 
>> - Jay
>> 
>> ________________________________
>> Subject: Re: [M3commit] CVS Update: cm3
>> From: hosking at cs.purdue.edu
>> Date: Thu, 6 Jan 2011 09:22:09 -0500
>> CC: m3commit at elegosoft.com
>> To: jay.krell at cornell.edu
>> 
>> I am OK with what you have currently:
>> 
>> At each TRY:
>> 
>> 1. Check if a corresponding alloca block has been allocated by checking
>> if the corresponding local variable is NIL.
>> 2. If not, then alloca and save its pointer in the local variable
>> 3. Execute the try block.
>> 
>> As you say, alloca should turn into an inline operation using the
>> compiler's builtin implementation of alloca.
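The three steps above can be sketched in C as follows. This is a hedged illustration rather than the code the front end generates: `handler`, `raise_exception` and `try_in_loop` are hypothetical names, and `<alloca.h>` is assumed to be available (it is on glibc and similar systems, but it is not standard C).

```c
#include <alloca.h>   /* assumption: platform provides alloca() here */
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for the head of the handler chain. */
static jmp_buf *handler;

/* Hypothetical RAISE: jump to the currently registered TRY. */
static void raise_exception(void)
{
    longjmp(*handler, 1);
}

/* A TRY inside a loop whose body always raises.
   Returns the number of exceptions caught and stores the number of
   alloca calls actually made in *allocations. */
int try_in_loop(int iterations, int *allocations)
{
    jmp_buf *const saved = handler;
    jmp_buf *volatile try_buf = NULL;   /* per-TRY local pointer, initially NIL */
    volatile int caught = 0;
    volatile int allocs = 0;
    volatile int i;

    assert(iterations >= 0);
    for (i = 0; i < iterations; i++) {
        /* Steps 1 and 2: allocate only if the local pointer is still NIL. */
        if (try_buf == NULL) {
            try_buf = alloca(sizeof(jmp_buf));
            allocs++;
        }
        handler = try_buf;
        /* Step 3: execute the TRY block. */
        if (setjmp(*try_buf) == 0) {
            raise_exception();
        } else {
            caught++;                   /* EXCEPT ELSE */
        }
        handler = saved;                /* leave the TRY scope */
    }
    *allocations = allocs;
    return caught;
}
```

The NIL check is what keeps a TRY inside a loop from growing the stack on every iteration: the buffer is allocated on the first pass and reused afterwards.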
>> 
>> On Jan 6, 2011, at 1:02 AM, Jay K wrote:
>> 
>>> Code size will suffer.
>> 
>> 
>> Indeed. Unoptimized code size does suffer a lot, in functions that use try.
>> Calling alloca, unoptimized, isn't small, and this adds n calls for n trys.
>> I thought it'd only be one call. I didn't realize our implementation
>> is as poor as it is, since a better but still
>> portable implementation doesn't seem too difficult.
>> 
>> 
>> Can we maybe do the optimizations I indicate -- no more than one
>> setjmp/alloca/pushframe per function?
>> Using a local integer to record the position within the function?
>> 
>> 
>> Or just give me a week or few to get stack walking working and then
>> live with the regression on other targets?
>> (NT386 isn't likely to get stack walking, though it *is* certainly
>> possible; NT does have a decent runtime here..)
>> 
>> 
>> It *is* nice to not have the frontend know about jmpbuf size.
>> 
>> 
>> I looked into the "builtin_setjmp" stuff, but it can't be used so easily.
>> It doesn't work for intra-function jumps, only inter-function.
>> 
>> 
>> - Jay
>> 
>> 
>> ________________________________
>> From: jay.krell at cornell.edu
>> To: hosking at cs.purdue.edu
>> CC: m3commit at elegosoft.com
>> Subject: RE: [M3commit] CVS Update: cm3
>> Date: Thu, 6 Jan 2011 04:52:33 +0000
>> 
>> Ah..I'm doing more comparisons of release vs. head...but..I guess your
>> point is, you'd rather have n locals, which the backend automatically
>> merges, than n calls to alloca?
>> It's not a huge difference -- there are still going to be n calls to
>> setjmp and n calls to pthread_getspecific.
>> The alloca calls will be dwarfed.
>> Code size will suffer.
>> 
>> 
>> And, even so, there are plenty of optimizations to be had, even if
>> setjmp/pthread_getspecific is used.
>> 
>> 
>> - It could make a maximum of one call to setjmp/pthread_getspecific
>> per function
>> - The calls to alloca could be merged. The frontend could keep track
>> of how many calls it makes per function,
>> issue a multiplication, and offset each jmpbuf. It is a tradeoff.
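The merged-alloca idea in the second bullet could look roughly like this in C: one allocation sized for all TRYs in the function, with each TRY scope using its own slot. This is only a sketch under assumptions: `TRY_COUNT`, `handler`, `raise_exception` and `merged_alloca_demo` are hypothetical names, and `<alloca.h>` is assumed to exist on the platform.

```c
#include <alloca.h>   /* assumption: platform provides alloca() here */
#include <assert.h>
#include <setjmp.h>

enum { TRY_COUNT = 2 };   /* number of TRYs the front end counted in this function */

/* Hypothetical stand-in for the head of the handler chain. */
static jmp_buf *handler;

/* Hypothetical RAISE: jump to the currently registered TRY. */
static void raise_exception(void)
{
    longjmp(*handler, 1);
}

/* Two nested TRYs sharing one alloca block; the inner handler raises again.
   Returns the number of handlers that ran. */
int merged_alloca_demo(void)
{
    jmp_buf *const saved = handler;
    /* One call: count * sizeof(jmp_buf), then index to reach each jmpbuf. */
    jmp_buf *const bufs = alloca(TRY_COUNT * sizeof(jmp_buf));
    volatile int handlers_run = 0;

    handler = &bufs[0];                      /* enter outer TRY */
    if (setjmp(bufs[0]) == 0) {
        handler = &bufs[1];                  /* enter inner TRY */
        if (setjmp(bufs[1]) == 0) {
            raise_exception();               /* caught by the inner handler */
        } else {
            handlers_run++;
            handler = &bufs[0];              /* pop the inner scope */
            raise_exception();               /* raise again, to the outer handler */
        }
        handler = &bufs[0];                  /* normal inner exit, not reached here */
    } else {
        assert(handler == &bufs[0]);         /* inner scope was popped before the raise */
        handlers_run++;
    }
    handler = saved;
    return handlers_run;
}
```

Each nested TRY still gets its own jmpbuf, which is the requirement for nesting; only the allocation is shared. The tradeoff is that the whole block is allocated even when control never reaches the inner TRYs.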
>> 
>> 
>> So, yes, given my current understanding, it is progress.
>> The target-dependence is not worth it, imho.
>> I'll still do some comparisons to release.
>> 
>> 
>> I'll still be looking into using the gcc unwinder relatively soon.
>> 
>> 
>> - Jay
>> 
>> 
>> ________________________________
>> Subject: Re: [M3commit] CVS Update: cm3
>> From: hosking at cs.purdue.edu
>> Date: Wed, 5 Jan 2011 21:14:17 -0500
>> CC: m3commit at elegosoft.com
>> To: jay.krell at cornell.edu
>> 
>> On Jan 5, 2011, at 9:08 PM, Jay K wrote:
>> 
>> Tony, um..well, um.. first, isn't that how it already worked maybe?
>> Declaring a new local EF1 for each TRY? It looks like it.
>> I'll do more testing.
>> 
>> Yes, it did. I assume you simply have a local variable for each TRY
>> block that is a pointer now instead of a jmp_buf. Should be OK.
>> 
>> 
>> So the additional inefficiency is multiplied the same as the rest of
>> the preexisting inefficiency.
>> And the preexisting inefficiency is way more than the increase.
>> 
>> And second, either way, it could be better.
>> 
>> Basically, the model should be, that if a function has any try or lock,
>> it calls setjmp once.
>> And then, it should have one volatile integer, that in a sense
>> represents the line number.
>> But not really. It's like, every time you cross a TRY, the integer is
>> incremented, every time you
>> cross a finally or unlock, the integer is decremented. Or rather, the
>> value can be stored.
>> And then there is a maximum of one handler per function, it
>> switches on the integer
>> to decide where it got into the function and what it should do.
>> 
>> This is how other compilers work and it is a fairly simple sensible approach.
>> 
>> - Jay
>> 
>> 
>> ________________________________
>> Subject: Re: [M3commit] CVS Update: cm3
>> From: hosking at cs.purdue.edu
>> Date: Wed, 5 Jan 2011 20:49:24 -0500
>> CC: m3commit at elegosoft.com
>> To: jay.krell at cornell.edu
>> 
>> Note that you need a different jmpbuf for each nested TRY!
>> 
>> 
>> On Jan 5, 2011, at 8:33 PM, Jay K wrote:
>> 
>> oops, that's not how I thought it worked. I'll do more testing and fix
>> it -- check for NIL.
>> 
>> - Jay
>> 
>> ________________________________
>> Subject: Re: [M3commit] CVS Update: cm3
>> From: hosking at cs.purdue.edu
>> Date: Wed, 5 Jan 2011 20:23:09 -0500
>> CC: m3commit at elegosoft.com
>> To: jay.krell at cornell.edu
>> 
>> Ah, yes, I guess you need a different jmpbuf for each TRY. But now you
>> are allocating on every TRY where previously the storage was statically
>> allocated. Do you really think this is progress?
>> 
>> On Jan 5, 2011, at 5:40 PM, Jay K wrote:
>> 
>> I'm back with a full keyboard if more explanation is needed. The diff is
>> actually fairly small to read.
>> I understand it is definitely less efficient, a few more instructions
>> for every try/lock.
>> No extra function call, at least with gcc backend.
>> I haven't tested NT386 yet. Odds are so/so that it works -- the change
>> is written so that it should work
>> but I have to test it to be sure, will do roughly tonight. And there
>> probably is a function call there.
>> 
>> - Jay
>> 
>> ________________________________
>> From: jay.krell at cornell.edu
>> To: hosking at cs.purdue.edu
>> Date: Wed, 5 Jan 2011 20:44:08 +0000
>> CC: m3commit at elegosoft.com
>> Subject: Re: [M3commit] CVS Update: cm3
>> 
>> I only have phone right now. I think it is fairly clear: the jumpbuf in
>> EF1 is now allocated with alloca, and a pointer stored. It is
>> definitely a bit less efficient, but the significant advantage is
>> frontend no longer needs to know the size or alignment of a jumpbuf.
>> 
>> 
>> As well, there is no longer the problem regarding jumpbuf aligned to
>> more than 64 bits. I at least checked on Linux/PowerPC and alloca seems
>> to align to 16 bytes. I don't have an HPUX machine currently to see if
>> the problem is addressed there.
>> 
>> 
>> The inefficiency of course can be dramatically mitigated via a stack
>> walker. I wanted to do this first though, while more targets are using
>> setjmp.
>> 
>> - Jay/phone
>> 
>> ________________________________
>> Subject: Re: [M3commit] CVS Update: cm3
>> From: hosking at cs.purdue.edu
>> Date: Wed, 5 Jan 2011 13:35:59 -0500
>> CC: jkrell at elego.de; m3commit at elegosoft.com
>> To: jay.krell at cornell.edu
>> 
>> Can you provide a more descriptive checkin comment? I don't know what
>> has been done here without diving into the diff.
>> 
>> 
>> On Jan 5, 2011, at 9:37 AM, Jay K wrote:
>> 
>> diff attached
>> 
>>> Date: Wed, 5 Jan 2011 15:34:55 +0000
>>> To: m3commit at elegosoft.com
>>> From: jkrell at elego.de
>>> Subject: [M3commit] CVS Update: cm3
>>> 
>>> CVSROOT: /usr/cvs
>>> Changes by: jkrell at birch. 11/01/05 15:34:55
>>> 
>>> Modified files:
>>> cm3/m3-libs/m3core/src/C/Common/: Csetjmp.i3
>>> cm3/m3-libs/m3core/src/C/I386_CYGWIN/: Csetjmp.i3
>>> cm3/m3-libs/m3core/src/C/I386_MINGW/: Csetjmp.i3
>>> cm3/m3-libs/m3core/src/C/I386_NT/: Csetjmp.i3
>>> cm3/m3-libs/m3core/src/C/NT386/: Csetjmp.i3
>>> cm3/m3-libs/m3core/src/runtime/ex_frame/: RTExFrame.m3
>>> cm3/m3-libs/m3core/src/unix/Common/: Uconstants.c
>>> cm3/m3-sys/m3cc/gcc/gcc/m3cg/: parse.c
>>> cm3/m3-sys/m3front/src/misc/: Marker.m3
>>> cm3/m3-sys/m3front/src/stmts/: TryFinStmt.m3 TryStmt.m3
>>> cm3/m3-sys/m3middle/src/: M3RT.i3 M3RT.m3 Target.i3 Target.m3
>>> 
>>> Log message:
>>> use: extern INTEGER Csetjmp__Jumpbuf_size /* = sizeof(jmp_buf) */;
>>> alloca(Csetjmp__Jumpbuf_size)
>>> 
>>> to allocate jmp_buf
>>> 
>>> - eliminates a large swath of target-dependent code
>>> - allows for covering up the inability to declare
>>> types with alignment > 64 bits
>>> 
>>> It is, granted, a little bit slower, in an already pretty slow path.
>>> Note that alloca isn't actually a function call, at least with gcc backend.
>>> 