[M3commit] CVS Update: cm3

Thu Jan 6 21:33:21 CET 2011

Ok.
Do you know where to initialize the jmpbuf to NIL?

I have a diff that "works" (i.e., doesn't crash) and is *close* to correct:
it checks for NIL and branches around the alloca for non-NIL, but it
also reinitializes to NIL on every pass, so effectively nothing changes.

 
Index: misc/Marker.m3
===================================================================
RCS file: /usr/cvs/cm3/m3-sys/m3front/src/misc/Marker.m3,v
retrieving revision 1.7
diff -u -w -r1.7 Marker.m3

--- misc/Marker.m3    5 Jan 2011 14:34:54 -0000    1.7
+++ misc/Marker.m3    6 Jan 2011 20:32:00 -0000
@@ -233,6 +233,7 @@
 
 PROCEDURE CaptureState (frame: CG.Var;  handler: CG.Label) =
   VAR new: BOOLEAN;
+      label := CG.Next_label ();
   BEGIN
     (* int setjmp(void* ); *)
     IF (setjmp = NIL) THEN
@@ -263,18 +264,25 @@
                                         Target.Word.cg_type, 0);
     END;
     
+    (* IF frame.jmpbuf = NIL THEN *)
+
+      CG.Load_nil ();
+      CG.Load_addr (frame, M3RT.EF1_jmpbuf);
+      CG.If_compare (Target.Address.cg_type, CG.Cmp.NE, label, CG.Maybe);
+
     (* frame.jmpbuf = alloca(Csetjmp__Jumpbuf_size); *)
     CG.Start_call_direct (alloca, 0, Target.Address.cg_type);
     CG.Load_int (Target.Word.cg_type, Jumpbuf_size);
     CG.Pop_param (Target.Word.cg_type);
     CG.Call_direct (alloca, Target.Address.cg_type);
-    CG.Store (frame, M3RT.EF1_jmpbuf, Target.Address.size, Target.Address.align,
-              Target.Address.cg_type);
+      CG.Store_addr (frame, M3RT.EF1_jmpbuf);
+
+    (* END *)
+    CG.Set_label (label);
 
     (* setmp(frame.jmpbuf) *)
     CG.Start_call_direct (setjmp, 0, Target.Integer.cg_type);
-    CG.Load (frame, M3RT.EF1_jmpbuf, Target.Address.size, Target.Address.align,
-             Target.Address.cg_type);
+    CG.Load_addr (frame, M3RT.EF1_jmpbuf);
     CG.Pop_param (CG.Type.Addr);
     CG.Call_direct (setjmp, Target.Integer.cg_type);
     CG.If_true (handler, CG.Never);
cvs diff: Diffing stmts
Index: stmts/TryFinStmt.m3
===================================================================
RCS file: /usr/cvs/cm3/m3-sys/m3front/src/stmts/TryFinStmt.m3,v
retrieving revision 1.6
diff -u -w -r1.6 TryFinStmt.m3
--- stmts/TryFinStmt.m3    5 Jan 2011 14:34:54 -0000    1.6
+++ stmts/TryFinStmt.m3    6 Jan 2011 20:32:00 -0000
@@ -299,6 +299,10 @@
     CG.Load_nil ();
     CG.Store_addr (frame, M3RT.EF1_info + M3RT.EA_exception);
 
+    (* no jmpbuf yet (avoid repeated alloca in try within loop) *)
+    CG.Load_nil ();
+    CG.Store_addr (frame, M3RT.EF1_jmpbuf);
+
     l := CG.Next_label (3);
     CG.Set_label (l, barrier := TRUE);
     Marker.PushFrame (frame, M3RT.HandlerClass.Finally);
Index: stmts/TryStmt.m3
===================================================================
RCS file: /usr/cvs/cm3/m3-sys/m3front/src/stmts/TryStmt.m3,v
retrieving revision 1.3
diff -u -w -r1.3 TryStmt.m3
--- stmts/TryStmt.m3    5 Jan 2011 14:34:54 -0000    1.3
+++ stmts/TryStmt.m3    6 Jan 2011 20:32:00 -0000
@@ -10,7 +10,7 @@
 
 IMPORT M3, M3ID, CG, Variable, Scope, Exceptionz, Value, Error, Marker;
 IMPORT Type, Stmt, StmtRep, TryFinStmt, Token;
-IMPORT Scanner, ESet, Target, M3RT, Tracer;
+IMPORT Scanner, ESet, Target, M3RT, Tracer, IO;
 FROM Scanner IMPORT Match, MatchID, GetToken, Fail, cur;
 
 TYPE
@@ -411,6 +411,10 @@
     CG.Store_addr (frame, M3RT.EF1_exception);
     ***********************************************)
 
+    (* no jmpbuf yet (avoid repeated alloca in try within loop) *)
+    CG.Load_nil ();
+    CG.Store_addr (frame, M3RT.EF1_jmpbuf);
+
     IF (p.hasElse) THEN
       Marker.PushTryElse (l, l+1, frame);
       Marker.PushFrame (frame, M3RT.HandlerClass.ExceptElse);


The set_label before one of the PushEFrames could be moved down a bit,
to after the NIL initialization, and that'd fix some cases, but I think not all.
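
In C terms, the intent is roughly the sketch below. It is an illustration
only, not what the frontend emits: the Frame struct, try_in_loop and its
reset_each_iteration flag are made-up names. With the NIL store placed after
the loop label (flag set), the NIL check never helps and alloca runs on every
iteration; with the store hoisted above the label (flag clear), alloca runs once.

```c
#include <alloca.h>   /* alloca; non-standard, assumed available (glibc, BSD) */
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for the exception frame; only the jmpbuf field matters here. */
typedef struct {
    jmp_buf *jmpbuf;   /* NULL (NIL) until the first TRY entry allocates it */
} Frame;

/* Models a TRY inside a loop and returns how many times alloca ran. */
int try_in_loop(int iterations, int reset_each_iteration)
{
    Frame frame;
    int alloca_calls = 0;
    int i;

    assert(iterations >= 0);
    frame.jmpbuf = NULL;                 /* NIL store hoisted above the loop label */

    for (i = 0; i < iterations; i++) {
        if (reset_each_iteration)
            frame.jmpbuf = NULL;         /* NIL store after the label: runs every pass */

        if (frame.jmpbuf == NULL) {      /* IF frame.jmpbuf = NIL THEN */
            frame.jmpbuf = alloca(sizeof(jmp_buf));
            alloca_calls++;
        }                                /* END */

        if (setjmp(*frame.jmpbuf) == 0) {
            /* TRY body would go here */
        }
    }
    return alloca_calls;
}
```

Here try_in_loop(5, 1) returns 5 and try_in_loop(5, 0) returns 1, which is
the difference the hoisted store is meant to buy.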

Thanks,
- Jay


----------------------------------------
> Subject: Re: [M3commit] CVS Update: cm3
> From: hosking at cs.purdue.edu
> Date: Thu, 6 Jan 2011 14:11:04 -0500
> CC: m3commit at elegosoft.com
> To: jay.krell at cornell.edu
>
> At this point, we are trying to move away from the setjmp implementation to one that relies on unwind support, so I don't think the effort here is worthwhile.
>
> Antony Hosking | Associate Professor | Computer Science | Purdue University
> 305 N. University Street | West Lafayette | IN 47907 | USA
> Office +1 765 494 6001 | Mobile +1 765 427 5484
>
>
>
>
> On Jan 6, 2011, at 2:00 PM, Jay K wrote:
>
> >
> > I believe you can, but it'd take significant work in the frontend.
> > The jmpbuf should identify merely which procedure/frame to return to.
> > There would also be a volatile local integer, that gets altered at certain points through the function.
> > When setjmp returns exceptionally, you'd switch on that integer to determine where to "really" go.
> > This is analogous to how other systems work -- NT/x86 has a highly optimized frame based exception
> > handling. Instead of a generic thread local, FS:0 is reserved to be the head of the linked list of frames.
> > Instead of setjmp, the compiler pessimizes appropriately.
> >
> >
> > So the result is that a function with one or more tries, or one or more locals with destructors,
> > puts one node on the FS:0 list, and then mucks with the volatile local integer to indicate
> > where in the function it is.
> >
> >
> > If NT/x86 were inefficient in a way more analogous to current Modula-3, it'd link/unlink in FS:0 more often.
> >
> >
> > It is more work though, granted; I can understand that.
> > And given that we have a much better option for many platforms, the payoff would be reduced.
> >
> >
> > Anyway, I'm trying what you say, like for TRY within a loop.
> >
> >
> > I should point out that alloca has an extra inefficiency vs. the previous approach.
> > It aligns more. So it is using more stack than the other way.
> > And it might pessimize codegen in other ways.
> >
> >
> > The gcc code appears somewhat similar..I think the tables merely describe, again, which
> > function/frame to return to, and that within the frame there is a local integer to determine
> > more precisely what to do. I'm not sure. I saw mention of a switch.
> >
> >
> > - Jay
> >
> > ________________________________
> >> Subject: Re: [M3commit] CVS Update: cm3
> >> From: hosking at cs.purdue.edu
> >> Date: Thu, 6 Jan 2011 13:52:42 -0500
> >> CC: m3commit at elegosoft.com
> >> To: jay.krell at cornell.edu
> >>
> >> You can't have one jmpbuf per procedure. You need one per TRY scope,
> >> since they can be nested.
> >>
> >>
> >>
> >> On Jan 6, 2011, at 11:35 AM, Jay K wrote:
> >>
> >> Hm. How do I single instance the "EF1"? The current code allocates a
> >> local "EF1" for each try.
> >> I guess, really, it is EF1, EF2, etc.
> >> So there should be a separate local for the jmpbuf pointer, and store
> >> it in each EF* block?
> >> How do I make just one jmpbuf pointer? I couldn't easily figure out how
> >> to in the front end, I need to read it more.
> >>
> >> something like:
> >>
> >> PROCEDURE F1() = BEGIN TRY1 do stuff1 TRY2 do stuff 2 TRY3 do stuff 3
> >> END END END END F1;
> >> =>
> >>
> >> void F1()
> >> {
> >> jmp_buf* jb = 0;
> >> EF1 a,b,c;
> >> setjmp(a.jmpbuf = jb ? jb : (jb = alloca(sizeof(jmp_buf)))); // TRY1
> >> do stuff 1...
> >> setjmp(b.jmpbuf = jb ? jb : (jb = alloca(sizeof(jmp_buf)))); // TRY2
> >> do stuff 2...
> >> setjmp(c.jmpbuf = jb ? jb : (jb = alloca(sizeof(jmp_buf)))); // TRY3
> >> do stuff 3...
> >> }
> >>
> >> (The actual syntactic and semantic correctness of this code -- the
> >> existence of the ternary operator, and that it only evaluates one side
> >> or the other, and that assignment is expression..I quite like those
> >> features....)
> >>
> >>
> >> Still, something I can't pin down strikes me as too simple here.
> >>
> >>
> >> If there is just one setjmp, and no integer(s) to keep track of
> >> additional progress, you only ever know the last place you were in a
> >> function.
> >> That doesn't seem adequate.
> >>
> >>
> >> What if a function raises an exception, catches it within itself, and
> >> then raises something else, and then wants to catch that?
> >> It won't know where to resume, right? It'd just keep longjmping to the
> >> same place.
> >>
> >>
> >> In the Visual C++ runtime, there is "local unwind" and "global unwind".
> >> "local unwind" is like, "within the same function", "global unwind" is
> >> across functions.
> >> I think somehow that is related here.
> >>
> >>
> >> e.g. how would you ensure forward progress in this:
> >>
> >>
> >> EXCEPTION E1;
> >> EXCEPTION E2;
> >> EXCEPTION E3;
> >>
> >>
> >> PROCEDURE F4() RAISES ANY =
> >> CONST Function = "F4 ";
> >> BEGIN
> >> Put(Function & Int(Line())); NL();
> >> TRY
> >> Put(Function & Int(Line())); NL();
> >> TRY
> >> Put(Function & Int(Line())); NL();
> >> TRY
> >> Put(Function & Int(Line())); NL();
> >> RAISE E1;
> >> EXCEPT ELSE
> >> RAISE E2;
> >> END;
> >> EXCEPT ELSE
> >> RAISE E3;
> >> END;
> >> EXCEPT ELSE
> >> END;
> >> END F4;
> >>
> >>
> >> Oddly in my test p251, the stack depth is not increased by TRY.
> >>
> >>
> >> - Jay
> >>
> >> ________________________________
> >> Subject: Re: [M3commit] CVS Update: cm3
> >> From: hosking at cs.purdue.edu
> >> Date: Thu, 6 Jan 2011 09:22:09 -0500
> >> CC: m3commit at elegosoft.com
> >> To: jay.krell at cornell.edu
> >>
> >> I am OK with what you have currently:
> >>
> >> At each TRY:
> >>
> >> 1. Check if a corresponding alloca block has been allocated by checking
> >> if the corresponding local variable is NIL.
> >> 2. If not, then alloca and save its pointer in the local variable
> >> 3. Execute the try block.
> >>
> >> As you say, alloca should turn into an inline operation using the
> >> compiler's builtin implementation of alloca.
> >>
> >> On Jan 6, 2011, at 1:02 AM, Jay K wrote:
> >>
> >>> Code size will suffer.
> >>
> >>
> >> Indeed. Unoptimized code size does suffer a lot, in functions that use try.
> >> Calling alloca, unoptimized, isn't small, and this adds n calls for n trys.
> >> I thought it'd only be one call. I didn't realize our implementation
> >> is as poor as it is, since a better but still
> >> portable implementation doesn't seem too too difficult.
> >>
> >>
> >> Can we maybe do the optimizations I indicate -- no more than one
> >> setjmp/alloca/pushframe per function?
> >> Using a local integer to record the position within the function?
> >>
> >>
> >> Or just give me a week or a few to get stack walking working and then
> >> live with the regression on other targets?
> >> (NT386 isn't likely to get stack walking, though it *is* certainly
> >> possible; NT does have a decent runtime here..)
> >>
> >>
> >> It *is* nice to not have the frontend know about jmpbuf size.
> >>
> >>
> >> I looked into the "builtin_setjmp" stuff, but it can't be used so easily.
> >> It doesn't work for intra-function jumps, only inter-function.
> >>
> >>
> >> - Jay
> >>
> >>
> >> ________________________________
> >> From: jay.krell at cornell.edu
> >> To: hosking at cs.purdue.edu
> >> CC: m3commit at elegosoft.com
> >> Subject: RE: [M3commit] CVS Update: cm3
> >> Date: Thu, 6 Jan 2011 04:52:33 +0000
> >>
> >> Ah..I'm doing more comparisons of release vs. head...but..I guess your
> >> point is, you'd rather have n locals, which the backend automatically
> >> merges, than n calls to alloca?
> >> It's not a huge difference -- there are still going to be n calls to
> >> setjmp and n calls to pthread_getspecific.
> >> The alloca calls will be dwarfed.
> >> Code size will suffer.
> >>
> >>
> >> And, even so, there are plenty of optimizations to be had, even if
> >> setjmp/pthread_getspecific is used.
> >>
> >>
> >> - It could make a maximum of one call to setjmp/pthread_getspecific
> >> per function
> >> - The calls to alloca could be merged. The frontend could keep track
> >> of how many calls it makes per function,
> >> issue a multiplication, and offset each jmpbuf. It is a tradeoff.
> >>
> >>
> >> So, yes, given my current understanding, it is progress.
> >> The target-dependence is not worth it, imho.
> >> I'll still do some comparisons to release.
> >>
> >>
> >> I'll still be looking into using the gcc unwinder relatively soon.
> >>
> >>
> >> - Jay
> >>
> >>
> >> ________________________________
> >> Subject: Re: [M3commit] CVS Update: cm3
> >> From: hosking at cs.purdue.edu
> >> Date: Wed, 5 Jan 2011 21:14:17 -0500
> >> CC: m3commit at elegosoft.com
> >> To: jay.krell at cornell.edu
> >>
> >> On Jan 5, 2011, at 9:08 PM, Jay K wrote:
> >>
> >> Tony, um..well, um.. first, isn't that how it already worked maybe?
> >> Declaring a new local EF1 for each TRY? It looks like it.
> >> I'll do more testing.
> >>
> >> Yes, it did. I assume you simply have a local variable for each TRY
> >> block that is a pointer now instead of a jmp_buf. Should be OK.
> >>
> >>
> >> So the additional inefficiency is multiplied the same as the rest of
> >> the preexisting inefficiency.
> >> And the preexisting inefficiency is way more than the increase.
> >>
> >> And second, either way, it could be better.
> >>
> >> Basically, the model should be, that if a function has any try or lock,
> >> it calls setjmp once.
> >> And then, it should have one volatile integer, that in a sense
> >> represents the line number.
> >> But not really. It's like, every time you cross a TRY, the integer is
> >> incremented, every time you
> >> cross a finally or unlock, the integer is decremented. Or rather, the
> >> value can be stored.
> >> And then there is a maximum of one handler per function, it
> >> switches on the integer
> >> to decide where it got into the function and what it should do.
> >>
> >> This is how other compilers work and it is a fairly simple sensible approach.
> >>
> >> - Jay
> >>
> >>
> >> ________________________________
> >> Subject: Re: [M3commit] CVS Update: cm3
> >> From: hosking at cs.purdue.edu
> >> Date: Wed, 5 Jan 2011 20:49:24 -0500
> >> CC: m3commit at elegosoft.com
> >> To: jay.krell at cornell.edu
> >>
> >> Note that you need a different jmpbuf for each nested TRY!
> >>
> >>
> >> On Jan 5, 2011, at 8:33 PM, Jay K wrote:
> >>
> >> oops, that's not how I thought it worked. I'll do more testing and fix
> >> it -- check for NIL.
> >>
> >> - Jay
> >>
> >> ________________________________
> >> Subject: Re: [M3commit] CVS Update: cm3
> >> From: hosking at cs.purdue.edu
> >> Date: Wed, 5 Jan 2011 20:23:09 -0500
> >> CC: m3commit at elegosoft.com
> >> To: jay.krell at cornell.edu
> >>
> >> Ah, yes, I guess you need a different jmpbuf for each TRY. But now you
> >> are allocating on every TRY where previously the storage was statically
> >> allocated. Do you really think this is progress?
> >>
> >> On Jan 5, 2011, at 5:40 PM, Jay K wrote:
> >>
> >> I'm back with a full keyboard if more explanation is needed. The diff is
> >> actually fairly small to read.
> >> I understand it is definitely less efficient, a few more instructions
> >> for every try/lock.
> >> No extra function call, at least with gcc backend.
> >> I haven't tested NT386 yet. Odds are so/so that it works -- the change
> >> is written so that it should work
> >> but I have to test it to be sure; will do, roughly tonight. And there
> >> probably is a function call there.
> >>
> >> - Jay
> >>
> >> ________________________________
> >> From: jay.krell at cornell.edu
> >> To: hosking at cs.purdue.edu
> >> Date: Wed, 5 Jan 2011 20:44:08 +0000
> >> CC: m3commit at elegosoft.com
> >> Subject: Re: [M3commit] CVS Update: cm3
> >>
> >> I only have a phone right now. I think it is fairly clear: the jumpbuf in
> >> EF1 is now allocated with alloca, and a pointer stored. It is
> >> definitely a bit less efficient, but the significant advantage is
> >> frontend no longer needs to know the size or alignment of a jumpbuf.
> >>
> >>
> >> As well, there is no longer the problem regarding jumpbuf aligned to
> >> more than 64 bits. I at least checked on Linux/PowerPC and alloca seems
> >> to align to 16 bytes. I don't have an HPUX machine currently to see if
> >> the problem is addressed there.
> >>
> >>
> >> The inefficiency of course can be dramatically mitigated via a stack
> >> walker. I wanted to do this first though, while more targets are using
> >> setjmp.
> >>
> >> - Jay/phone
> >>
> >> ________________________________
> >> Subject: Re: [M3commit] CVS Update: cm3
> >> From: hosking at cs.purdue.edu
> >> Date: Wed, 5 Jan 2011 13:35:59 -0500
> >> CC: jkrell at elego.de; m3commit at elegosoft.com
> >> To: jay.krell at cornell.edu
> >>
> >> Can you provide a more descriptive checkin comment? I don't know what
> >> has been done here without diving into the diff.
> >>
> >>
> >> On Jan 5, 2011, at 9:37 AM, Jay K wrote:
> >>
> >> diff attached
> >>
> >>> Date: Wed, 5 Jan 2011 15:34:55 +0000
> >>> To: m3commit at elegosoft.com
> >>> From: jkrell at elego.de
> >>> Subject: [M3commit] CVS Update: cm3
> >>>
> >>> CVSROOT: /usr/cvs
> >>> Changes by: jkrell at birch. 11/01/05 15:34:55
> >>>
> >>> Modified files:
> >>> cm3/m3-libs/m3core/src/C/Common/: Csetjmp.i3
> >>> cm3/m3-libs/m3core/src/C/I386_CYGWIN/: Csetjmp.i3
> >>> cm3/m3-libs/m3core/src/C/I386_MINGW/: Csetjmp.i3
> >>> cm3/m3-libs/m3core/src/C/I386_NT/: Csetjmp.i3
> >>> cm3/m3-libs/m3core/src/C/NT386/: Csetjmp.i3
> >>> cm3/m3-libs/m3core/src/runtime/ex_frame/: RTExFrame.m3
> >>> cm3/m3-libs/m3core/src/unix/Common/: Uconstants.c
> >>> cm3/m3-sys/m3cc/gcc/gcc/m3cg/: parse.c
> >>> cm3/m3-sys/m3front/src/misc/: Marker.m3
> >>> cm3/m3-sys/m3front/src/stmts/: TryFinStmt.m3 TryStmt.m3
> >>> cm3/m3-sys/m3middle/src/: M3RT.i3 M3RT.m3 Target.i3 Target.m3
> >>>
> >>> Log message:
> >>> use: extern INTEGER Csetjmp__Jumpbuf_size /* = sizeof(jmp_buf);
> >>> alloca(Csetjmp__Jumpbuf_size)
> >>>
> >>> to allocate jmp_buf
> >>>
> >>> - eliminates a large swath of target-dependent code
> >>> - allows for covering up the inability to declare
> >>> types with alignment > 64 bits
> >>>
> >>> It is, granted, a little bit slower, in an already pretty slow path.
> >>> Note that alloca isn't actually a function call, at least with gcc backend.
> >>>