[M3devel] better typing on SUBARRAY temporaries?

Rodney M. Bates rodney_bates at lcwb.coop
Sun Aug 2 02:43:43 CEST 2015



On 07/26/2015 02:23 AM, Peter McKinna wrote:
> So far I am not doing anything with free_temp; declare_temp just does an alloca, but does it at the end of the first basic block to avoid "dominates all uses" problems. (I have since discovered that there is a flag, -alloca-hoisting, which does this for you.) The language reference hardly mentions temps, except that they are referred to as unnamed values, i.e. %1, %2, etc. Whether the optimiser does anything with these I don't know. I think I'm giving the temps names like %tmp.1, etc.
>    The workaround I have at the moment is in the store (which I think is the first place one of these temps is referenced): if the var is a temp, i.e. has NoID, and is not declared in the current procedure, then declare it, i.e. alloca it. It seems to work but is pretty kludgy.
>    My first thought when I struck this problem was that I should put all temps on the static link (since there is no parameter in declare_temp to say it's uplevel, and hence which temps could be nested-proc material), but that didn't work, and really it was a pretty dumb idea. I can't see why the front end couldn't just declare a new temp in the finally proc.
>

Commit 496e9be1dcdcf87bda8e72239fc90132591b4cf4 fixes this, for this test case.

> Regards Peter
>
> On Sun, Jul 26, 2015 at 2:26 AM, Rodney M. Bates <rodney_bates at lcwb.coop> wrote:
>
>     More on this:
>
>     It appears that, in the CM3 IR, "variables" declared by declare_local and ones
>     declared by declare_temp (and a few others, too), while declared with different
>     operators, are accessible interchangeably.  In this example, the FINALLY procedure
>     accesses both s and the temporary nonlocally, in the same way.  The difference
>     is, the temp, unlike a local variable, has the possibility that its space in
>     the activation record is freed prior to the end of the corresponding code.
>
>     What llvm IR are you translating declare_temp and free_temp into?   Llvm might
>     have different rules for temps.
>
>
>
>
>     On 07/25/2015 10:25 AM, Rodney M. Bates wrote:
>
>         I compiled this and looked at the cm3 IR for it.  At first glance, it looks like
>         a bug in the front end.  The FINALLY code is translated as a nested procedure,
>         inside the one containing the entire TRY..FINALLY statement.  The temp is created by
>         a declare_temp in the outer procedure, freed by a free_temp in the inner one,
>         and used in both, the 2 uses being unrelated.
>
>         In m3-sys/m3middle/src/M3CG_Ops.i3, I see:
>         -----------------------------------------------------------------------------
>            declare_temp (s: ByteSize;  a: Alignment;  t: Type;
>                         in_memory: BOOLEAN): Var;
>         (* declares an anonymous local variable.  Temps are declared
>              and freed between their containing procedure's begin_procedure and
>              end_procedure calls.  Temps are never referenced by nested procedures. *)
>                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>         free_temp (v: Var);
>         (* releases the space occupied by temp 'v' so that it may be reused by
>              other new temporaries. *)
>         -----------------------------------------------------------------------------
>
>         And it also seems strange for the temp to be freed inside the nested procedure,
>         although this does not violate what the comment says.  The fact that every
>         temp has a unique variable number would take care of matching the free_temp to the
>         right declare_temp, and maybe the code generators can handle freeing a nonlocal
>         temp.
>
>         Apparently, this has caused no problems with preexisting code generators.
>         But it certainly looks wrong, and clearly violates the comments.
>
>         I recall that the gcc-derived code generator and the integrated x86 code generator
>         both unnest nested procedures, in opposite ways (one pulls them out in front, the
>         other in back), which might have something to do with how they handle this.
>
>         What happens in the llvm back end for a programmer-declared nested procedure
>         making a nonlocal reference to a programmer-declared local variable of the
>         containing procedure?  If you can handle this latter case, can you handle the
>         failing one the same way?  Maybe this is what is happening in the other code
>         generators, and the comment is just too strict.
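
To make the question above concrete, here is a minimal C sketch of a FINALLY body that has been unnested into its own function and reaches both a programmer-declared local and a compiler temp through a pointer to the outer frame (playing the role of the static link). This is only an illustration under assumptions: the names OuterFrame, finally_body and outer are hypothetical, fixed-size buffers stand in for TEXT values, exceptions are ignored, and nothing here is taken from any CM3 back end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical activation record of the outer procedure. It holds a
   programmer-declared local (s) and a compiler temp (tmp) side by side. */
typedef struct {
    char s[16];
    char tmp[16];
} OuterFrame;

/* The FINALLY body as a separate function. It accesses s and tmp
   nonlocally, in the same way, through the frame pointer. */
static void finally_body(OuterFrame *up)
{
    size_t n = strlen(up->s);
    assert(n + 2 <= sizeof up->tmp);
    memcpy(up->tmp, up->s, n);        /* tmp := s & "b" */
    up->tmp[n] = 'b';
    up->tmp[n + 1] = '\0';
    memcpy(up->s, up->tmp, n + 2);    /* s := tmp */
}

/* The outer procedure: TRY s := s & "a" FINALLY s := s & "b" END. */
static void outer(const char *initial, char *out, size_t out_size)
{
    OuterFrame f;
    size_t n = strlen(initial);
    assert(n + 3 <= sizeof f.s && n + 3 <= out_size);
    memcpy(f.s, initial, n + 1);
    memcpy(f.tmp, f.s, n);            /* TRY body: tmp := s & "a" */
    f.tmp[n] = 'a';
    f.tmp[n + 1] = '\0';
    memcpy(f.s, f.tmp, n + 2);        /* s := tmp */
    finally_body(&f);                 /* pass the "static link" */
    memcpy(out, f.s, strlen(f.s) + 1);
}
```

In this sketch the temp lives in the same frame as the local, so the nested function needs no special case for it; whether a given back end can arrange that is exactly the open question.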
>
>
>
>
>         On 07/24/2015 08:13 PM, Peter McKinna wrote:
>
>             On the subject of temporaries can I get your thoughts on the following?
>
>             TRY
>                 s := s & "a";
>             FINALLY
>                 s := s & "b";
>             END;
>
>             The front end declares a temp in the try block as part of its concat and then refers to the same temp in the finally block. The trouble is that the finally code is generated in a separate procedure and references a temp declared in the proc of the try. In llvm the first you know of the problem is a store to the temp which has not been declared.
>             Just wondering whether the front end should redeclare this temp? Also is the front end generating similar temps for other runtime operations?
>
>             Regards Peter
>
>             On Sat, Jul 25, 2015 at 12:57 AM, Rodney M. Bates <rodney_bates at lcwb.coop> wrote:
>
>
>
>                  On 07/24/2015 03:57 AM, Jay K wrote:
>
>                      I model this in C like:
>
>
>                      M3C.m3:
>                                print(self, "/*declare_open_array*/typedef struct {");
>
>
>                  Presumably, this comment means you have some way of knowing, in M3C, that this is an open array?
>
>                                print(self, element_type.text);
>                                print(self, "* _elts; CARDINAL _size");
>                                IF bit_size > Target.Integer.size * 2 THEN
>                                    print(self, "s[");
>                                    print(self, IntToDec((bit_size - Target.Integer.size) DIV Target.Integer.size));
>                                    print(self, "]");
>                                END;
>
>
>                      that is... and it is stinky that all I get is the struct "size":
>
>
>                      size == 2 * sizeof(integer):
>
>
>                      struct {
>                      T* elements;
>                      size_t size;
>                      }
>
>
>                      else:
>                         N = (size - sizeof(INTEGER)) / sizeof(INTEGER)
>                         T ** elements; // where the number of stars is N
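
The size arithmetic above can be sketched in C as follows. This is a hedged illustration, not code from M3C: the function name open_depth_from_size is hypothetical, and it assumes the dope is one data pointer followed by one integer per open dimension, with pointer and integer the same size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of open dimensions implied by a dope of
   dope_bytes bytes, where word_bytes is the (assumed common) size of a
   pointer and of an integer. */
static size_t open_depth_from_size(size_t dope_bytes, size_t word_bytes)
{
    assert(word_bytes > 0);
    assert(dope_bytes % word_bytes == 0);
    assert(dope_bytes >= 2 * word_bytes);  /* pointer plus at least one count */
    /* Subtract the pointer first, then divide: the parentheses matter. */
    return (dope_bytes - word_bytes) / word_bytes;
}
```

With 8-byte words, a 16-byte dope gives depth 1 (the two-field struct case) and a 32-byte dope gives depth 3.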
>
>
>                  I would not do pointer to pointer to pointer ... here.  Just "T* elements", regardless
>                  of the number of open dimensions.  As I understand them, C's language-supported
>                  multidimensional arrays are like Modula3 multidimensional _fixed_ arrays, i.e.,
>                  the language does the multi-subscript address arithmetic, which means the type
>                  system needs to provide a static element size of each dimension.  And that,
>                  except for the innermost, depends on the element count of the next inner
>                  dimension, which is not static here.
>
>                  So, the multiple dimensions are essentially flattened into one, and access to A[I,J,K]
>                  is lowered by the front end into explicit address arithmetic into the flattened
>                  array, using the values in the shape.  Something like A[(I*Shape[1]+J)*Shape[2]+K]
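
As a rough illustration of the flattening described above (not code from the CM3 tree), here is a minimal C sketch. It assumes row-major order and a shape array holding one element count per open dimension; the name flat_index3 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: offset of A[i,j,k] within a flattened, row-major
   3-dimensional open array. shape[d] is the element count of dimension d,
   as it would be read from the dope at run time. */
static size_t flat_index3(const size_t shape[3], size_t i, size_t j, size_t k)
{
    /* Bounds come from the shape at run time, not from the C type. */
    assert(i < shape[0] && j < shape[1] && k < shape[2]);
    return (i * shape[1] + j) * shape[2] + k;
}
```

Note that the outermost count, shape[0], is needed only for the bounds check; the address arithmetic uses the counts of the inner dimensions.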
>
>                          size_t sizes[N]
>
>
>                      It is kind of lame that the frontend just gives the overall size
>                      and the backend is just left to assume the layout like that.
>
>
>                  If you know in M3C that it's an open array, you can infer what the layout is.
>                  But yes, it's kind of lame.  This is just another of the several places we have
>                  seen that the front end has lowered things too far, and you have to
>                  You know it's generated by code in OpenArray.m3, and you know the general dope layout,
>                  and open dimension count, so you can generate an appropriate type.
>
>
>
>                      Really, the frontend should declare a type "pointer to pointer to pointer" with
>                      the right "depth", and then a record with that pointer and a size or fixed size array of sizes.
>
>
>                      Again, really ugly how it works now where backend is just given a size and can only
>                      assume the layout.
>
>                      I don't know what a "dope vector" is. A value used as an initializer?
>
>
>                  This is a very old and rather uninformative term for any block of stuff stored at runtime
>                  that describes some source programmer's real data and how to access it.  The only alternative
>                  term I can think of would be "metadata", although that is rather overgeneral, and is usually
>                  used with quite different specific meanings.  But it is data describing data.
>
>
>                      It is even worse for subarray. In this case we aren't even told it is an open array, just
>                      some random struct with a size. That is what I first want to fix. It should declare an open array,
>                      assuming they do have the same layout, which I think they do.
>
>
>                  What you really need to know is that it's an open array, of which subarray is a subcategory.
>                  In our implementation, all open array values have the same dope.  E.g., look for the case where
>                  a fixed array actual parameter is passed to an open array formal.
>
>
>                      subarray temporaries and jmpbufs are I believe the only place the frontend passes so little
>                      type information.
>
>
>                      For jmpbufs I'm hoping to notice their name, and, unfortunately expensive, replace them
>                      with #include <setjmp.h> and jmp_buf, instead of just a struct with an array of bytes.
>
>
>
>
>                         - Jay
>
>
>                        > Date: Wed, 22 Jul 2015 19:30:12 -0500
>                        > From: rodney_bates at lcwb.coop
>                        > To: m3devel at elegosoft.com
>                        > Subject: Re: [M3devel] better typing on SUBARRAY temporaries?
>                        >
>                        > I'm not exactly sure what you are asking, but here is some light on what
>                        > you are seeing. These temporaries are exactly the dope the compiler uses
>                        > to represent all open array values. First a pointer to the zeroth
>                        > array element, then the "shape", as defined in M3 definition, 2.2.3, i.e.
>                        > an array of element counts for each open subscript. For an open array
>                        > parameter, this would be the machine representation of the parameter
>                        > itself, authored in M3 (but passed by reference). For a heap object,
>                        > it is stored right before the elements themselves. For a SUBARRAY
>                        > expression, it has to be a temporary. It also has to be constructed
>                        > at the call site, as an anonymous temporary, when passing an entire fixed
>                        > array to an open array parameter.
>                        >
>                        > So, a good type for it might look like:
>                        >
>                        >
>                        > RECORD
>                        > Elements : REF ARRAY [0..Max], ... , [0..Max] OF ElementType
>                        > ; Shape : ARRAY [0..OpenDepth-1] OF CARDINAL
>                        > END
>                        >
>                        > Max will be like the notorious TextLiteral.MaxBytes, i.e., we don't want any
>                        > static limit here in the type of Elements, as it will be enforced dynamically,
>                        > using Shape. But we don't want to just say REF ARRAY OF ElementType either,
>                        > as this would mean another open array inside the dope, resulting in infinite
>                        > recursion.
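
For readers more at home in C, the dope described above might be rendered roughly as below. This is a sketch under assumptions, not what any CM3 back end emits: the names OPEN_DEPTH, OpenArrayDope2, make_dope and dope_count are hypothetical, the element type is fixed to int, the depth is fixed to 2, and size_t stands in for CARDINAL.

```c
#include <assert.h>
#include <stddef.h>

#define OPEN_DEPTH 2  /* hypothetical: number of open dimensions */

/* A flat pointer to element zero, followed by one element count per
   open dimension (the "shape"). */
typedef struct {
    int   *elements;
    size_t shape[OPEN_DEPTH];
} OpenArrayDope2;

/* Build the dope at the call site, as when an entire fixed array is
   passed to an open array parameter. */
static OpenArrayDope2 make_dope(int *first, size_t rows, size_t cols)
{
    OpenArrayDope2 d;
    assert(first != NULL);
    d.elements = first;
    d.shape[0] = rows;
    d.shape[1] = cols;
    return d;
}

/* Total element count, computed from the shape at run time. */
static size_t dope_count(const OpenArrayDope2 *d)
{
    return d->shape[0] * d->shape[1];
}
```

The pointer is deliberately a plain element pointer with no static bounds, which mirrors the point that the limits are enforced dynamically from the shape.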
>                        >
>                        > On 07/22/2015 12:42 AM, Jay K wrote:
>                        > > In the C backend I have a notion of "weak struct types" and "strong struct types".
>                        > >
>                        > >
>                        > > "Strong" types have fields with types and names corresponding to the original Modula-3. i.e. they debug well.
>                        > >
>                        > >
>                        > > "Weak" types have just arrays of characters (in a struct), sized/aligned to what the front end asked for. i.e. they debug poorly.
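
The weak/strong distinction can be pictured with a small C sketch. It is an illustration only: the names WeakDope, StrongDope and strong_size_from_weak are hypothetical, the union member used to force alignment is an assumption about how the requested alignment could be honoured, and the two layouts are assumed to have the same size on the target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* "Weak": only size and alignment survive; a debugger shows raw bytes. */
typedef union {
    char  bytes[sizeof(void *) + sizeof(size_t)];
    void *align;  /* assumption: forces pointer alignment */
} WeakDope;

/* "Strong": named, typed fields that mirror the source-level meaning. */
typedef struct {
    char  *elements;
    size_t size;
} StrongDope;

/* Recover the typed view from the weak bytes. */
static size_t strong_size_from_weak(const WeakDope *w)
{
    StrongDope s;
    assert(sizeof s <= sizeof *w);  /* layouts assumed to agree in size */
    memcpy(&s, w->bytes, sizeof s);
    return s.size;
}
```

Both types occupy the same storage, so the generated code behaves identically; the difference is only in what the C compiler and the debugger know about the fields.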
>                        > >
>                        > >
>                        > >
>                        > > Originally I had only weak types.
>                        > > Ideally I have no weak types.
>                        > > I'm down to very few weak types now.
>                        > > I'd like to finish eliminating weak types.
>                        > >
>                        > >
>                        > >
>                        > > A quick investigation shows weak types come from open arrays and jmpbufs.
>                        > > Open array temporaries from SUBARRAY specifically.
>                        > >
>                        > >
>                        > >
>                        > > Can we fix this?
>                        > >
>                        > >
>                        > >
>                        > > We have:
>                        > > m3front/src/types/OpenArrayType.m3:
>                        > >
>                        > > PROCEDURE DeclareTemp (t: Type.T): CG.Var =
>                        > > VAR
>                        > > p := Reduce (t);
>                        > > size := Target.Address.pack + OpenDepth (p) * Target.Integer.pack;
>                        > > BEGIN
>                        > > RETURN CG.Declare_temp (size, Target.Address.align,
>                        > > CG.Type.Struct, in_memory := TRUE);
>                        > > END DeclareTemp;
>                        > >
>                        > >
>                        > > PROCEDURE Compiler (p: P) =
>                        > > VAR size := Target.Address.pack + OpenDepth (p) * Target.Integer.pack;
>                        > > BEGIN
>                        > > Type.Compile (p.element);
>                        > > CG.Declare_open_array (Type.GlobalUID(p), Type.GlobalUID(p.element), size);
>                        > > END Compiler;
>                        > >
>                        > >
>                        > > DeclareTemp is used in SUBARRAY expressions -- truly temporaries,
>                        > > not variables authored by anyone in Modula-3.
>                        > >
>                        > >
>                        > > Can this be easily fixed?
>                        > >
>                        > >
>                        > > Thanks,
>                        > > - Jay
>                        > >
>                        > >
>                        > >
>                        > > _______________________________________________
>                        > > M3devel mailing list
>                        > > M3devel at elegosoft.com
>                        > > https://mail.elegosoft.com/cgi-bin/mailman/listinfo/m3devel
>                        > >
>                        >
>                        > --
>                        > Rodney Bates
>                        > rodney.m.bates at acm.org
>
>
>                  --
>                  Rodney Bates
>                  rodney.m.bates at acm.org
>
>
>
>
>     --
>     Rodney Bates
>     rodney.m.bates at acm.org
>
>

-- 
Rodney Bates
rodney.m.bates at acm.org


