<html>

<head>

<style><!--

.hmmessage P

{

margin:0px;

padding:0px

}

body.hmmessage

{

font-size: 12pt;

font-family:Calibri

}

--></style></head>

<body class='hmmessage'><div dir='ltr'><strong>  >  M3CG.pop_param(CGType, TypeUID, BitSize); </strong><br> <br><br>I was able to get away without this because I know<br>the signature of the function for direct calls.<br><br><br>So far I don't strongly type function pointers.<br>And because of the static link, I'm not sure I can.<br>I'm not sure it matters.<br>For debugging, it only matters when you step through<br>to the called function, and I declare them all now quite<br>typefully.<br><br><br>I still think this is a very good idea though.<br><br><br> - Jay<br><br><br><div><div id="SkyDrivePlaceholder"></div><hr id="stopSpelling">From: jay.krell@cornell.edu<br>To: hosking@cs.purdue.edu<br>Date: Thu, 11 Apr 2013 22:19:23 +0000<br>CC: m3devel@elegosoft.com<br>Subject: Re: [M3devel] many matters big and small esp. wrt C backend<br><br>



<div dir="ltr">It is a mix of proposals and what-do-people-prefer-among-working-options.<br>Let me try again. Just some of it.<br><br> <br><strong>1) Stronger typing on pop_param.</strong><br><br> <br>Given<br>INTERFACE I;<br>PROCEDURE A(VAR a:INTEGER);<br>PROCEDURE B() = VAR b: INTEGER; BEGIN A(b); END B;<br> <br><br>I want<br>  void I__A(INTEGER* a);  <br>  void I__B() { INTEGER b; I__A(&amp;b) or I__A((INTEGER*)&amp;b);  } <br> <br><br>Unnecessary casts are ok.<br> <br> <br>This is not directly supported in M3CG:<br>  current: M3CG.pop_param(type := M3CG.Addr); (* Addr of what ? *) <br> <br> <br> but it might be indirectly supported, i.e. if I look at the function type. <br>  Currently, on my machine, I cast to void*, but that isn't valid C++, only C.<br>  I might be able to cast the function itself to "untyped" and get away with it, but<br>  that is kind of ugly too.<br><br> <br> Proposal, something like: <br>  M3CG.pop_param(CGType, TypeUID); <br> <br> <br> but, furthermore, pop_struct doesn't need to be separate, so just: <br><strong>   M3CG.pop_param(CGType, TypeUID, BitSize); </strong><br> <br> <br> and remove pop_struct. <br> <br> <br> Much longer term, merely: <br>   M3CG.pop_param(TypeUID); <br><br> suffices. CGType is generally redundant with TypeUIDs; however, <br> existing backends ignore TypeUIDs and get by with CGType. <br> <br> <br><strong>2) Preference for how to handle passing structs by value in C backend?</strong><br> <br><br>There are two obvious choices.<br>I'm doing it "manually" because, perhaps, I didn't have good type information.<br>I have good type information now, so I can use the C/C++ feature of passing structs<br>by value, instead of passing a pointer and copying into a local.<br> <br><br>Works either way.<br>No M3CG interface change.<br> <br> <br><strong>3) Small dense TypeIndexes mostly-but-not-entirely in place of TypeUIDs.</strong><br> <br> <br> TypeUIDs imply a lot of "lookups" in the backend. 
Slow seeming.<br> It'd be nice if we had a "linear" TypeIndex as well, that could be indices into a "full" array. <br> <br><br>Proposal:<br> TypeIndex = CARDINAL; (* index into an array *) (* a separate typename here isn't all that valuable *) <br> M3CG.declare_typeid or declare_typeindex or declare_type(TypeUID, TypeIndex);<br> and possibly<br>  M3CG.declare_type_count(CARDINAL); (* maximum value of TypeIndex + 1, backends can allocate<br>  arrays of this size and then index them by TypeIndex later *)<br><br> <br>TypeIndexes should take on the values roughly [0..N-1] where N is the number<br>of types in the "program" (or unit..)<br>and then replace TypeUID everywhere else with TypeIndex.<br> <br><br>Depending on how the frontend flows, it might not be able to compute TypeCount early enough.<br>That is, I don't know if m3cg calls happen "during compilation" or only "at the end".<br> <br><br>As well, this likely implies the same perf/lookups in the frontend.<br>Moving rather than eliminating cost.<br>However, it'd save it from multiple backends, and the frontend might already be paying this cost,<br>I haven't looked yet.<br> <br><br>It is ok and works today, but I'd really rather have "small" dense integers that can index into<br>an array than "random" integers that force me to use something like a hash table or a binary<br>search of a sorted array.<br> <br><br>Thanks,<br> - Jay<br><br><br><br><br> <br><div><div id="ecxSkyDrivePlaceholder"></div><hr id="ecxstopSpelling">CC: m3devel@elegosoft.com<br>From: hosking@cs.purdue.edu<br>Subject: Re: [M3devel] many matters big and small esp. wrt C backend<br>Date: Fri, 12 Apr 2013 07:12:42 +1000<br>To: jay.krell@cornell.edu<br><br><div>Hi Jay,</div><div><br></div><div>Is there any chance you could distill this stream of consciousness into an organized proposal of alternatives?  
I find it difficult to extract your precise proposals and arguments from this.</div><div><br></div><div>--Tony<br><br>Sent from my iPad</div><div><br>On Apr 12, 2013, at 4:57 AM, Jay K <<a href="mailto:jay.krell@cornell.edu">jay.krell@cornell.edu</a>> wrote:<br><br></div><blockquote><div>



<div dir="ltr"> The C/C++ backend has worked for a while now, and is improving nicely (wrt debuggability). <br> <br><br> Here are some current problems/dilemmas. <br> <br><br>  --- getting pointer parameters correctly typed esp. on passing side --- <br> <br> <br> Up until now, any pointer parameter to a function has been typed as char*.  <br> <br><br> (I have some preference for char* over void* because 1) it is valid pre-ANSI 2) you can do<br>  math on them; however given that char* is usually wrong, void* actually debugs better,<br>  showing nothing instead of garbage. gcc does allow math on void* but it<br>  is an extension -- <a href="http://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html#Pointer-Arith" target="_blank">http://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html#Pointer-Arith</a>;<br>  I should probably use void* with gcc and wherever the extension is supported...autoconf...)<br> Anyway..<br> Every pointer passed is cast to char*. This applies to VAR, READONLY, and records-by-value (more later). <br> This works but is bad for debugging (again: void* is better than char*, but both are bad) <br> I've been working on this. <br> <br><br> The interface to the backend is <br>  pop_param(cgtype)  <br>  pop_struct(typeid, size)  <br> <br><br> non-struct readonly and var parameters are just: <br>   pop_param(cgtype.addr)<br> <br><br> It would be nice if the frontend also passed a typeid here, <br> and either used the typeid from a "declare_indirect" or "declare_pointer" <br> or had separate booleans/flags for readonly/var. <br> <br><br> How about: <br>  TYPE ParameterMode = {Value (* or Normal or None? *), Var, ReadOnly};   <br>  PROCEDURE pop_param(cgtype: CGType; typeid: TypeUID; mode: ParameterMode);  <br> <br> <br> might as well merge this with pop_struct, no further change required probably.<br><br> Or, at worst, another mode: <br>  TYPE ParameterMode = {Value (* or Normal or None? 
*), Var, ReadOnly, StructByValue};   <br>  PROCEDURE pop_param(cgtype: CGType; typeid: TypeUID; mode: ParameterMode; bitSize: BitSize);  <br> <br> <br> or less abstract:<br>  TYPE ParameterMode = {Value, Pointer, StructByValue}; <br>  PROCEDURE pop_param(cgtype: CGType; typeid: TypeUID; mode: ParameterMode; bitSize: BitSize);  <br> <br> <br> or, again, if declare_indirect/declare_pointer is used to twiddle the typeuid, this suffices,<br> I like it:<br><br>  (* bitSize, cgtype and typeid all imply size and agree, are somewhat redundant<br>     Backends without type checking can ignore typeid. e.g. NTx86.m3.<br>     bitSize is definitely redundant, but helps typeid-ignoring backends<br>       that implement struct-by-value themselves e.g. NTx86.m3 easily adapt. <br>     cgtype will be CGType.Addr for READONLY/VAR/ADDRESS/OBJECT/REF/TEXT, bitSize = sizeof(pointer)<br>     cgtype will be CGType.Struct for struct-by-value (size in bitSize) <br>     typeid will be declare_indirect/declare_pointer for READONLY/VAR (READONLY OBJECT?) <br>     typeid will NOT be to a declare_indirect/declare_pointer for struct-by-value *) <br>  PROCEDURE pop_param(cgtype: CGType; typeid: TypeUID; bitSize: BitSize);  <br> <br><br> Ideally all backends would track typeids and it'd suffice to say: <br>  (* typeid will be declare_indirect/declare_pointer for READONLY/VAR (READONLY OBJECT?) <br>     typeid will NOT be to a declare_indirect/declare_pointer for struct-by-value *) <br>  PROCEDURE pop_param(typeid: TypeUID);  <br><br> <br> but I don't see that happening soon. In reality, CGType could go away entirely. Not soon.<br> (We'd need declare_integer(typeid, size, is_signed, is_word); declare_float(typeid, size),<br> and maybe a few others for some pointer types..REFANY, TEXT, MUTEX, etc.)<br> <br><br> I haven't looked to see if this information (typeid/size for pop_param)<br> is readily available in the frontend.  I will do that soon. 
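<br> <br><br> A minimal C sketch of the output that a typed pop_param is meant to enable. It reuses the I__A/I__B names from the example elsewhere in this thread; the *_untyped names are hypothetical, and INTEGER as a typedef for long is an assumption (the real backend chooses it per target). The untyped pair mirrors the current char* scheme, and the typed pair needs no cast and is valid as both C and C++.<br>
<pre>
```c
typedef long INTEGER; /* assumption: stand-in for the backend's INTEGER typedef */

/* Current scheme: every pointer parameter is typed char*. */
void I__A_untyped(char *a) { *(INTEGER *)a += 1; }

/* Desired scheme: the parameter carries its real type. */
void I__A(INTEGER *a) { *a += 1; }

/* Caller under the current scheme: a cast at every call site. */
INTEGER I__B_untyped(void)
{
    INTEGER b = 41;
    I__A_untyped((char *)&b);
    return b;
}

/* Caller with a typed pop_param: no cast is needed, and a redundant
   (INTEGER *) cast would be harmless. */
INTEGER I__B(void)
{
    INTEGER b = 41;
    I__A(&b);
    return b;
}
```
</pre>
 Both callers compute the same value; the difference is only in what the debugger and the C++ type checker get to see. <br>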
<br> <br><br> I have a few potential workarounds: <br> cast to void*<br>  This appears to be working, with limited testing (my new test case, p254)<br>  Big drawback here is the code is no longer valid C++, only C.<br>  This is ok temporarily if I'm making improvements otherwise, but I really want to output valid C++.<br>  The function is still prototyped as taking stronger types, like INTEGER* or T1234* and C++<br>   doesn't allow conversion from void* to other pointer types without a cast.<br> <br><br>  Introspect on the function pointer type and cast appropriately, or even not at all. <br>    This should provide the ideal output and is probably viable. I'll look into it later. <br> <br><br> Cast the function to (*)(...) for C++ or (*)() for C.<br>    This is kind of gross. Hopefully it is not a deoptimization, but it might be.  <br>    I already do such casting for indirect function calls, for reasons to do with the static link.  <br>    I'm going to try this next.<br> <br> <br>  --- to depend on C passing/returning structs/records by value, or to do the copying ourselves?  --- <br><br> Now a minor dilemma, not a problem.  <br><br> <br>Up until recently, I didn't have much type information flowing through the C backend.<br>Or specifically I was only using CGType and not TypeUID.<br>I'm now at the point where TypeUIDs and "almost everything" about them are kept track of.<br>(just some loose ends maybe around opaque types and object runtime type information.)<br> <br><br> I had known record sizes, but not fields. Maybe this was a dilemma already before. <br> <br><br> Anyway, the point is, up until now, record passing and returning by value I have handled <br> internally by passing around pointers and making copies as needed (at function start). <br> <br><br> I forget exactly how returning works, I'll deal with that later. 
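<br> <br><br> A minimal sketch of the two record-passing schemes under discussion, side by side. Point, sum_manual, sum_native and schemes_agree are hypothetical names for illustration, and the real generated code types the pointer differently; this only shows that the callee-side copy and native C struct-by-value behave the same from the caller's point of view.<br>
<pre>
```c
/* Hypothetical record type standing in for a Modula-3 RECORD. */
typedef struct { long x; long y; } Point;

/* Manual scheme: the caller passes a pointer and the callee copies
   the record into a local before touching it. */
long sum_manual(const Point *arg)
{
    Point local = *arg; /* copy made at function start */
    local.x += 1;       /* changes stay private to the callee */
    return local.x + local.y;
}

/* Native scheme: the C compiler makes the copy. */
long sum_native(Point arg)
{
    arg.x += 1;
    return arg.x + arg.y;
}

/* Returns 1 when both schemes agree and leave the caller's record intact. */
int schemes_agree(void)
{
    Point p = { 1, 2 };
    long a = sum_manual(&p);
    long b = sum_native(p);
    return a == 4 && b == 4 && p.x == 1 && p.y == 2;
}
```
</pre>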
<br> <br><br> Passing works as follows: <br>   caller passes pointer to record  <br>   callee has a local variable of that type  <br>   callee early on copies the pointed-to record into the local variable, and references that thereafter  <br> <br><br> This works and has almost no downside. <br> It is likely how the C compiler implements things anyway. <br> Except maybe for "small" records/structs. Some calling conventions <br> do allow for passing structs/records by value in registers. <br><br> <br> Passing structs/records by value is relatively rare, so we probably don't care much.<br> <br><br> Nevertheless, my question is whether I should go ahead and use the underlying C/C++ feature of<br>  passing structs by value.<br> <br><br> There are multiple choices: <br>   - no, leave it alone <br>   - yes, change it unconditionally <br>   - leave it as a const or var in M3C.m3 <br>   - make it #ifdefed in the output .c <br> <br><br> I think for returning, we have similar choices, but the frontend is willing<br> to do the transform and currently does -- a matter of a boolean in Target.i3.<br> <br><br> This second question has equal quality debugging either way and needs no M3CG/frontend changes<br>  either way. (though, you know, frontend is willing to make more transforms for<br>  record return than record pass, I believe; it probably should be willing to do<br>  more of the work.) <br> <br> <br> Very old compilers don't support passing structs by value?  <br> Or don't do it thread-safely, passing the value through a global? ORCA/C for Apple IIGS I think.. <br> <br><br>  --- #line directives or not? --- <br> <br> <br> Third question that has been bugging me.<br> The C backend can output #line directives.<br> So you step through the Modula-3 source. What people expect.<br> This was working, and probably still does.<br>   I turned it off subject to a constant in the backend. <br> Currently I output "//line" instead of "#line". 
(subject to the constant, and yes, I know // isn't portable C) <br><br> This is great during backend development/debugging: the C compiler gives me C line numbers.<br> If the backend worked perfectly, this would be pointless.<br> I debug stuff *a lot* (beyond Modula-3) and I am sensitive to anything that inhibits debugging in any way.<br> There are bugs everywhere (I have seen them!) and everything needs to be debugged, both with logs and live. <br> <br> <br> What to do to cater to both/everyone? <br> I wish I could have multiple #line directives: <br>  #line 123 foo.m3.c 456 foo.m3 <br> <br><br> but that doesn't exist. <br> I could encode information in the file name:<br>  #line 123 "foo.m3.c/456 foo.m3" <br><br> <br> but that is imperfect; error messages will be good, but debugging won't work.<br> <br><br> I could leave it as an #ifdef in the code. <br> I do not believe the following works: <br><br>  #ifdef CLINE  <br>  #define LINE(cline, cfile, m3line, m3file) cline cfile <br>  #else <br>  #define LINE(cline, cfile, m3line, m3file) m3line m3file <br>  #endif <br>  #line LINE(123, "foo.c", 456, "foo.m3") <br> <br><br> but I'll try it. <br> <br><br> I think the "best" ends up being to sprinkle in a steady stream of #ifdefs: <br>  #ifndef CLINE  <br>  #line 456 "foo.m3" (* might need to adjust by 1 to account for #endif *)   <br>  #endif  <br><br>  This is bloated, but might be best.  <br> <br><br>  if "#define LINE" works, great, but I doubt it will. <br> <br><br> --- typeindex besides typeid? --- <br> <br><br>I'm now doing a lot of lookups of typeids.<br>It'd be super nice if the frontend also maintained "small" incrementing<br>typeIndices that I could use to index into an array.<br> <br><br> set_type_count(typeCount:CARDINAL); (* maybe *) <br> declare_object/pointer/indirect/record/etc.(typeId: TypeUID; typeIndex: CARDINAL; ...); <br> <br><br> and thereafter, use typeIndex instead of typeId, an index into an array. 
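<br> <br><br> A minimal sketch of the lookup cost being traded here, assuming a hypothetical TypeInfo record per type. With sparse TypeUIDs the backend needs something like a binary search (or a hash table); with a dense typeIndex it is one bounds check and one array access. The names and the sample uid values are made up for illustration.<br>
<pre>
```c
/* Hypothetical per-type record kept by a backend. */
typedef struct { int uid; int bit_size; } TypeInfo;

/* Sparse TypeUID: binary search over a table sorted by uid.
   Returns 0 (null) when the uid is unknown. */
const TypeInfo *find_by_uid(const TypeInfo *sorted, int count, int uid)
{
    int lo = 0, hi = count; /* half-open range [lo, hi) */
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (sorted[mid].uid == uid) return &sorted[mid];
        if (sorted[mid].uid < uid) lo = mid + 1; else hi = mid;
    }
    return 0;
}

/* Dense typeIndex: one bounds check and one array access. */
const TypeInfo *find_by_index(const TypeInfo *table, int count, int index)
{
    return (index >= 0 && index < count) ? &table[index] : 0;
}

/* Sample table; the uid values are made up, sorted ascending. */
static const TypeInfo sample_types[3] = {
    { -1539284120, 64 }, { 184279635, 32 }, { 1029384756, 8 }
};
```
</pre>
 Something like set_type_count would let the backend size such a table once, up front, instead of growing it as declarations arrive. <br>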
<br> <br><br> I've been tempted to ask for just: <br> declare_object/pointer/indirect/record/etc.(typeIndex: CARDINAL; ...); <br><br> <br> but I realize that preserving the structural hash id is likely too useful/important,<br> either now or hypothetically.<br><br> <br>There is then the question as to whether begin_unit/end_unit reset typeIndex.<br>This somewhat depends on how the frontend works.<br><br> <br>At some point I'd like to try outputting one C file across multiple units,<br>and add M3CG.begin_library, M3CG.end_library, M3CG.begin_program, M3CG.end_program,<br>M3CG.import_library(static | dynamic | unknown),<br>so the backend knows which units definitely link together,<br>and can guide ELF visibility/__declspec(dllimport,dllexport).<br> <br><br> Given that, typeIndices would not reset upon end_unit. <br> There are challenges here, e.g. separate/incremental compilation.<br> I would like to amortize C compiler startup; as well, all the type declarations<br> would be shared across units, so the overall C source would be smaller.<br> Computer memory is vastly larger today than when CM3 was written and compilation<br> strategies have shifted significantly toward "whole program compilation".<br> We could do similar in the C backend..or leave it to the C compiler to try.<br> <br><br> - Jay<br><br><br>                                      </div>
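<br> On the #line question raised in this message: a minimal sketch of the macro-selected variant. C99 6.10.4 says the tokens after #line are macro-expanded when they do not already match one of the plain forms, so a conforming preprocessor should accept this. CLINE is the switch named in the message (here, defined means "report C lines"); reported_line and reported_file are hypothetical helpers added only to observe the result.<br>
<pre>
```c
/* Select which coordinates the compiler and debugger see. */
#ifdef CLINE
#define LINE(cline, cfile, m3line, m3file) cline cfile
#else
#define LINE(cline, cfile, m3line, m3file) m3line m3file
#endif

/* The line directly after the directive takes the chosen line number. */
#line LINE(123, "foo.c", 456, "foo.m3")
int reported_line(void) { return __LINE__; }
const char *reported_file(void) { return __FILE__; }
```
</pre>
 Built without CLINE, the function after the directive is attributed to line 456 of foo.m3; built with -DCLINE it is attributed to line 123 of foo.c. Whether every target C compiler handles this correctly would still need testing. <br>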

</div></blockquote></div>                                       </div></div>                                        </div></body>

</html>