<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 12pt;
font-family:Calibri
}
--></style></head>
<body class='hmmessage'><div dir='ltr'> The C/C++ backend has worked for a while now and is improving nicely (wrt debuggability). <BR> <BR><br> Here are some current problems/dilemmas. <BR> <BR><br> --- getting pointer parameters correctly typed esp. on passing side --- <BR> <BR> <br> Up until now, any pointer parameter to a function has been typed as char*. <BR> <BR><br> (I have some preference for char* over void* because 1) it is valid pre-ANSI 2) you can do<br> math on it; however given that char* is usually wrong, void* actually debugs better,<br> showing nothing instead of garbage. gcc does allow math on void* but it<br> is an extension -- <a href="http://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html#Pointer-Arith">http://gcc.gnu.org/onlinedocs/gcc/Pointer-Arith.html#Pointer-Arith</a>;<br> I should probably use void* with gcc and wherever the extension is supported...autoconf...)<BR> Anyway..<br> Every pointer passed is cast to char*. This applies to VAR, READONLY, and records-by-value (more later). <br> This works but is bad for debugging (again: void* is better than char*, but both are bad). <br> I've been working on this. <BR> <BR><br> The interface to the backend is <br> pop_param(cgtype) <br> pop_struct(typeid, size) <BR> <BR><br> non-struct readonly and var parameters are just: <br> pop_param(cgtype.addr)<BR> <BR><br> It would be nice if the frontend also passed a typeid here, <br> and either used the typeid from a "declare_indirect" or "declare_pointer" <br> or had separate booleans/flags for readonly/var. <BR> <BR><br> How about: <br> TYPE ParameterMode = {Value (* or Normal or None? *), Var, ReadOnly}; <br> PROCEDURE pop_param(cgtype: CGType; typeid: TypeUID; mode: ParameterMode); <BR> <BR> <BR> Might as well merge this with pop_struct; probably no further change is required.<br><BR> Or, at worst, another mode: <BR> TYPE ParameterMode = {Value (* or Normal or None? 
*), Var, ReadOnly, StructByValue}; <br> PROCEDURE pop_param(cgtype: CGType; typeid: TypeUID; mode: ParameterMode; bitSize: BitSize); <BR> <BR> <BR> or less abstract:<BR> TYPE ParameterMode = {Value, Pointer, StructByValue}; <br> PROCEDURE pop_param(cgtype: CGType; typeid: TypeUID; mode: ParameterMode; bitSize: BitSize); <BR> <BR> <BR> or, again, if declare_indirect/declare_pointer is used to twiddle the typeuid, this suffices,<br> I like it:<BR><br> (* bitSize, cgtype and typeid all imply size and agree, are somewhat redundant<br> Backends without type checking can ignore typeid. e.g. NTx86.m3.<br> bitSize is definitely redundant, but helps typeid-ignoring backends<br> that implement struct-by-value themselves e.g. NTx86.m3 easily adapt. <br> cgtype will be CGType.Addr for READONLY/VAR/ADDRESS/OBJECT/REF/TEXT, bitSize = sizeof(pointer)<br> cgtype will be CGType.Struct for struct-by-value (size in bitSize) <br> typeid will be declare_indirect/declare_pointer for READONLY/VAR (READONLY OBJECT?) <br> typeid will NOT be to a declare_indirect/declare_pointer for struct-by-value *) <br> PROCEDURE pop_param(cgtype: CGType; typeid: TypeUID; bitSize: BitSize); <BR> <BR><br> Ideally all backends would track typeids and it'd suffice to say: <br> (* typeid will be declare_indirect/declare_pointer for READONLY/VAR (READONLY OBJECT?) <br> typeid will NOT be to a declare_indirect/declare_pointer for struct-by-value *) <br> PROCEDURE pop_param(typeid: TypeUID); <BR><br> <BR> but I don't see that happening soon. In reality, CGType could go away entirely. Not soon.<br> (We'd need declare_integer(typeid, size, is_signed, is_word); declare_float(typeid, size),<br> and maybe a few others for some pointer types..REFANY, TEXT, MUTEX, etc.)<BR> <BR><br> I haven't looked to see if this information (typeid/size for pop_param)<br> is readily available in the frontend. I will do that soon. 
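<BR> <BR><br> To make the goal concrete, here is a minimal sketch of what typed pointer parameters would change in the generated C. It is an illustration only, not actual M3C output: the INTEGER typedef and every function name below are hypothetical stand-ins.

```c
#include <assert.h>

typedef long INTEGER;  /* hypothetical stand-in for the backend's INTEGER typedef */

/* Old style: every pointer parameter is typed char*,
   so a debugger shows bytes instead of the real value. */
static void Increment_untyped(char *n)
{
    *(INTEGER *)n += 1;
}

/* Desired style: the parameter carries its real type, which is what
   a typeid passed along with pop_param would make possible. */
static void Increment_typed(INTEGER *n)
{
    *n += 1;
}

INTEGER call_untyped(INTEGER i)
{
    Increment_untyped((char *)&i);  /* caller has to cast to char* */
    return i;
}

INTEGER call_typed(INTEGER i)
{
    Increment_typed(&i);  /* no cast needed; valid as both C and C++ */
    return i;
}

/* Exercises both call styles; they must compute the same result. */
void demo_pointer_params(void)
{
    assert(call_untyped(41) == 42);
    assert(call_typed(41) == 42);
}
```

Both forms behave identically at run time; the difference is only in what the C compiler and the debugger know about the parameter.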
<BR> <BR><br> I have a few potential workarounds: <BR> Cast to void*.<br> This appears to be working, with limited testing (my new test case, p254).<br> The big drawback here is that the code is no longer valid C++, only C.<br> This is ok temporarily if I'm making improvements otherwise, but I really want to output valid C++.<br> The function is still prototyped as taking stronger types, like INTEGER* or T1234*, and C++<br> doesn't allow conversion from void* to other pointer types without a cast.<BR> <BR><br> Introspect on the function pointer type and cast appropriately, or even not at all. <br> This should provide the ideal output and is probably viable. I'll look into it later. <BR> <BR><br> Cast the function to (*)(...) for C++ or (*)() for C.<br> This is kind of gross. Hopefully it is not a deoptimization, but it might be. <br> I already do such casting for indirect function calls, for reasons to do with the static link. <br> I'm going to try this next.<BR> <BR> <BR> --- to depend on C passing/returning structs/records by value, or to do the copying ourselves? --- <BR><br> Now a minor dilemma, not a problem. <BR><br> <BR>Up until recently, I didn't have much type information flowing through the C backend.<br>Or specifically I was only using CGType and not TypeUID.<br>I'm now at the point where TypeUIDs and "almost everything" about them are kept track of<br>(just some loose ends maybe around opaque types and object runtime type information).<BR> <BR><br> I had known record sizes, but not fields. Maybe this was a dilemma already before. <BR> <BR><br> Anyway, the point is, up until now, I have handled record passing and returning by value <br> internally by passing around pointers and making copies as needed (at function start). <BR> <BR><br> I forget exactly how returning works; I'll deal with that later. 
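<BR> <BR><br> As a rough sketch of the two options for passing records, assuming a hypothetical two-field record (the Point type and all function names here are made up for illustration and are not actual backend output):

```c
#include <assert.h>

typedef struct { long x; long y; } Point;  /* hypothetical record type */

/* Pointer-plus-copy scheme: the caller passes a pointer and the callee
   copies the record into a local at function start. */
long sum_after_bump_by_pointer(const Point *p_arg)
{
    Point p = *p_arg;  /* private copy; the caller's record is never modified */
    p.x += 1;
    return p.x + p.y;
}

/* Alternative: rely on the C/C++ feature and let the compiler do the copy. */
long sum_after_bump_by_value(Point p)
{
    p.x += 1;
    return p.x + p.y;
}

/* Both variants give by-value semantics: same result, caller's record intact. */
void demo_record_passing(void)
{
    Point a = { 1, 2 };
    assert(sum_after_bump_by_pointer(&a) == 4);
    assert(sum_after_bump_by_value(a) == 4);
    assert(a.x == 1 && a.y == 2);
}
```

The observable behavior is the same either way; what differs is who does the copy and whether a small record may travel in registers under the platform's calling convention.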
<BR> <BR><br> Passing works as follows: <br> caller passes pointer to record <br> callee has a local variable of that type <br> callee early on copies the record from the pointer into the local variable, and references that thereafter <BR> <BR><br> This works and has almost no downside. <br> It is likely how the C compiler implements things anyway, <br> except maybe for "small" records/structs. Some calling conventions <br> do allow for passing structs/records by value in registers. <BR><br> <BR> Passing structs/records by value is relatively rare, so we probably don't care much.<BR> <BR><br> Nevertheless, my question is whether I should go ahead and use the underlying C/C++ feature of<br> passing structs by value.<BR> <BR><br> There are multiple choices: <br> - no, leave it alone <br> - yes, change it unconditionally <br> - leave it as a const or var in M3C.m3 <br> - make it #ifdefed in the output .c <BR> <BR><br> I think for returning, we have similar choices, but the frontend is willing<br> to do the transform and currently does -- a matter of a boolean in Target.i3.<BR> <BR><br> This second question has equal quality debugging either way and needs no M3CG/frontend changes<br> either way. (Though, you know, the frontend is willing to make more transforms for<br> record return than record pass, I believe; it probably should be willing to do<br> more of the work.) <BR> <BR> <BR> Very old compilers don't support passing structs by value? <BR> Or don't do it thread-safely, passing the value through a global? ORCA/C for Apple IIGS I think.. <BR> <BR><br> --- #line directives or not? --- <BR> <BR> <BR> Third question that has been bugging me.<br> The C backend can output #line directives.<br> So you step through the Modula-3 source, which is what people expect.<br> This was working, and probably still does.<br> I turned it off subject to a constant in the backend. <br> Currently I output "//line" instead of "#line". 
(subject to the constant, and yes, I know // isn't portable C) <br><BR> This is great during backend development/debugging: the C compiler gives me C line numbers.<br> If the backend worked perfectly, this would be pointless.<br> I debug stuff *a lot* (beyond Modula-3) and I am sensitive to anything that inhibits debugging in any way.<br> There are bugs everywhere (I have seen them!) and everything needs to be debugged, both with logs and live. <br> <br> <br> What to do to cater to both/everyone? <br> I wish I could have multiple #line directives: <br> #line 123 foo.m3.c 456 foo.m3 <BR> <BR><br> but that doesn't exist. <BR> I could encode information in the file name:<br> #line 123 "foo.m3.c/456 foo.m3" <BR><br> <BR> but that is imperfect; error messages will be good, but debugging won't work.<BR> <BR><br> I could leave it as an #ifdef in the code. <br> I do not believe the following works: <br><BR> #ifdef CLINE <br> #define LINE(cline, cfile, m3line, m3file) cline cfile <br> #else <br> #define LINE(cline, cfile, m3line, m3file) m3line m3file <br> #endif <BR> #line LINE(123, "foo.c", 456, "foo.m3") <BR> <BR><br> but I'll try it. <BR> <BR><br> I think the "best" ends up being to sprinkle in a steady stream of #ifdefs: <BR> #ifndef CLINE <br> #line 456 "foo.m3" (* might need to adjust by 1 to account for #endif *) <br> #endif <BR><br> This is bloated, but might be best. <BR> <BR><br> If "#define LINE" works, great, but I doubt it will. <BR> <BR><br> --- typeindex besides typeid? --- <BR> <BR><br>I'm now doing a lot of lookups of typeids.<br>It'd be super nice if the frontend also maintained "small" incrementing<br>typeIndices that I could use to index into an array.<BR> <BR><br> set_type_count(typeCount:CARDINAL); (* maybe *) <br> declare_object/pointer/indirect/record/etc.(typeId: TypeUID; typeIndex: CARDINAL; ...); <BR> <BR><br> and thereafter, use typeIndex instead of typeId, an index into an array. 
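<BR> <BR><br> On the backend side, the typeIndex idea could be sketched roughly as follows. This assumes the proposed set_type_count call arrives before any declaration; the TypeInfo layout and the C function names are hypothetical, chosen only to mirror the proposal.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int32_t  type_id;   /* structural hash id, kept alongside the index */
    uint32_t bit_size;  /* size of the type in bits */
    int      declared;  /* nonzero once a declare call has filled this slot */
} TypeInfo;

static TypeInfo *type_table = NULL;
static size_t type_count = 0;

/* Sketch of set_type_count: allocate one zeroed slot per type index.
   Returns nonzero on success. */
int set_type_count(size_t count)
{
    free(type_table);
    type_table = calloc(count ? count : 1, sizeof *type_table);
    type_count = type_table ? count : 0;
    return type_table != NULL;
}

/* Sketch of a declare_* call that receives both typeId and typeIndex. */
void declare_type(int32_t type_id, size_t type_index, uint32_t bit_size)
{
    assert(type_index < type_count);
    type_table[type_index].type_id = type_id;
    type_table[type_index].bit_size = bit_size;
    type_table[type_index].declared = 1;
}

/* Lookup is a bounds check plus an array access, with no hashing of typeids.
   Returns NULL for an out-of-range or not-yet-declared index. */
const TypeInfo *lookup_type(size_t type_index)
{
    if (type_index >= type_count || !type_table[type_index].declared)
        return NULL;
    return &type_table[type_index];
}
```

The typeid stays in each slot, so the structural hash is still available wherever it is needed.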
<BR> <BR><br> I've been tempted to ask for just: <BR> declare_object/pointer/indirect/record/etc.(typeIndex: CARDINAL; ...); <BR><br> <BR> but I realize that preserving the structural hash id is likely too useful/important,<br> either now or hypothetically.<BR><br> <BR>There is then the question of whether begin_unit/end_unit reset typeIndex.<br>This somewhat depends on how the frontend works.<BR><br> <BR>At some point I'd like to try outputting one C file across multiple units,<br>and add M3CG.begin_library, M3CG.end_library, M3CG.begin_program, M3CG.end_program,<br>M3CG.import_library(static | dynamic | unknown),<br>so the backend knows which units definitely link together,<br>and can guide ELF visibility/__declspec(dllimport,dllexport).<BR> <BR><br> Given that, typeIndices would not reset upon end_unit. <br> There are challenges here, e.g. separate/incremental compilation.<br> I would like to amortize C compiler startup; as well, all the type declarations<br> would be shared across units, so the overall C source would be smaller.<br> Computer memory is vastly larger today than when CM3 was written and compilation<br> strategies have shifted significantly toward "whole program compilation".<br> We could do similar in the C backend..or leave it to the C compiler to try.<BR> <BR><br> - Jay<br><br><BR> </div></body>
</html>