<html>
<head>
<style><!--
.hmmessage P
{
margin:0px;
padding:0px
}
body.hmmessage
{
font-size: 12pt;
font-family:Calibri
}
--></style></head>
<body class='hmmessage'><div dir='ltr'><div>Agreed -- I forgot to acknowledge/repeat that this is related</div><div>to the first instruction in a function, not arbitrary instructions.</div><div><br></div><div><br></div><div>On the matter of 4 or 8 bytes though -- the thing is -- many but not all</div><div>64bit architectures use the exact same instruction set as their 32bit counterparts,</div><div>just a different selection of instructions.</div><div><br></div><div><br></div><div>Same size, same encodings, same alignment.</div><div><br></div><div><br></div><div>I believe this is true for alpha, mips, ppc, sparc, hppa.</div><div><br></div><div>It is almost true for x86, though that kinda doesn't count,</div><div>since there is no fixed size or alignment requirement.</div><div><br></div><div><br></div><div>Fyi: x86/amd64 instructions are between 1 and 15 bytes in size,</div><div>inclusive, and all sizes in between. All the "lower" sizes</div><div>including 1 are common. Upper sizes are rare, maybe starting around 7 bytes.</div><div>The encoding scheme allows for longer instructions, but</div><div>there is a deliberate documented limit.</div><div><br></div><div><br></div><div>Not sure about arm -- arm64 at least gets rid of thumb.</div><div>So arm32 instruction pointers are either 4-aligned or at odd addresses,</div><div>and arm64 is like all the rest -- fixed size 4 byte aligned.</div><div><br></div><div><br></div><div>So 4 byte -1 e.g. 
on ppc32 has the same meaning as 8 byte -1 on ppc64,</div><div>just that it is two in a row.</div><div><br></div><div><br></div><div>It looks like we have agreement -- we can set this to a constant.</div><div><br></div><div><br></div><div>The marker value is -1; I checked.</div><div><br></div><div><br></div><div>I still find the function If_closure a little unclear/subtle.</div><div>I read through my comments in Target.m3 again.</div><div><br></div><div><br></div><div>Aligned_procedures is false if:</div><div> - the system ever has alignment faults</div><div> - AND functions/instructions might be less aligned than closures </div><div><br></div><div><br></div><div>Some of the comments I added in the code are incorrect.</div><div><br></div><div><br></div><div>The comments say that a misaligned function pointer is read</div><div>one byte at a time. I believe the actual behavior is that</div><div>a misaligned function pointer is deemed not a closure,</div><div>and reading the marker is skipped.</div><div><br></div><div><br></div><div>The code is all still there, but branched around.</div><div>When Aligned_procedures == true, less code is generated:</div><div>no alignment check is generated. </div><div><br></div><div><br></div><div>Most 32bit platforms set Aligned_procedures = true</div><div>and most 64bit platforms set it to false.</div><div><br></div><div><br></div><div> The exceptions:</div><div> amd64 is also true. </div><div> arm32 is false, due to Thumb mode. </div><div><br></div><div><br></div><div>This is because most platforms, 32bit and 64bit, have fixed size</div><div>4 byte aligned instructions. So 32bit platforms have</div><div>instructions (and functions)</div><div>aligned the same as closures. 
64bit platforms generally</div><div>have closures more aligned than instructions.</div><div><br></div><div><br></div><div>I think if we ignored arm32, and penalized only amd64,</div><div>which, granted, is the most common platform,</div><div>we could set Aligned_procedures = false for all 64bit,</div><div>true for all 32bit. And having another variable</div><div>that coincided with word size is ok.</div><div><br></div><div><br></div><div>But arm32's thumb instructions mean a function pointer</div><div>might be unaligned.</div><div><br></div><div><br></div><div>I'll make Aligned_procedures always false.</div><div><br></div><div><br></div><div> > Method body procedures are required to be top-level</div><div> </div><div><br></div><div> Kind of a language limitation compared to "more dynamic" languages.</div><div> Definitely not trivial to do otherwise -- heap allocated garbage</div><div> collected stack frames and such.</div><div><br></div><div><br></div><div>Thank you,</div><div> - Jay</div><div><br></div><br><br><br><br><div>> Date: Sun, 30 Aug 2015 13:24:55 -0500<br>> From: rodney_bates@lcwb.coop<br>> To: m3devel@elegosoft.com<br>> Subject: Re: [M3devel] Target.Aligned_procedures and closure markers?<br>> <br>> <br>> <br>> On 08/30/2015 02:45 AM, Jay K wrote:<br>> > The agenda remains seeing if Target variables can be made constants.<br>> ><br>> > The discussion in this case is more complicated and some facts are unclear.<br>> ><br>> ><br>> > Background:<br>> ><br>> ><br>> > Nested functions are a problem.<br>> > In particular, if you can take their address.<br>> > Taking the address of a nested function presents a problem.<br>> > You are presented with two or three solutions.<br>> ><br>> ><br>> > - runtime code gen<br>> > - either on the stack<br>> ><br>> > - or somewhere more expensive, possibly with garbage collection<br>> ><br>> > - possibly "templated" instead of "arbitrary"; the meaning<br>> > of this is a lot to explain -- related to libffi and mremap, which<br>> > isn't 
entirely portable, but is e.g. portable to Windows<br>> ><br>> ><br>> > - or instead of runtime codegen, altering how function pointers are<br>> > called; you can only do this from Modula-3 code, not e.g. C or C++.<br>> ><br>> ><br>> > The solution Modula-3 has taken is to alter how function pointers are called.<br>> > The sequence is roughly:<br>> > Check if it is a "regular" function pointer or a "closure".<br>> > If it is a "regular" function pointer, just call it.<br>> > If it is a "closure", it contains a function pointer and a "static link".<br>> > Call the function pointer, passing the static link.<br>> ><br>> ><br>> > To tell if it is a "regular" function pointer or a "closure", roughly<br>> > what is done is the data at the function pointer is read and compared<br>> > against a marker value. If it equals the marker value, it is a closure.<br>> ><br>> ><br>> > The size of the marker is the size of an integer or pointer (4 or 8 bytes).<br>> > The value of the marker checked for is 0 or -1, I'd have to check.<br>> > The alignment of the pointer might be a factor. In particular, on most<br>> > architectures, all instructions have a certain alignment. If the pointer has<br>> > less alignment, it can't be an instruction. 
Or maybe on those architectures,<br>> > the bytes are read one at a time to avoid alignment faults.<br>> ><br>> ><br>> > In particular, as far as I know, the following:<br>> > x86/amd64: no alignment of instructions, but functions maybe, but Modula-3 assumes functions aren't aligned<br>> ><br>> > ppc32/ppc64/alpha32/alpha64/mips32/mips64/sparc32/sparc64/arm64/hppa32/hppa64 -- instructions are all<br>> > 4 bytes and 4 byte aligned, so functions are at least also as much<br>> ><br>> > arm32 -- instructions are 2 or 4 bytes; if they are 2 bytes, then the instruction<br>> > pointer is actually odd as well, and the low bit is removed to really find the instructions<br>> > That is -- instruction pointer is either odd or 4-aligned, never 2-aligned.<br>> ><br>> > ia64 -- instructions come in bundles of 3, they are 41 bits each, with a 5 bit "template" in each<br>> > bundle, for a total of 128 bits per bundle, likely always 128-bit-aligned<br>> ><br>> ><br>> > I could use confirmation on much of this.<br>> ><br>> ><br>> > I find the use of a marker value a little dubious. It'd be good to research if there is one<br>> > value that works on all.<br>> ><br>> ><br>> > I find the choice of a marker size to be pointer-sized dubious on most platforms.<br>> > In particular, most 64bit platforms have a 32bit instruction size, so using more than 32 bits<br>> > for the marker value doesn't buy much. If the marker value is actually a legal instruction,<br>> > then checking for two in a row reduces the odds of a false positive.<br>> ><br>> ><br>> > However, given that the closure is a marker and two pointers, it isn't like you are going<br>> > to pack the second and third 64bit field right up against a 32bit field. You'd want padding for alignment.<br>> ><br>> <br>> Right. 
If the marker had smaller alignment than a pointer, say 32-bit marker, 64-bit pointers, then<br>> it would be necessary to start the closure on an odd multiple of 32 bits--a rule that is not part<br>> of anybody's alignment system of any compiler that I am aware of. So then you'd have to finesse it<br>> by giving the closure 64-bit alignment and starting with a pad word, which would fail to gain the<br>> space benefits. Moreover, fewer bits of marker increase the likelihood of its accidentally being a<br>> valid instruction or sequence thereof.<br>> <br>> So I think making the marker the same size as a pointer, and giving the whole closure pointer-sized<br>> alignment is the best way, unless/until we find a machine instruction set that has a known ambiguity<br>> here.<br>> <br>> Also, it is not necessary that there be no valid instruction sequence that starts with 32 or 64 1-bits.<br>> It is enough that no compiler produces it at the beginning of a prologue. Much harder to ascertain for<br>> certain (especially if we want to be able to call procedures produced by other compilers) but much less<br>> likely to result in a problem.<br>> <br>> Just a wild guess, but I would not be surprised if ELF and other object formats would require the<br>> machine code of a function/procedure to begin on a native word boundary, even if the hardware<br>> instruction set does not. Where so, this would obviate checking the alignment before checking<br>> for a closure, though probably target-dependently.<br>> <br>> ><br>> > If we are aiming for all out target-specificity, I'd suggest marker size be a target aspect,<br>> > and set it to 4 bytes for ppc64/mips64/sparc64/alpha64/arm64/hppa64.<br>> ><br>> ><br>> > However, I want less target-variation not more.<br>> ><br>> ><br>> > Here are some my lingering questions:<br>> > - Is the marker value actually invalid code on every platform? 
Does its value need to be target-specific?<br>> > - Is a 64bit marker value actually sufficient on IA64?<br>> > The way to help here, I think, is to ensure that a 64bit marker,<br>> > not a 128bit marker, contains the "template", and an invalid "template".<br>> > - Barring the previous, a solution might be to use a 128 bit marker on all platforms.<br>> ><br>> ><br>> > i believe all of these function pointers are rare.<br>> > I hope/believe the object method calls do not check for closures -- though actually<br>> > that is related to a useful language construct, that I doubt we have.<br>> ><br>> <br>> Method body procedures are required to be top-level, ensured statically, so there is no<br>> need for method call code to consider the possibility of the pointer in the object type<br>> to be a closure.<br>> <br>> ><br>> > The simplest solution is likely:<br>> > - ignore IA64, or research it further<br>> > - keep marker size at integer<br>> <br>> Pointer would be target-independent in getting the following two pointers aligned.<br>> <br>> > - for the C backend, assume no alignment of function pointers -- give up<br>> > any of the optimization, esp. x86/amd64.<br>> <br>> I think this optimization both applies to a low-frequency situation and has a very<br>> small benefit, so I would not worry about giving up on it.<br>> <br>> ><br>> ><br>> > For other than the C backend, maybe dial back marker size to 4 bytes for mips64/sparc64/alpha64/arm64/hppa64.<br>> > While I don't like target-specificity, notice this wouldn't check linux vs. bsd vs. solaris, etc. 
It isn't a cross product thing.<br>> ><br>> ><br>> > Thoughts?<br>> ><br>> ><br>> > - Jay<br>> ><br>> ><br>> ><br>> > _______________________________________________<br>> > M3devel mailing list<br>> > M3devel@elegosoft.com<br>> > https://mail.elegosoft.com/cgi-bin/mailman/listinfo/m3devel<br>> ><br>> <br>> -- <br>> Rodney Bates<br>> rodney.m.bates@acm.org<br></div> </div></body>
</html>