[M3devel] function pointers and comparison to nil? mis-typed function pointers?
Tony Hosking
hosking at cs.purdue.edu
Mon May 26 15:35:17 CEST 2008
Great summary Rodney.
Plus, trampolines typically require executable code on the stack which
is disallowed by some OSs.
On May 26, 2008, at 4:50 AM, Rodney M. Bates wrote:
> I think I can shed some light on this, having spent some time making
> m3gdb handle the various operations on nested procedures. As for code
> that mixes M3 and C, I believe the following are true:
>
> - Within M3 code only, the closure system works correctly in all
> cases. This answers one of Jay's worries.
>
> - For values of M3 procedure/C function pointer that are top-level
> (nonnested) procedures/functions, M3 and C code (generated by gcc,
> at least) are interchangeable. This answers another of Jay's
> worries. This will cover the great majority of cases.
>
> - Standard C has no nested functions. Gcc adds them as a language
> extension. Thus, only in gcc-compiled C code do we need to worry
> about nested procedures/functions between languages. (Do any other
> C compilers exist that also have nested functions?)
>
> - M3 code should be able to call the value of a procedure variable
> that was originally produced by C code as a pointer to a nested
> function, and get the environment set up right, so its nonlocal
> variable references will work, _if_ the nested function's
> environment has not disappeared. This partially answers another
> of Jay's worries. But:
>
> - M3's normal runtime check that precludes assigning a nonlocal
> procedure value will not detect a C-code-produced nonlocal value,
> thus the environment could indeed have disappeared if the programmer
> was not careful. However, gcc-extended C's nested functions have
> no protection against this bug when only C code is involved, so
> perhaps not detecting it in mixed M3/C is to be expected.
>
> - C code that attempts to call a function pointer value that was
> originally produced by M3 code as a nested procedure value will
> almost certainly crash at the time of the call. This is the only
> case that presents a significant problem. M3 code will not be
> able to give a nested procedure as a callback to a C library.
>
> M3's mechanism: A value of procedure type is a pointer to either
> 1) the first byte of the procedure's machine code, if it is top
> level, or
> 2) A closure.
>
> A closure is a block of 3 words. The first word contains -1. Assuming
> a word of all one bits is not valid machine code on any target machine
> (or at least doesn't occur as the first code of any procedure), this
> can be used at runtime to detect that this is indeed a closure and not
> code. The remaining two words hold the code address and the
> environment address.
>
> So, an assignment of a procedure value has to check that it is not a
> closure, and raise a runtime error if it is. A call on a procedure
> value has to check, and if it is a closure, load/copy its environment
> pointer value into wherever the calling convention expects it, then
> branch to the code address. Passing a nested procedure constant as a
> parameter involves constructing a closure for it and passing its
> address.
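
The mechanism described above can be sketched in C. This is only a rough
model under stated assumptions, not the cm3 implementation: the names
(closure, fake_code, is_closure, call_proc, assign_proc) are hypothetical,
and because portable C cannot safely read real machine code through a
function pointer, a top-level procedure is modeled by a small stand-in
block whose first word is simply "something other than -1".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLOSURE_MARKER ((intptr_t)-1)

/* Every procedure body in this sketch takes an environment pointer;
   top-level procedures ignore it. */
typedef long (*code_fn)(void *env, long arg);

/* The 3-word closure: marker word, code address, environment address. */
typedef struct {
    intptr_t marker;   /* always -1 (all one bits) */
    code_fn  code;
    void    *env;
} closure;

/* ASSUMPTION: stand-in for the first bytes of a top-level procedure's
   machine code. The real scheme reads the code itself. */
typedef struct {
    intptr_t first_word;   /* pretend instruction bytes, never -1 */
    code_fn  code;
} fake_code;

/* The runtime test: does the first word pointed to contain -1? */
static int is_closure(const void *proc) {
    return *(const intptr_t *)proc == CLOSURE_MARKER;
}

/* A call checks for a closure and, if so, passes its environment. */
static long call_proc(const void *proc, long arg) {
    if (is_closure(proc)) {
        const closure *c = proc;
        return c->code(c->env, arg);
    }
    const fake_code *f = proc;
    return f->code(NULL, arg);
}

/* An assignment rejects closures; -1 models the runtime error. */
static int assign_proc(const void **dst, const void *src) {
    if (src != NULL && is_closure(src)) return -1;
    *dst = src;
    return 0;
}

/* Sample bodies: one top-level, one "nested" that reads a nonlocal. */
static long top_level_double(void *env, long x) { (void)env; return 2 * x; }
static long nested_add(void *env, long x) { return *(long *)env + x; }
```

Calling through call_proc works for both kinds of value, while
assign_proc accepts only the top-level one, which is the asymmetry the
paragraph above describes.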
>
> gcc-C's mechanism: A value of type pointer to function is a pointer
> to either
> 1) the first byte of the function's machine code, if it is top level,
> (the same as with M3), or
> 2) A trampoline.
>
> A trampoline is a bit of code that loads/copies an environment
> pointer (hard coded into the trampoline) and then branches to the
> function code.
>
> Trampolines probably have a small constant-time speed advantage for
> _calls_, but would be slower for some of the other operations, and the
> runtime check could be tricky. Probably it could be fooled into
> failing when it shouldn't. Moreover, trampolines are highly
> target-machine-dependent. Switching to them would create a really big
> problem for m3gdb, which would have to have different code for each
> target for each of the nested procedure operations.
>
> Jay wrote:
>> I see somewhat.
>> It's stuff around "closure".
>> The comparison of code bytes to -1 comes from If_closure for example.
>> The problem is presumably to come up with a unified representation
>> of pointers to functions that may or may not be nested, while
>> avoiding runtime codegen, even just a little bit, and for Modula-3
>> and C function pointers to use the same representation.
>> I don't think the present solution is really valid, and I am
>> skeptical that there is a solution.
>> One of the requirements has to be dropped.
>> Sniffing code bytes and trying to decide if they are code or not as
>> appears to currently happen is bogus.
>> I think the solution is to remove the requirement that a Modula-3
>> function pointer and a C function pointer are the same.
>> Except, well, that probably doesn't work -- it means you need two
>> types of function pointers.
>> Darn this is a hard problem.
>> The runtime codegen required can be exceedingly simple, fast, and
>> small IF it were allowed to be on the stack. But that's a killer
>> these days.
>> I think you have to give up unification of "closures" and "function
>> pointers".
>> If you take the address of a nested function and call it, you
>> cannot access the locals of the enclosing scopes.
>> So in effect, you end up with "two types of function pointers".
>> Regular stateless ones and "closures" with some captured state.
>> Thoughts?
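
The "two types of function pointers" idea can be sketched in C as the
familiar callback-plus-context idiom. This is a minimal illustration
only; plain_fn, closure_fn, and the sample functions are hypothetical
names, and nothing here is taken from cm3 or gcc.

```c
#include <assert.h>
#include <stddef.h>

/* Kind 1: a regular, stateless function pointer. */
typedef long (*plain_fn)(long arg);

/* Kind 2: a "closure": a code pointer plus explicitly captured state. */
typedef struct {
    long (*code)(void *env, long arg);
    void *env;
} closure_fn;

static long twice(long x) { return 2 * x; }

/* State that a nested procedure would otherwise reach as a nonlocal. */
struct counter { long total; };

static long add_to_total(void *env, long x) {
    struct counter *c = env;
    c->total += x;
    return c->total;
}

/* Each kind needs its own call path; no runtime sniffing is involved. */
static long call_plain(plain_fn f, long arg) { return f(arg); }
static long call_closure(closure_fn f, long arg) { return f.code(f.env, arg); }
```

The cost is the one noted above: the two kinds are different types, so
a C library that accepts only a plain function pointer cannot be handed
the second kind.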
>> I'm kind of stumped. It's a desirable problem to solve, and there
>> is a purported solution in place, but the solution that is there is
>> completely bogus, despite appearing to work for a long time, and
>> there is no solution. That is my understanding. I could be wrong on
>> any number of points but I'm pretty sure.
>> I think you have to separate out function pointers and closures.
>> Sniffing what it pointed to is dubious, esp. as currently implemented.
>> If this is really the way to go, then signature bytes need to be
>> worked out for all architectures that are guaranteed to not look
>> like code.
>> Or vice versa -- signature bytes worked out that all functions
>> start with, which is viable for Modula-3 but not for interop with C.
>> Currently -1 is used, of pointer-size.
>> That appears to be reasonable for x86:
>> 0:000> eb . ff ff ff ff
>> 0:000> u .
>> ntdll32!DbgBreakPoint:
>> 7d61002d ff ???
>> 7d61002e ff ???
>> 7d61002f ff ???
>> 7d610030 ffc3 inc ebx
>> but the instruction encodings or disassembly on other architectures
>> would have to be checked.
>> - Jay
>>
>> ------------------------------------------------------------------------
>> From: jayk123 at hotmail.com
>> To: m3devel at elegosoft.com
>> Date: Sun, 25 May 2008 00:16:01 +0000
>> Subject: [M3devel] function pointers and comparison to nil?
>> mis-typed function pointers?
>> I'm being lazy...
>> Tony can you explain this stuff?
>> Comparison of function pointers..
>> What are the various representations and rules?
>> What does it mean to compare nested functions?
>> What does it mean to compare a function to NIL?
>> I'll poke around more.
>> What I am seeing is that comparison of function pointers to NIL is
>> surprisingly expensive, and probably somewhat buggy. Or at least some
>> of the runtime generated "metadata-ish" stuff is produced or typed
>> incorrectly.
>> In particular, RTLinker.m3:
>> PROCEDURE AddUnit (b: RT0.Binder) =
>> VAR m: RT0.ModulePtr;
>> BEGIN
>> IF (b = NIL) THEN RETURN END; line 119
>> m := b(0); line 120
>> IF (m = NIL) THEN RETURN END; line 121
>> AddUnitI(m); line 122
>> END AddUnit;
>> generates a lot of code, just for the first line:
>> (556) set_source_line source line 119
>> (557) load m3cg_load (M3_DjPxE5_b): offset 0x0, convert 0xb -> 0xb
>> (558) load_nil
>> (559) if_eq
>> (560) load m3cg_load (M3_DjPxE5_b): offset 0x0, convert 0xb -> 0xb
>> (561) load_indirect load address offset 0x0 src_t 0x5 dst_t 0x5
>> (562) load_integer integer n_bytes 0x0 hi 0x0 low 0x1 sign -1
>> (563) if_eq
>> (564) set_label
>> (565) load_nil
>> (566) load m3cg_load (M3_DjPxE5_b): offset 0x0, convert 0xb -> 0xb
>> (567) if_ne
>> (568) set_label
>> (569) exit_proc
>> (570) set_label
>> (571) set_source_line source line 120
>> The details on the load_integer trace might not be completely
>> correct. I will test a fix shortly.
>> Esp. that n_bytes gets decremented to zero before the trace.
>> Ok, I see now why some of the bloat -- because the "THEN RETURN END"
>> is on the same line.
>> If it were written as:
>> IF (b = NIL) THEN
>>   RETURN
>> END;
>> it probably wouldn't look so bad. That took me a while to realize.
>> The following is generated for SPARC64_OPENBSD:
>> line 119
>> .stabn 68,0,119,.LLM61-.LLFBB4
>> .LLM61:
>> ldx [%fp+2175], %g1
>> brz %g1, .LL26
>> nop
>> ldx [%fp+2175], %g1
>> ldx [%g1], %g1 bus error here? yes, probably this one
>> cmp %g1, -1
>> be %xcc, .LL27
>> nop
>> .LL26:
>> ldx [%fp+2175], %g1
>> brz %g1, .LL33
>> nop
>> .LL27:
>> line 120
>> .stabn 68,0,120,.LLM62-.LLFBB4
>> .LLM62:
>> ldx [%fp+2175], %g1
>> stx %g1, [%fp+2007]
>> ldx [%fp+2007], %g1
>> brz %g1, .LL30
>> nop
>> ldx [%fp+2007], %g1
>> ldx [%g1], %g1 or here ?
>> cmp %g1, -1
>> bne %xcc, .LL30
>> nop
>> ldx [%fp+2007], %g1
>> add %g1, 16, %g1
>> ldx [%g1], %g1 or here?
>> stx %g1, [%fp+2015]
>> ldx [%fp+2007], %g1
>> add %g1, 8, %g1
>> ldx [%g1], %g1
>> stx %g1, [%fp+2007]
>> .LL30:
>> ldx [%fp+2007], %g1
>> ldx [%fp+2015], %g5
>> mov 0, %o0
>> call %g1, 0
>> nop
>> mov %o0, %g1
>> stx %g1, [%fp+2023]
>> ldx [%fp+2023], %g1
>> stx %g1, [%fp+1999]
>> line 121
>> .stabn 68,0,121,.LLM63-.LLFBB4
>> .LLM63:
>> ldx [%fp+1999], %g1
>> brz %g1, .LL33
>> nop
>> .LL32:
>> .stabn 68,0,122,.LLM64-.LLFBB4
>> .LLM64:
>> g1 points to RTSignal_I3
>> (gdb) x/i $pc
>> 0x3ff0a8 <RTLinker__AddUnit+28>: ldx [ %g1 ], %g1
>> (gdb) x/i $g1
>> 0x4021f4 <RTParams_I3>: save %sp, -208, %sp
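
One plausible reading of the bus error, offered as an assumption and not
something the thread confirms: ldx is a 64-bit load, and the address in
%g1 shown above (0x4021f4) is a multiple of 4 but not of 8. The
arithmetic can be checked with a small C sketch; is_aligned and
read_word_unaligned are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* True when addr is a multiple of n. */
static int is_aligned(uintptr_t addr, uintptr_t n) {
    return (addr % n) == 0;
}

/* Read a 64-bit word with no alignment assumption: memcpy lets the
   compiler pick loads that are legal at any address. */
static int64_t read_word_unaligned(const void *p) {
    int64_t w;
    memcpy(&w, p, sizeof w);
    return w;
}
```

With the address from the gdb session, is_aligned(0x4021f4, 4) holds
while is_aligned(0x4021f4, 8) does not, which is consistent with code
being 4-byte aligned while the closure test loads a full 8-byte word.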
>> I am willing to accept that a "function pointer" is a pair of
>> pointers, or even three pointers.
>> A pointer to code, a pointer to globals for position independent
>> code, a frame pointer to locals.
>> That equality comparison of function pointers requires comparing two
>> (or three) pointers.
>> (Though the global pointer shouldn't need comparing.)
>> At least for nested functions. Less so for non-nested. ?
>> Much less for comparison to NIL. ?
>> And either way, this code is reading bogus data.
>> There isn't a pointer at the function address, there is code.
>> Something doesn't add up.
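
A rough C sketch of the comparison rules being asked about, under
assumptions: a NIL test needs only the pointer itself, and two closures
are taken to be equal when both code and environment match. The
proc_block layout and function names are hypothetical, and the first
word of a top-level procedure is modeled by a stand-in value instead of
real code bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLOSURE_MARKER ((intptr_t)-1)

/* ASSUMPTION: one block type models both cases. For a closure the first
   word is -1 and code/env are meaningful; for a top-level procedure the
   first word stands in for its leading code bytes. */
typedef struct {
    intptr_t    first_word;
    const void *code;
    const void *env;
} proc_block;

/* Comparison to NIL never needs to look at what is pointed to. */
static int proc_is_nil(const proc_block *p) {
    return p == NULL;
}

/* Equality: identical pointers are equal; otherwise only two closures
   with matching code and environment are equal. */
static int proc_equal(const proc_block *a, const proc_block *b) {
    if (a == b) return 1;
    if (a == NULL || b == NULL) return 0;
    if (a->first_word == CLOSURE_MARKER && b->first_word == CLOSURE_MARKER)
        return a->code == b->code && a->env == b->env;
    return 0;
}
```

In this model the dereference happens only on the path where both
pointers are non-NIL and distinct, so a test against NIL costs a single
pointer comparison.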
>> I'm going to try setting "aligned procedures" but that's quite bogus
>> I think.
>> EqualExpr.m3 says
>> Note: procedures pointers are always aligned!
>> but maybe not?
>> Yeah yeah I'm being lazy. I'll read more code..
>> I also wonder if a "function pointer" can be optimized for the case
>> of not being to a nested function.
>> It looks like calling a function pointer is very inefficient.
>> It looks like..am I reading that correctly?.. that if the pointer
>> points to -1, then it is nested and a pair of pointers, and not
>> otherwise. That -1 is treated specially as the first bytes of a
>> function?
>> Is that a Modula-3-ism or a SPARC-ism?
>> It looks like a Modula-3-ism. And it seems dubious.
>> But I'll have to read more..
>> NT386GNU does the same sort of wrong looking thing:
>> LFBB4:
>> pushl %ebp
>> movl %esp, %ebp
>> subl $24, %esp
>> LBB5:
>> .stabn 68,0,117,LM60-LFBB4
>> LM60:
>> movl $0, -16(%ebp)
>> .stabn 68,0,119,LM61-LFBB4
>> LM61:
>> movl 8(%ebp), %eax
>> testl %eax, %eax
>> je L26
>> movl 8(%ebp), %eax
>> movl (%eax), %eax BAD
>> cmpl $-1, %eax BAD
>> je L27
>> L26:
>> movl 8(%ebp), %eax
>> testl %eax, %eax
>> je L33
>> L27:
>> .stabn 68,0,120,LM62-LFBB4
>> LM62:
>> and NT386:
>> 0:000> u
>> cm3!RTLinker__AddUnit:
>> 00607864 55 push ebp
>> 00607865 8bec mov ebp,esp
>> 00607867 81ec0c000000 sub esp,0Ch
>> 0060786d 53 push ebx
>> 0060786e 56 push esi
>> 0060786f 57 push edi
>> 00607870 c745fc00000000 mov dword ptr [ebp-4],0
>> 00607877 837d0800 cmp dword ptr [ebp+8],0
>> 0:000> u
>> cm3!RTLinker__AddUnit+0x17:
>> 0060787b 0f840f000000 je cm3!RTLinker__AddUnit+0x2c (00607890)
>> 00607881 8b7508 mov esi,dword ptr [ebp+8]
>> 00607884 8b5e00 mov ebx,dword ptr [esi] BAD
>> 00607887 83fbff cmp ebx,0FFFFFFFFh BAD
>> 0060788a 0f840f000000 je cm3!RTLinker__AddUnit+0x3b (0060789f)
>> 00607890 837d0800 cmp dword ptr [ebp+8],0
>> 00607894 0f8505000000 jne cm3!RTLinker__AddUnit+0x3b (0060789f)
>> 0060789a e969000000 jmp cm3!RTLinker__AddUnit+0xa4 (00607908)
>> cm3!RTLinker__AddUnit+0x20:
>> 00607884 8b5e00 mov ebx,dword ptr [esi] ds:002b:0062c950=81ec8b55
>> 0:000> u @esi
>> cm3!RTLinker_I3:
>> 0062c950 55 push ebp
>> 0062c951 8bec mov ebp,esp
>> 0062c953 81ec00000000 sub esp,0
>> 0062c959 53 push ebx
>> 0062c95a 56 push esi
>> 0062c95b 57 push edi
>> 0062c95c 837d0800 cmp dword ptr [ebp+8],0
>> 0062c960 0f8400000000 je cm3!RTLinker_I3+0x16 (0062c966)
>> This is just wrong.
>> Comparing bytes of code to -1.
>> I think the likely fix is for the "I3" code to be laid out as a
>> "constant function pointer", a pointer to a pair of pointers where
>> one points to the code and one is to -1. Something like that. That
>> can't be quite correct given that the existing data is callable.
>> - Jay
>
> --
> -------------------------------------------------------------
> Rodney M. Bates, retired assistant professor
> Dept. of Computer Science, Wichita State University
> Wichita, KS 67260-0083
> 316-978-3922
> rodney.bates at wichita.edu