[M3devel] function pointers and comparison to nil? mis-typed function pointers?

Tue May 27 17:45:51 CEST 2008

On Sun, May 25, 2008 at 10:50:15PM -0500, Rodney M. Bates wrote:
> I think I can shed some light on this, having spent some time making
> m3gdb handle the various operations on nested procedures.  As for code
> that mixes M3 and C, I believe the following are true:
> 
> - Within M3 code only, the closure system works correctly in all cases.
>   This answers one of Jay's worries.

As long as procedure-entry code can be guaranteed never to start with a 
word of one-bits.  We do have some influence on this code, though, and 
if necessary might be able to choose a different bit pattern on a 
specific platform.

> 
> - For values of M3 procedure/C function pointer that are top-level
>   (nonnested) procedures/functions, M3 and C code (generated by gcc,
>   at least) are interchangeable.  This answers another of Jay's worries.
>   This will cover the great majority of cases.

Yes.  And in many cases, we will know statically that this is the case.

> 
> - Standard C has no nested functions.  Gcc adds them as a language
>   extension.  Thus, only in gcc-compiled C code do we need to worry
>   about nested procedures/functions between languages.  (Do any other
>   C compilers exist that also have nested functions?)

Standard C has no nested functions, and does not need closures.  As a 
result, Standard C can use simple pointers to represent function 
addresses.  The language extension retrofits closures using run-time 
code generation.  The way this is done (on the stack) is nonportable, 
and we'd like to avoid that nonportability.

The problem seems to be just that you want to use the address of a 
function as a function pointer, and that just doesn't have enough space 
in it to represent a closure.

But why do it that way?  Why not just let *all* Modula 3 functions be 
represented by closures?  Then you never have to test whether something 
is a closure, it just always is.  Top-level functions with no 
environment just use a null pointer as environment -- and never use it.
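
A minimal sketch in C of what such a uniform representation could look 
like.  The names (Closure, call_closure, Frame and so on) are 
hypothetical, and the struct stands in for whatever the Modula 3 
compiler would actually emit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical uniform procedure value: code pointer plus environment. */
typedef struct {
    int (*code)(void *env, int arg);  /* environment is always passed first */
    void *env;                        /* NULL for top-level procedures */
} Closure;

/* A top-level procedure: accepts the environment but never reads it. */
static int twice(void *env, int x) {
    (void)env;
    return 2 * x;
}

/* Stands in for a nested procedure: reads a variable of its enclosing frame. */
typedef struct { int offset; } Frame;

static int add_offset(void *env, int x) {
    const Frame *frame = env;
    return x + frame->offset;
}

/* One calling sequence for both kinds; no "is this a closure?" test. */
static int call_closure(Closure c, int arg) {
    assert(c.code != NULL);
    return c.code(c.env, arg);
}
```

The cost is one extra word per procedure value and one extra argument 
per indirect call; direct calls to known procedures need none of this.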

The only arguments for not doing this would seem to be 
(a) the waste of space, making functions a little larger than necessary, 
and (b) C compatibility.

Now (a) is probably a nonissue, since the vast majority of functions 
never have their addresses taken, are never passed as parameters to 
other procedures, and so forth.  So for the vast majority of functions, 
you simply never have to build the closure.

(b) might be a problem. The obvious trick is just to forbid passing to C 
a non-top-level function.  Since even the C programmers who devise 
interfaces with callbacks realise that just a function pointer isn't 
enough, they usually provide a mechanism for passing a void pointer to 
additional information the callback might need.  Nothing here puts 
Modula 3 at a disadvantage relative to C.  You can just use a top-level 
function and let the programmer provide it with whatever it needs.
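
A minimal sketch in C of that callback convention.  for_each, Visitor 
and SumState are hypothetical names standing in for a C library 
interface and its client; no particular library is meant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C library interface: a callback plus an opaque context pointer. */
typedef void (*Visitor)(int value, void *user_data);

static void for_each(const int *items, size_t count,
                     Visitor visit, void *user_data) {
    assert(visit != NULL);
    for (size_t i = 0; i < count; i++) {
        visit(items[i], user_data);  /* context is handed back untouched */
    }
}

/* Top-level callback; the state a nested procedure would have captured
   travels through user_data instead. */
typedef struct { long total; } SumState;

static void add_to_sum(int value, void *user_data) {
    SumState *state = user_data;
    state->total += value;
}
```

A caller would pass add_to_sum together with the address of a SumState, 
which is exactly what a top-level Modula 3 procedure can do as well.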

But if it is deemed essential to provide actual single-pointer callback 
addresses, this can be done by using a built-in function to convert a 
procedure to a single-pointer callback.  This function will have to be 
rewritten for each platform, and can allocate the necessary 
dynamically-generated code on the heap (or, of course, on the stack, if 
possible on that platform).  As for portability, it's certainly no 
big deal to do this;  compared with writing a platform-dependent code 
generator (itself a requirement), this is not huge.

> - M3's normal runtime check that precludes assigning a nonlocal
>   procedure value will not detect a C-code-produced nonlocal value,
>   thus the environment could indeed have disappeared if the programmer
>   was not careful.  However, gcc-extended C's nested functions have
>   no protection against this bug when only C code is involved, so
>   perhaps not detecting it in mixed M3/C is to be expected.

We really can't protect against bugs in C code.  If we could prevent 
bugs in C, the market for it would be huge, and we'd be well advised to 
consider that our main business.

> 
> - C code that attempts to call a function pointer value that was
>   originally produced by M3 code as a nested procedure value will
>   almost certainly crash at the time of the call.  This is the only
>   case that presents a significant problem.  M3 code will not be
>   able to give a nested procedure as a callback to a C library.

Wherefore only in this case should we do run-time code generation.

It has been argued that we don't need to protect against C programmers 
going hog-wild and breaking their own code.  Such is the nature of C.

But we can check for some of it, if we are willing to go to the effort.  
The technique used in the CDC Algol 68 compiler long ago might even 
enable us to relax the constraint on assigning nested procedures to 
variables by a suitable run-time check.

The CDC Algol 68 compiler had a trick for recognising expired scopes 
using the garbage collector.  Let's see if I can remember the details.  
It involved special treatment for procedures whose addresses are taken, 
and for the blocks that contain them.  When entering such a block at 
run-time, a word is allocated on the heap representing that scope.  It 
is filled with something relevant, such as the address of the stack 
frame for the block, and the stack frame also points to that scope-cell 
(as I'll describe it).  Taking the address of a procedure involves 
building a closure that points to that scope-cell.  When leaving the 
block, the contents of the scope-cell are wiped to some recognisable 
invalid value.  When entering the procedure the scope-cell will 
still be around even if the scope is not.  The procedure (this is inside 
the procedure itself, not at the call) checks that the scope-cell has 
not been wiped, and therefore is still valid.  If valid, it contains the 
necessary environment information.  If not, break off execution.
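
A minimal sketch in C of the scope-cell idea as recalled above, not a 
reconstruction of the CDC compiler.  All names are hypothetical.  Two 
liberties are taken: C has no garbage collector, so the cell has to be 
freed by hand once no closure refers to it, and the procedure reports 
an expired scope by returning false instead of breaking off execution.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Heap-allocated cell that outlives the block it describes. */
typedef struct {
    void *frame;  /* address of the block's frame, or NULL once wiped */
} ScopeCell;

typedef struct {
    bool (*code)(ScopeCell *scope, int arg, int *result);
    ScopeCell *scope;
} ScopedClosure;

typedef struct { int base; } BlockFrame;

/* Block entry: allocate the cell and point it at the frame. */
static ScopeCell *enter_block(BlockFrame *frame) {
    assert(frame != NULL);
    ScopeCell *cell = malloc(sizeof *cell);
    if (cell != NULL) cell->frame = frame;
    return cell;
}

/* Block exit: wipe the cell but leave it allocated, since closures
   built inside the block may still point at it. */
static void leave_block(ScopeCell *cell) {
    cell->frame = NULL;
}

/* The check sits inside the procedure, not at the call site. */
static bool add_base(ScopeCell *scope, int arg, int *result) {
    if (scope->frame == NULL) return false;  /* scope expired */
    const BlockFrame *frame = scope->frame;
    *result = frame->base + arg;
    return true;
}
```

Calling the closure while the block is live yields the sum; calling it 
after leave_block is detected rather than touching a dead frame.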

-- hendrik