[M3devel] bind_segment vs. declare_segment?

Tue Nov 23 18:21:04 CET 2010

Jay K wrote:
>  > It's inevitable that we have types referenced before defined because of recursion in the type definitions.
>  
>  
> "Good". I mean, er, this is sort of what I thought, and sort of what the multi-pass in parse.c infrastructure is for.
> Er, but you said reference before defined, not referenced before forward declared.
>  
>  
> Is it inevitable, say, to have a pointer to a record before declaring that the record exists?
> Seems unlikely.
>  

It is inevitable that many programs will need to have types that refer to each other in
recursive fashion.  Just a simple linked list node needs a link pointer.  The pointer's
definition refers to the node's and vice versa.  With more than one node type, and links
among them, it gets more complicated.  Something has to come first, and it has to have
a forward reference to something that comes later.

There are various systems in various languages to support this.  Most languages allow some
way of splitting a type definition into two parts.  One gives only partial information about
the type, not enough to need the reference to the other type in the cycle.  This could be
the "declaring that the record exists" you mention.  The other type can be declared next,
referring to the incomplete type in some limited ways.  Then the completion of the first
type comes last.

For example, it is usually possible to define a pointer to an incompletely-defined type.
This works OK for a compiler, which at this point really only needs to know the pointer's
size, and all pointers are the same size.  So we only need to know that the referent is a
type, but not what type, to define a pointer to it.
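
As a rough illustration of the split-declaration approach, here is a minimal sketch in C,
which is one of the languages that works this way.  The names Node, NodeRef, List,
list_push and list_sum are made up for the example and come from nowhere in m3front or
the backend.

```c
#include <assert.h>
#include <stddef.h>

/* Partial declaration: says only that a struct type named Node exists. */
struct Node;

/* A pointer to the incomplete type is already usable as a type,
   because its size does not depend on what Node contains. */
typedef struct Node *NodeRef;

/* Second type in the cycle; it needs only the pointer. */
struct List {
    NodeRef head;
    size_t length;
};

/* Completion of the first type, which can now refer back to List. */
struct Node {
    int value;
    NodeRef next;
    struct List *owner;
};

/* Push a caller-allocated node onto the front of the list. */
static void list_push(struct List *list, struct Node *node, int value)
{
    assert(list != NULL && node != NULL);
    node->value = value;
    node->next = list->head;
    node->owner = list;
    list->head = node;
    list->length++;
}

/* Sum the values by following the link pointers. */
static int list_sum(const struct List *list)
{
    int sum = 0;
    for (NodeRef n = list->head; n != NULL; n = n->next)
        sum += n->value;
    return sum;
}
```

Note that struct List is laid out before the compiler has seen any field of struct Node.
Only code that dereferences a NodeRef, such as list_sum, has to come after the completion.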

Modula-3 is unusual in saying that all declarations in a single scope are made simultaneously.
This is rather abstract talk for saying one declaration can make a forward reference to another
declaration in the same scope.  (Note that this changes the way redeclaring an identifier in
an inner scope works.)

Then there are separate rules limiting what kinds of recursion are legal in
declarations.  For example, records R1 and R2 can contain pointers to each other,
but not embedded instances of each other. (That would have infinite size.)

A compiler can handle this in two passes.  The first collects all the identifiers declared
in the scope (and probably as much information about them as does not depend on any other
referred-to declaration).  The second can then check the recursion rules and complete all
attributes of the declared entities.
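
The two passes could be sketched roughly as follows.  This is a toy in C and not how
m3front is actually organized: struct Decl, collect, complete and the fixed sizes (4 for
an integer, 8 for a pointer) are all assumptions made for the example, and a record is
simplified to hold a single embedded field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum Kind { KIND_INT, KIND_POINTER, KIND_RECORD };

struct Decl {
    const char *name;      /* declared identifier */
    enum Kind kind;
    const char *ref_name;  /* pointer referent or embedded field type, or NULL */
    struct Decl *ref;      /* resolved in pass 2 */
    int size;              /* 0 until known */
};

static struct Decl *lookup(struct Decl *scope, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(scope[i].name, name) == 0)
            return &scope[i];
    return NULL;
}

/* Pass 1: record what does not depend on any other declaration. */
static void collect(struct Decl *scope, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        scope[i].ref = NULL;
        switch (scope[i].kind) {
        case KIND_INT:     scope[i].size = 4; break;  /* assumed size */
        case KIND_POINTER: scope[i].size = 8; break;  /* assumed size */
        case KIND_RECORD:  scope[i].size = 0; break;  /* not known yet */
        }
    }
}

/* Size of d, or -1 if a chain of embedded records never ends.
   An acyclic chain is shorter than n, so n steps suffice. */
static int size_of(const struct Decl *d, size_t steps_left)
{
    if (d->kind != KIND_RECORD)
        return d->size;
    if (d->ref == NULL)
        return 0;                       /* empty record */
    if (steps_left == 0)
        return -1;                      /* record embeds itself */
    return size_of(d->ref, steps_left - 1);
}

/* Pass 2: resolve every name, forward or backward, then check the
   recursion rule and complete the sizes.  Returns 0 on success. */
static int complete(struct Decl *scope, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (scope[i].ref_name == NULL)
            continue;
        scope[i].ref = lookup(scope, n, scope[i].ref_name);
        if (scope[i].ref == NULL)
            return -1;                  /* undeclared identifier */
    }
    for (size_t i = 0; i < n; i++) {
        if (scope[i].kind != KIND_RECORD)
            continue;
        int size = size_of(&scope[i], n);
        if (size < 0)
            return -1;                  /* illegal recursion */
        scope[i].size = size;
    }
    return 0;
}
```

With a scope holding a record Node that embeds a Link, and a pointer Link whose referent
is Node, complete succeeds even though Node mentions Link before Link is declared.  With
two records R1 and R2 that embed each other, complete reports the illegal recursion.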

>  
> Is there really recursion, or just..some graph?
>  
>  
> Again something the frontend maybe could resolve but doesn't?
>  
>  
> Basically, the question is, and you don't really have to answer it, I can figure it out, but does the backend need multiple passes to describe the types well? I thought I saw evidence of "yes", but I could be wrong. And it is probably fixable in the frontend if so. And then..the follow up question, depending on the various answers, is if you prefer to fix it in frontend or backend (not being sure yet there is a problem.....)
>  
>  
> A point here is, really, I'm still trying to get the better-typed trees.
> Even though I went out on a tangent getting configure -enable-checking to work.
> -enable-checking still is a bit broken (static link) but maybe good enough for now, good enough to return to the type work.
>  
>  
>  - Jay
> 
> 
> 
> 
> 
> ----------------------------------------
>> Subject: Re: [M3devel] bind_segment vs. declare_segment?
>> From: hosking at cs.purdue.edu
>> Date: Mon, 22 Nov 2010 21:59:29 -0500
>> CC: m3devel at elegosoft.com
>> To: jay.krell at cornell.edu
>>
>>
>> On Nov 22, 2010, at 8:51 PM, Jay K wrote:
>>
>>> Are you saying, like, the frontend has n passes, and one of them is codegen, and both of these actions happen within codegen.
>>> And, it could be "fixed" by adding an additional pass?
>> No, it wouldn't make sense to do that. Better to defer the type until it is *known*.
>>
>>> And, you didn't say this, we very well could/should, if all the backends needed it?
>>> Or if it was cheap enough?
>>>
>>>
>>>
>>> You know, effectively, I've added the extra pass anyway, in the backend that "needs" it.
>>> Or at least in the backend that could easily benefit from it.
>>> But actually there's probably another way.
>>> I could probably still do it one pass, just be sure to remember the type in declare and fill it in more in bind (instead of replacing it, after it has been used).
>> Yes, I think that would be better.
>>
>>> I think I'll try that. I'm not going to throw out the multi-pass ability, and I think it will have other better motivated uses, but this one might not really be needed.
>>> (Though it might yet be the multi-pass isn't needed. I thought I saw types referenced before defined. I'm pretty sure. It could be that the frontend could address that easily, or that it doesn't actually happen. Anyway, we'll see.
>> It's inevitable that we have types referenced before defined because of recursion in the type definitions.
>>
>>>
>>>
>>> - Jay
>>>
>>>
>>>
>>> ----------------------------------------
>>>> From: hosking at cs.purdue.edu
>>>> Date: Mon, 22 Nov 2010 20:17:54 -0500
>>>> To: jay.krell at cornell.edu
>>>> CC: m3devel at elegosoft.com
>>>> Subject: Re: [M3devel] bind_segment vs. declare_segment?
>>>>
>>>> It doesn't do multiple passes at code generation time. It simply declares the segment at beginning of code generation, and once it is done binds the size of the segment.
>>>>
>>>> Antony Hosking | Associate Professor | Computer Science | Purdue University
>>>> 305 N. University Street | West Lafayette | IN 47907 | USA
>>>> Office +1 765 494 6001 | Mobile +1 765 427 5484
>>>>
>>>>
>>>>
>>>>
>>>> On Nov 22, 2010, at 7:23 PM, Jay K wrote:
>>>>
>>>>> Tony, What surprises me a bit here is that the frontend already makes multiple passes over everything.
>>>>> I think. Right?
>>>>> So here it could do that just as well. Right?
>>>>>
>>>>> - Jay
>>>>>
>>>>>
>>>>> ________________________________
>>>>>> From: hosking at cs.purdue.edu
>>>>>> Date: Fri, 19 Nov 2010 09:43:03 -0500
>>>>>> To: jay.krell at cornell.edu
>>>>>> CC: m3devel at elegosoft.com
>>>>>> Subject: Re: [M3devel] bind_segment vs. declare_segment?
>>>>>>
>>>>>> declare_segment doesn't have the size because it is unknown at the time
>>>>>> of the declaration, which comes before the module is compiled.
>>>>>> Only as the module is compiled does the size become known, with
>>>>>> Bind_segment emitted at the end.
>>>>>>
>>>>>>
>>>>>> On Nov 19, 2010, at 5:01 AM, Jay K wrote:
>>>>>>
>>>>>> I'm not looking at m3front. I should.
>>>>>>
>>>>>> Why does declare_segment not have the size?
>>>>>>
>>>>>> I'm going to just live with whatever the frontend does for now.
>>>>>> Make an extra pass to find the bind_segments, so that when
>>>>>> I do things "for real", declare_segment can set the size correctly.
>>>>>>
>>>>>> Later I'll do better -- making the segment contain fields.
>>>>>> So that globals become debuggable, in stock gdb (as big records).
>>>>>>
>>>>>> But first I want configure enable-checking to work.
>>>>>> (with one exception, the static link stuff..)
>>>>>>
>>>>>> - Jay
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>