[M3devel] m3front scanner div wierdness?

Rodney M. Bates rodney_bates at lcwb.coop
Thu Jun 30 18:12:11 CEST 2016



On 06/30/2016 12:24 AM, Jay K wrote:
> I think I get it now. I was/am missing some of the lines but I can imagine how it works.
>
> There are about 15 bits given to the file number and about 17 bits given to the line number.
> On a 32bit system.
>
> A file with more than 100,000 lines might have trouble and a package with more than 16,000 files might have trouble.
> I know 64,000 lines in generated .c files is not unheard of.
> I don't know what file counts are on the high end.
>

I think 100,000 lines could be a bit marginal for mechanically generated code,
but the file count space is probably overgenerous.  I presume they made it a
power of 10 so a human could do the DIV or MOD visually on the decimal value.

Increasing MaxLines to a million would still leave room for 2147 files
in a 32-bit INTEGER.
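
To check that arithmetic, here is a small sketch.  It assumes the packed
value is a signed 32-bit INTEGER, so the largest value is 2147483647.  The
module name PackLimits and the helper MaxFiles are made up for illustration
and are not in m3front.

```modula3
MODULE PackLimits EXPORTS Main;
(* Illustration only: how many file numbers fit for a given MaxLines. *)
IMPORT IO, Fmt;

CONST
  MaxInt32 = 16_7FFFFFFF;  (* 2147483647, largest signed 32-bit value *)

PROCEDURE MaxFiles (maxLines: INTEGER): INTEGER =
  (* Count of file numbers whose whole line range fits below MaxInt32. *)
  BEGIN
    RETURN MaxInt32 DIV maxLines;
  END MaxFiles;

BEGIN
  (* 2147483647 DIV 100000 = 21474; 2147483647 DIV 1000000 = 2147 *)
  IO.Put ("MaxLines =  100000: " & Fmt.Int (MaxFiles (100000)) & "\n");
  IO.Put ("MaxLines = 1000000: " & Fmt.Int (MaxFiles (1000000)) & "\n");
END PackLimits.
```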

> It is almost a good place for a LONGINT, but
> TYPE SourceLocation = RECORD
>    file, line: INTEGER or INT32 := 0;
> END?
>
> Ok to do this at some point?

It will be pervasive and truly algorithmic.  An alternative would be to
keep it an INTEGER, but use a distinct type name on all variables/fields
that use the MOD/DIV invariant.  Pervasive too, but no risk of runtime breakage.
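
A minimal sketch of the distinct-type-name alternative.  The interface name
SrcLoc and the type name T are hypothetical; only MaxLines = 100000 comes
from Scanner.m3 as quoted in this thread.

```modula3
INTERFACE SrcLoc;
(* Hypothetical interface: gives the packed representation a name. *)

CONST
  MaxLines = 100000;  (* value quoted from Scanner.m3 *)

TYPE
  T = INTEGER;
  (* Invariant: value = fileNo * MaxLines + lineNo,
     so fileNo = value DIV MaxLines and lineNo = value MOD MaxLines. *)

END SrcLoc.
```

Since SrcLoc.T is just another name for INTEGER, the compiler will not
complain where a plain INTEGER is mixed in.  The name is documentation only.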

If you put just the one INTEGER inside a record with an unlikely name, that
would preserve the space savings but still get type checking and still get
the compiler to find all the places that need to change.
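
The record idea could look something like the sketch below.  Every name
except MaxLines and SameFile is invented for illustration, and range checks
on the line number are left out to keep it short.  Record types in Modula-3
are compared structurally, which is why the field name should be an unlikely
one.

```modula3
MODULE LocSketch EXPORTS Main;
(* Illustration only: a one-field record around the packed INTEGER. *)
IMPORT IO, Fmt;

CONST
  MaxLines = 100000;  (* value quoted from Scanner.m3 *)

TYPE
  (* Hypothetical wrapper; same size as one INTEGER, but no longer
     assignable to or from a plain INTEGER. *)
  FileNoAndLineNo = RECORD
    packedFileNoAndLineNo: INTEGER := 0;
  END;

PROCEDURE Pack (fileNo, lineNo: INTEGER): FileNoAndLineNo =
  (* Assumes 0 <= lineNo < MaxLines; not checked here. *)
  BEGIN
    RETURN FileNoAndLineNo {fileNo * MaxLines + lineNo};
  END Pack;

PROCEDURE FileNo (loc: FileNoAndLineNo): INTEGER =
  BEGIN
    RETURN loc.packedFileNoAndLineNo DIV MaxLines;
  END FileNo;

PROCEDURE LineNo (loc: FileNoAndLineNo): INTEGER =
  BEGIN
    RETURN loc.packedFileNoAndLineNo MOD MaxLines;
  END LineNo;

PROCEDURE SameFile (a, b: FileNoAndLineNo): BOOLEAN =
  (* TRUE when a and b are in the same file, whatever their lines. *)
  BEGIN
    RETURN FileNo (a) = FileNo (b);
  END SameFile;

VAR a, b: FileNoAndLineNo;

BEGIN
  a := Pack (3, 42);
  b := Pack (3, 9000);
  IO.Put ("file " & Fmt.Int (FileNo (a)));
  IO.Put (", line " & Fmt.Int (LineNo (a)) & "\n");
  IO.Put ("same file: " & Fmt.Bool (SameFile (a, b)) & "\n");
END LocSketch.
```

With this shape, any leftover "offset DIV MaxLines" on a plain INTEGER stops
compiling once the variable's type is changed, so the compiler points at each
place that needs attention.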

>
>   - Jay
>
>
> ----------------------------------------
>> From: jay.krell at cornell.edu
>> To: rodney.m.bates at acm.org; m3devel at elegosoft.com
>> Date: Wed, 29 Jun 2016 16:31:19 +0000
>> Subject: Re: [M3devel] m3front scanner div wierdness?
>>
>> My first computer, at home, had 80k, and it was oddly high end, most people had 48k or 64k.
>> (in either case, the address space was 16 bit and ROM and peripheral memory was in there,
>> so various bank switching was employed to gain access; later similar with the 128k RAM machines...)
>>
>>
>> I won't just change one place and break all the others, but maybe we should try to split it everywhere as you suggest (and recombine
>> as you suggest).
>>
>>
>> I know that 32bits is overkill for a line number.
>> I also know 16 bits isn't overkill -- but more than 16 are used here so ok.
>> There is/was a warning in the Visual C++ compiler about truncating line numbers
>> or terminating debug information after 64k lines. Midl output would trigger it.
>>
>> I still don't follow completely.
>> It seems there is an aliasing situation, where lines very far apart can be deemed the same.
>>
>> I'll look closer though, as e.g. 20 bits would seem enough for a line number and 20 bits for a file number.
>> Or maybe the answer is 32 bits in general, and the 64bit machines can move the pair together...
>> In general you have to balance:
>> 1 System still compilable on 32bit.
>> 2 vs. 64bit system can do more
>>
>> For the first case, you want to limit integers to 32bits and for the second you do not.
>>
>> Also convenience and perf suggest *not* having to sprinkle div/mod all around,
>> though granted, div/mod by a constant is eminently optimizable, at least 32 bit operations...
>>
>>> FileNoAndLineNo
>>
>> Lately this is called "location" and things even have starts and ends, so the error messages can output
>> a line and then point out the parts of the line. I'm not sure if this is obviously good and nice or overkill
>> but clang is there and I think gcc went there. Yes the data is larger.
>>
>> Maybe shorter FileAndLine?
>> I realize it is ambiguous, they could both be strings, or line could be a file offset (a useful quantity!)
>>
>> - Jay
>>
>>
>> ----------------------------------------
>>> Date: Wed, 29 Jun 2016 11:09:34 -0500
>>> From: rodney_bates at lcwb.coop
>>> To: m3devel at elegosoft.com
>>> Subject: Re: [M3devel] m3front scanner div wierdness?
>>>
>>>
>>>
>>> On 06/28/2016 10:54 PM, Jay K wrote:
>>>> Does anyone understand this stuff in m3front/Scanner.m3:
>>>>
>>>> Here vs. LocalHere?
>>>> SameFile?
>>>>
>>>>
>>>> I understand only this nuance:
>>>> offset MOD MaxLines
>>>>
>>>> MaxLines = 100000;
>>>>
>>>>
>>>> is to crudely handle that when asserts fail,
>>>> they pack the line number in with the assertion failure code,
>>>> potentially losing bits.
>>>>
>>>>
>>>> I don't think this is a good design, they should just be separate INTEGERS,
>>>> but this is beside the point.
>>>>
>>>
>>> This is just pure speculation, but I am very confident of it. These
>>> offsets have a very high occurrence count. There is code all over m3front
>>> that copies Scanner.offset into various data structures. So the small space
>>> saving of one INTEGER instead of two would be multiplied by a big number.
>>> I remember working in Modula-3 on a company-paid computer with 16 Meg of ram.
>>> Today, I have 8 Gig in the one I bought, and could easily afford more, if I
>>> thought I needed it.
>>>
>>> Two integers would be cleaner, but this design is not too bad *if* you know
>>> the MOD/DIV invariant. It is commented at Scanner.m3:54, but only for one
>>> field. As pure documentation, there really should be a distinct type name
>>> (say FileNoAndLineNo?) for all values that use this representation, even
>>> though it just equates to INTEGER. There are a lot of variables lying around
>>> all over the front end that use this invariant, but are just declared as
>>> INTEGER. That's maintainer-hostile.
>>>
>>>
>>>>
>>>> What doesn't make sense to me is the machinations around file name.
>>>>
>>>>
>>>> Here:
>>>> file := files [offset DIV MaxLines];
>>>>
>>>> vs. LocalHere:
>>>> file := local_files [fnum];
>>>>
>>>>
>>>> LocalHere makes sense. Here does not.
>>>>
>>>>
>>>> PROCEDURE SameFile (a, b: INTEGER): BOOLEAN =
>>>>   BEGIN
>>>>     RETURN (a DIV MaxLines) = (b DIV MaxLines);
>>>>   END SameFile;
>>>>
>>>>
>>>>
>>>> Shouldn't this just be a = b?
>>>>
>>>
>>> As coded, this will return TRUE if a and b are different line numbers within
>>> the same file. The name "SameFile" at least suggests that is what is intended.
>>> A good example of a place where it would have been clearer if a & b were
>>> declared as the type name I proposed above.
>>>
>>>>
>>>> Well, anyway, this SameFile function isn't called.
>>>>
>>>> Here and LocalHere are used.
>>>>
>>>>
>>>> I'm looking here because I want to add a temporary measure
>>>> such that the file names are leaf-only.
>>>>
>>>>
>>>> In particular, because generic modules have target names in their paths
>>>> and I want to temporarily remove target names from output, so I can prove
>>>> that a few targets are identical.
>>>>
>>>>
>>>> I guess, really, I propose the interface to assertion failures be expanded to take the line number separate from the failure code.
>>>> This has to percolate quite a bit through the system -- the backends and the runtime.
>>>>
>>>>
>>>> And then this Here vs. LocalHere difference should fall away.
>>>> But still, what is it trying to do?
>>>>
>>>>
>>>> Thank you,
>>>> - Jay
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> M3devel mailing list
>>>> M3devel at elegosoft.com
>>>> https://m3lists.elegosoft.com/mailman/listinfo/m3devel
>>>>
>>>
>>> --
>>> Rodney Bates
>>> rodney.m.bates at acm.org

-- 
Rodney Bates
rodney.m.bates at acm.org


