[M3devel] backend issues wrt garbage collection

Jay K jay.krell at cornell.edu
Sun Apr 21 23:12:04 CEST 2013


1) Stack can/should be assumed contiguous.
We don't handle IA64's two stacks properly though -- they grow up and down from the same location. The one that grows up is for register spills.



2) It is very real. It has been working for months, building the entire system.


Our Modula-3 implementation is perhaps "surprising", from a certain point of view.


There are two special structs per module, referred to as "segments", that contain the global read-only data and the global writable data. I would like to fix/change this at some point.


There are significant disadvantages in the current scheme, and there are possible advantages, and reasons it is hard to change/fix.


The disadvantages: none of the fields in the segments are named; it is highly non-debuggable.


References to globals are via offsets -- the frontend has done the layout. Fixing it, making them either named or separate variables, will require frontend work.
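
A minimal sketch of the difference, in C. It assumes a made-up module with two writable globals; the offsets, sizes, and every name in it are hypothetical and are not taken from the actual compiler output. It only contrasts "anonymous blob plus offsets" with "separate named variables".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout a front end might pick for one module's writable
   globals. The offsets and sizes are illustrative only. */
enum { OFF_COUNT = 0, OFF_SCALE = 8, SEGMENT_SIZE = 16 };

/* Segment style: one anonymous blob per module. The union member only
   forces alignment suitable for a double. */
static union {
    double align;
    unsigned char bytes[SEGMENT_SIZE];
} segment;

/* Every access is "segment base + offset"; a debugger sees only bytes. */
static void seg_store_int(size_t off, int v)
{
    assert(off + sizeof v <= SEGMENT_SIZE);
    memcpy(segment.bytes + off, &v, sizeof v);
}

static int seg_load_int(size_t off)
{
    int v;
    assert(off + sizeof v <= SEGMENT_SIZE);
    memcpy(&v, segment.bytes + off, sizeof v);
    return v;
}

static void seg_store_double(size_t off, double v)
{
    assert(off + sizeof v <= SEGMENT_SIZE);
    memcpy(segment.bytes + off, &v, sizeof v);
}

static double seg_load_double(size_t off)
{
    double v;
    assert(off + sizeof v <= SEGMENT_SIZE);
    memcpy(&v, segment.bytes + off, sizeof v);
    return v;
}

/* Separate-variable style: the same two globals, each with its own name. */
static int count;
static double scale;

/* Writes the same values through both styles and checks that they agree. */
int styles_agree(void)
{
    seg_store_int(OFF_COUNT, 42);
    seg_store_double(OFF_SCALE, 1.5);
    count = 42;
    scale = 1.5;
    return seg_load_int(OFF_COUNT) == count
        && seg_load_double(OFF_SCALE) == scale;
}
```

In the second style a debugger can show count and scale by name, and a toolchain is at least free to treat them individually.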


For non-garbage collected data, I really don't see the point.
For garbage collected data, there might be offsets recorded in yet more places. These either have to be preserved, or the offsets changed to pointers. There are advantages/disadvantages either way. Offsets can usually be smaller, e.g. only 32 bits on a 64-bit target, and offsets imply an efficiency in the executable file and in loading it -- i.e. fewer relocs or less indirection for position independent code. But ignoring those points and thinking from an idealized C point of view, separate data and pointers would be nice.
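
The offset/pointer tradeoff can be sketched in C as follows. This assumes the collector is handed a table that says where the traced references live inside a segment; that table, the struct, and all names here are hypothetical and do not describe the real runtime format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical module segment with two traced references in it. */
struct segment {
    void *ref_a;
    long  counter;   /* not a reference; a collector would skip it */
    void *ref_b;
};
static struct segment seg;

/* Form 1: offsets from the segment base. 32 bits each, no relocations. */
static const uint32_t ref_offsets[] = {
    (uint32_t)offsetof(struct segment, ref_a),
    (uint32_t)offsetof(struct segment, ref_b),
};

/* Form 2: absolute pointers. Pointer-sized, and each entry needs a
   relocation when the code is position independent. */
static void **const ref_pointers[] = { &seg.ref_a, &seg.ref_b };

/* Resolving an offset costs one add against the segment base. */
static void **ref_from_offset(size_t i)
{
    assert(i < sizeof ref_offsets / sizeof ref_offsets[0]);
    return (void **)((char *)&seg + ref_offsets[i]);
}

/* Returns 1 when both tables name the same slots. */
int tables_agree(void)
{
    size_t n = sizeof ref_offsets / sizeof ref_offsets[0];
    for (size_t i = 0; i < n; i++)
        if (ref_from_offset(i) != ref_pointers[i])
            return 0;
    return 1;
}
```

On a typical 64-bit target each entry of ref_offsets takes 4 bytes and each entry of ref_pointers takes 8, which is the size difference mentioned above.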


Furthermore, separate variables give the compiler/linker a chance to throw out unused data.


However, the model here actually resembles what C compilers are usually forced to do anyway.
Specifically, in a .c/.cpp file, you will generally get all the read only data linked in, or none of it, and all the writable data, or none of it. Linkers do NOT generally strip unused data.


Given the code:
void foo(void);            /* declared before its address is taken */
int i;
void *p = (void *)&foo;
void foo(void) { }


if "i" is referenced outside the file, then p and foo will be linked in as well, even if they are not otherwise used.


In Microsoft Visual C++ you can do a bit better.
You can mark constant initialized data as "__declspec(selectany)".
The direct implied meaning is one thing, but the indirect meaning includes that it will be dead-stripped -- only linked in if referenced. I don't know how to do this with other compilers.
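
A small sketch of what that marking looks like in C. SELECTANY, max_widgets, banner and widgets_fit are made-up names for illustration. The usual explanation is that selectany puts each item in its own COMDAT, so the Microsoft linker can discard unreferenced ones when it removes unreferenced sections; treat that as an assumption about the toolchain rather than a guarantee. On other compilers the macro expands to nothing, so the file still builds but gets no such treatment.

```c
#include <assert.h>

/* SELECTANY is a hypothetical helper macro. It expands to the Visual C++
   attribute there and to nothing elsewhere. */
#if defined(_MSC_VER)
#define SELECTANY __declspec(selectany)
#else
#define SELECTANY
#endif

/* Initialized, externally visible constants, one attribute each. */
SELECTANY const int  max_widgets = 16;
SELECTANY const char banner[]    = "hello";

/* Only max_widgets is referenced here; banner is the kind of constant
   that would be a candidate for removal. */
int widgets_fit(int n)
{
    assert(n >= 0);
    return n <= max_widgets;
}
```

In C++ a const object at namespace scope has internal linkage by default, so it would additionally need to be declared extern for selectany to apply.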


The point is, these Modula-3 "segments" seem to closely emulate what most, if not all, C compilers do anyway. But they do it in a way that is guaranteed to behave this way, even if some C compilers are different.


However, I don't really see value in this approach. For C and C++ I think it is mostly subtle and unknown to programmers and causes the linker to bring in more code/data than is needed/expected. However, I expect it is also impossible to change without breaking some programs that depend on it.
For Modula-3, I'm less worried about the changed semantics.


Point being, if/when we separate out globals from "segments", I would be inclined to mark all the constants as __declspec(selectany), unless we know something further up the chain won't ever output unused data.


We might also be able to win by using #pragma segment, with a different segment for each piece of data.

 > One is - are global variables per module kept in contiguous space per module?


Yes.
Two separate segments, for constants and non-constants.


 > Also, are those global variable spaces kept contiguous regardless of optimizations?



Yes, probably, mostly. Depending on the underlying object model.
Generally the underlying system will have at least three sections/segments in its object model -- read-only executable code, read-only non-executable data, and writable non-executable data.
Sometimes the read only data will be merged with the code.

The readonly data and writable data will not necessarily be contiguous, and/or there could be a small gap between them.


It really depends on the underlying system.


But why does it matter?


 - Jay


From: dragisha at m3w.org
Date: Sun, 21 Apr 2013 17:29:17 +0200
To: m3devel at elegosoft.com
Subject: [M3devel] backend issues wrt garbage collection

LLVM has a notion of a stacklet. So the stack is possibly non-contiguous and as such is not compatible with the current GC. Fixable, but still an issue.
Also, with C/C++ being a planned/developed backend, two issues emerge. One is - are global variables per module kept in contiguous space per module? Another is - what is the compiler/backend/ABI/whatever of this final C/C++ compiler step, and what is its stack like? :) Also, are those global variable spaces kept contiguous regardless of optimizations?

-- Dragiša Durić
dragisha at m3w.org



 		 	   		  