[M3devel] Your recent change to parse.c

Jay jayk123 at hotmail.com
Wed Apr 30 05:25:01 CEST 2008


Tony, another thing to try here is:
TREE_PUBLIC (p) = (lev == 0);
I should have noticed that, and can't confirm right now if it is what I think -- nested procedures and finally/except blocks I assume are (lev > 0).
I still think calls through the PLT should be drastically reduced and hope to look into this further.
In particular, calls that are known to resolve in the same .so need not be through the PLT, at least on AMD64. I have to check others.
There is a paper dsohowto.pdf that talks about this stuff.
  So it's not just me.
I need to read it more closely but I think it resolves to simply
that the front end does know what is in the same .so and can pass
the information on to cm3cg.
 - Jay


CC: jkrell at elegosoft.com; m3devel at elegosoft.com
From: hosking at cs.purdue.edu
To: jayk123 at hotmail.com
Subject: Re: [M3devel] Your recent change to parse.c
Date: Tue, 29 Apr 2008 18:32:28 -0400

Jay, thanks for your explanation! If your log message had included some of this information I might have understood what the need was.
OK, I've gone back through the logs and see that I did not put this in for specific optimization reasons, though it may have had a historic basis in making something work.  So, let's restore your fix and see what happens.  I wonder if it breaks procedure values (code address + static chain records)? 



On Apr 29, 2008, at 5:57 PM, Jay wrote:


Tony, this is a serious problem on AMD64_LINUX. It is not a problem at all on Win32, as Win32 has a much better codegen model. It's amazing how Linux works..

Look at the .ms file for ThreadPThread. I looked on AMD64_LINUX and LINUXLIBC6. ThreadPThread__InitMutex's call to its own finally block goes through the PLT and on AMD64_LINUX the static link in r10 is trashed. It's possible that if you turn on optimizations, the finally block is inlined and that hides the problem, but you can't count on that.

I was experimenting with another fix at the same time, that of using -fvisibility=hidden on m3cg, but to me that seems more like a C/C++ front end switch, even though cm3cg supports it. I can try again and carefully tweak the two variables, see if -fvisibility=hidden suffices. At the level cm3cg operates though, it marks the visibility of everything explicitly, so again, I think my fix is the way.

As well, calls within a file to functions within that file that aren't in an interface are going through the PLT. This is just wasteful. They shouldn't even go through the PLT for calls within the same "library" (ie: m3core to m3core, libm3 to libm3). What such indirect calls "buy" is that, e.g. the .exe or libm3 can replace functions in m3core, or such, and function pointer equality might be achieved. I think the "interposition" feature is widely accepted on Linux, though it is dodgy.

I think on Linux going through the PLT for exported functions might be the norm. I'll have to read up more. But going through the PLT for unexported functions is not the norm. Documentation strongly encourages marking visibility and saving the PLT indirection. In C/C++ there are further problems of name uniqueness of unexported functions across the dynamic link. I believe Modula-3 deals with that, since pretty much every function in the system gets a unique name, exported or not.

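
To make the visibility point concrete, here is a rough C sketch of the three cases discussed above (file-local, library-internal, exported). It is only an illustration under assumptions: the macro and function names (LIB_LOCAL, LIB_EXPORT, file_local_helper, library_internal_helper, exported_entry) are made up for this example and are not anything in cm3cg or m3core, and whether a given call really avoids the PLT depends on the compiler, flags, and target.

```c
/* Sketch only: all names here are hypothetical, not from cm3cg or m3core. */

#if defined(__GNUC__) && !defined(_WIN32)
/* GCC/Clang ELF visibility attributes. */
#define LIB_LOCAL  __attribute__((visibility("hidden")))
#define LIB_EXPORT __attribute__((visibility("default")))
#else
/* Other toolchains: fall back to plain external linkage. */
#define LIB_LOCAL
#define LIB_EXPORT
#endif

/* File-local: not visible outside this object file, so the call below
   can be a direct call. */
static int file_local_helper(int x)
{
    return x + 1;
}

/* Library-internal: other objects in the same .so can call it, but it is
   not exported from the .so, so it cannot be interposed and the compiler
   is free to call it without the PLT in a -fPIC build. */
LIB_LOCAL int library_internal_helper(int x)
{
    return file_local_helper(x) * 2;
}

/* Exported: default visibility; in a -fPIC build calls to this kind of
   symbol are the ones that normally go through the PLT. */
LIB_EXPORT int exported_entry(int x)
{
    return library_internal_helper(x) + 3;
}
```

Building this with -fPIC -S and comparing the call sites (looking for an @PLT suffix on AMD64) with and without the hidden attribute should show the difference, though I have not checked this on every target.
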
One or the other or both of these changes (public = exported, or -fvisibility=hidden) optimize those calls. In general going through the PLT is very wasteful when it isn't necessary. There's a bunch of "literature" about this on the web.

On Windows, to call a function Foo, you just call Foo. If Foo ends up imported, the linker generates a single instruction function for you, Foo, that jumps through __imp__Foo. If you are absolutely sure Foo will be imported and want to optimize a little, you can mark Foo as __declspec(dllimport), however for functions this is totally optional. To export functions, you either mark them __declspec(dllexport) or list them in a .def file. For C++, .def files are a pain, but for C they work just fine, or better. For importing data, you pretty much have to mark it as __declspec(dllimport). Importing data is rare. gcc/ld on Windows have some hack to make this easier that I'm not familiar with.

So in the absence of importing data, there is just one codegen model that is acceptable -- call Foo. Most function calls, theoretically, are not imported, and this ends up as a normal direct call. There may be issues of position-independence, but on AMD64 this is not relevant. On AMD64_NT, I believe the vast majority of code is naturally position-independent via RIP-relative addressing. It is true that things like vtables might have relocs. I think that is unfortunate. It would be nice to have 100% position independence for .dlls and .exes.

On Linux, if you are compiling for a .dll, you must be position-independent, I think fully, and all function calls by default go through the PLT. Maybe calls to statics don't. But just sharing across two source files does. Every call is therefore indirect, subject to loader machinations at either load or first-call time, and "interposable" -- someone else can export a function of the same name and take over the function. As well, someone else can call these internal functions more easily than otherwise.
Granted, anyone can call any of your code at any time, just by jumping to it. But symbolic exports are considered more attackable surface area than random code sitting in memory.

If you don't use -fPIC, I think all calls are direct. And you can't link into a .dll. And then, really, the truth is in between. Individual calls can be marked one way or the other. But Modula-3 is marking everything as public, exported, subject to dynamic linking, called through the PLT.

As to why only AMD64_LINUX is seeing this, I don't know. I'd have to check how the static link is passed on others and if the loader preserves it. Could be it is an extra parameter on the stack, since x86 has so few registers. Could be AMD64_LINUX could/should pass it another way, but really, avoiding the PLT for unexported functions seems like pure goodness. I was quite surprised and dismayed to learn about all this last night when I was debugging.

Why must inline function bodies for unexported functions be preserved anyway? They are just dead code, right? Is there another way to preserve them? If it is <*inline*> on the implementation but listed in the *.i3 file, that should be public/exported. Is it not?

I was able to build LINUXLIBC6 this way as far as building on AMD64 gets, which is pretty far -- eventually failing for lack of some X .libs. Oh, I guess I should be sure optimization is on? I didn't twiddle that. I can try again.

 - Jay


From: hosking at cs.purdue.edu
To: jkrell at elegosoft.com
Date: Tue, 29 Apr 2008 11:52:24 -0400
CC: m3devel at elegosoft.com
Subject: [M3devel] Your recent change to parse.c

I don't understand your change to parse.c re TREE_PUBLIC being set on procedure declarations.  TREE_PUBLIC just means that it is possible to call the procedure from outside the current compilation unit.  It has nothing to do with intra-library visibility.



Antony Hosking | Associate Professor | Computer Science | Purdue University
305 N. University Street | West Lafayette | IN 47907 | USA
Office +1 765 494 6001 | Mobile +1 765 427 5484