[M3devel] gcc 4.5.0?
Jay K
jay.krell at cornell.edu
Fri May 21 06:47:54 CEST 2010
Maybe we can/should transform into one of the forms I show and dispense with any gcc support?
At the cost of losing some "easy" optimization..of nested functions?
"easy" as in, occurs even without -O2, but might occur with -O2?
I know what you mean -- NT386 passes the static link in ecx.
This is "ok" because in C++ the "this" pointer is passed in ecx.
Passing it as "just" another parameter might be less efficient, but ok?
And really, it'd only be less efficient on one architecture -- 32bit x86.
And maybe not all x86s -- depending on if some ABIs have a register based calling convention.
We don't need any actual ABI-compatibility with anything here, do we?
- Jay
----------------------------------------
> From: hosking at cs.purdue.edu
> Date: Thu, 20 May 2010 22:22:46 -0400
> To: jay.krell at cornell.edu
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] gcc 4.5.0?
>
> That is the standard formulation, but different architectures have different optimised forms (passing static link in a register, etc.). In any case, gcc has its own weird way of doing things, which we mostly use, but from which we need a little extra support (hence the small amount of special tailoring).
>
> On 20 May 2010, at 21:43, Jay K wrote:
>
>>
>> Wierd. To me the model to follow is almost obvious. It should be obvious to everyone and implemented the way I have in mind. :)
>>
>>
>> The static link is an extra parameter.
>> And, as such, also an extra local variable.
>> You only need one.
>> Each nesting level would follow another chain.
>> Or you can pass each nesting level as an extra parameter.
>> Or you can optimize and pass locals/parameters separately.
>> But there is an obvious straightforward non-optimized form. ?
>>
>>
>> Here, I worked it through:
>>
>>
>>
>> Why doesn't it just look something like this:
>>
>>
>> nested:
>>
>> void F1(int a, int b)
>> {
>> int c = a + b;
>>
>> int f2()
>> {
>> int d = a + b;
>>
>> int f3()
>> {
>> return d + a + c;
>> }
>>
>> return a + c + f3();
>> }
>>
>> return f2();
>> }
>>
>> not nested:
>>
>>
>> typedef struct {
>> /* locals */
>> int d;
>> void* x86_frame_pointer;
>> void* x86_return_address;
>> /* parameters */
>> f1_frame_t* f1;
>> } f2_frame_t;
>>
>>
>> typedef struct {
>> /* locals */
>> int c;
>> void* x86_frame_pointer;
>> void* x86_return_address;
>> /* parameters */
>> int a;
>> int b;
>> } f1_frame_t;
>>
>>
>> int f3(f2_frame_t* f2)
>> {
>> return f2->d + f2->f1->a + f2->f1->c;
>> }
>>
>>
>> int f2(f1_frame_t* f1)
>> {
>> int d = f1->a + f1->b;
>> return f1->a + f1->c + f3((f2_frame_t*)&d);
>> }
>>
>>
>> int F1(int a, int b)
>> {
>> int c = a + b;
>> return f2((f1_frame_t*)&c);
>> }
>>
>>
>> or, alernatively, almost the same, pass each chain as another parameter:
>>
>>
>> typedef struct {
>> /* locals */
>> int d;
>> void* x86_frame_pointer;
>> void* x86_return_address;
>> /* parameters */
>> } f2_frame_t;
>>
>>
>> typedef struct {
>> /* locals */
>> int c;
>> void* x86_frame_pointer;
>> void* x86_return_address;
>> /* parameters */
>> int a;
>> int b;
>> } f1_frame_t;
>>
>>
>> int f3(f2_frame_t* f2, f1_frame_t* f1)
>> {
>> return f2->d + f1->a + f1->c;
>> }
>>
>>
>>
>> int f2(f1_frame_t* f1)
>> {
>> int d = f1->a + f1->b;
>> return f1->a + f1->c + f3((f2_frame_t*)&d, f1);
>> }
>>
>>
>> int F1(int a, int b)
>> {
>> int c = a + b;
>> return f2((f1_frame_t*)&c);
>> }
>>
>>
>> This should be easily adapted to any function call convention/sequence.
>> e.g. even if parameters are passed in registers.
>>
>>
>> And for that matter, why don't we just transform it to be like this?
>> With the goal of reducing/eliminating gcc patches?
>>
>>
>> Am I missing something?
>>
>>
>> This form doesn't work?
>>
>>
>> Isn't reasonably simple?
>>
>>
>> Granted it might not be optimal. But it depends.
>> You know, you can copy in the local/parameter values, but then you have
>> to copy them out too. You can do analysis to show they aren't changed,
>> in which case no need to copy them out.
>> Similarly you can pass the parameters as separate parameters, by address
>> if they are ever changed.
>>
>>
>> A portable transform couldn't depend on adjacency of locals and parameters.
>> It would have to either copy the parameters into the frame_t, or their addresses.
>>
>>
>> A portable form couldn't get the frame by taking the address of a "individual" local.
>>
>>
>> Here is portable form:
>>
>>
>> typedef struct {
>> int d;
>> f1_frame_t* f1;
>> } f2_frame_t;
>>
>> typedef struct {
>> int a;
>> int b;
>> int c;
>> } f1_frame_t;
>>
>>
>> int f3(f2_frame_t* f2)
>> {
>> return f2->d + f2->f1->a + f2->f1->c;
>> }
>>
>> int f2(f1_frame_t* f1)
>> {
>> f2_frame_t f;
>> f.f1 = f1;
>> f.d = f1->a + f1->b;
>> return f1->a + f1->c + f3(&f);
>> }
>>
>>
>> int F1(int a, int b)
>> {
>> f1_frame_t f;
>> f.a = a;
>> f.b = b;
>> f.c = a + b;
>> return f2(&f);
>> }
>>
>>
>> or:
>>
>>
>> typedef struct {
>> int d;
>> } f2_frame_t;
>>
>>
>> typedef struct {
>> int a;
>> int b;
>> int c;
>> } f1_frame_t;
>>
>>
>> int f3(f2_frame_t* f2, f1_frame_t* f1)
>> {
>> return f2->d + f1->a + f1->c;
>> }
>>
>>
>> int f2(f1_frame_t* f1)
>> {
>> f2_frame_t f;
>> f.d = f1->a + f1->b;
>> return f1->a + f1->c + f3(&f, f1);
>> }
>>
>>
>> int F1(int a, int b)
>> {
>> f1_frame_t f;
>> f.a = a;
>> f.b = b;
>> f.c = a + b;
>> return f2(&f);
>> }
>>
>>
>>
>>
>> - Jay
>>
>>
>>
>> ----------------------------------------
>>> Date: Thu, 20 May 2010 12:12:00 -0500
>>> From: rodney_bates at lcwb.coop
>>> To: m3devel at elegosoft.com
>>> Subject: Re: [M3devel] gcc 4.5.0?
>>>
>>> gcc even extends C with nested functions, and the support is there for that. We use it too.
>>>
>>> Beware: The runtime model is, IMO, bizarre. Definitely counterintuitive and unlike anything
>>> you will ever see elsewhere. There are two static links. One is used for the first step
>>> in following a static chain and points to the expected place in the target frame. The other
>>> is used of subsequent steps in chain-following, and points to somewhere inside the target
>>> frame that is dependent on the target procedure. It's the beginning of a subrecord where
>>> all the nonlocally-referenced variables are collected. The second static link is also
>>> located in there. All the nonlocally-referenced variables and parameters are duplicated
>>> in there too.
>>>
>>> I'm not sure I remembered these details exactly right. Maybe both links point to the
>>> subrecord, but one is located at a traditional procedure-independent, fixed location
>>> in the frame, while the second is located in the subrecord. Also, sometimes the first
>>> static link value is kept in a register and never stored.
>>>
>>> What this all apparently accomplishes is simplifying an "insanely complicated" _compile-time_
>>> scheme that, I believe came from interaction between nested procedures and inlining them.
>>>
>>> But it's at the cost of what I would call an insanely complicated _runtime_ scheme.
>>>
>>> It undermined m3gdb's handling of static links, and I don't think it's possible to fix
>>> with the current design of emitting debug info very early. The locations and target
>>> offsets of these things aren't know until later, and m3gdb really needs to know them.
>>>
>>>
>>> Jay K wrote:
>>>> What I meant regarding "static chain" was more like "does gcc already have support that we can use".
>>>> Don't any of the other languages need it already?
>>>> Ada? Pascal? Maybe the nested C function support?
>>>>
>>>> - Jay
>>>>
>
More information about the M3devel
mailing list