[M3devel] further reducing cloned headers wrt pthread?

Wed Feb 4 02:53:54 CET 2009

I am very leery of this proposal -- the code will be inherently opaque  
and unmaintainable.  I don't see any advantage to it.

On 4 Feb 2009, at 11:06, Jay wrote:

>
> There are a few possibilities:
>
>
> Roughly:
>
> Where there is
>
> INTERFACE Upthread;
>
>  TYPE
>   pthread_t = ... system specific ...
>   pthread_cond_t = ... system specific ...
>   pthread_mutex_t = ... system specific ...
>
>  PROCEDURE pthread_thread_init_or_whatever(VAR pthread_t);
>  PROCEDURE pthread_mutex_init_or_whatever(VAR pthread_mutex_t);
>  PROCEDURE pthread_cond_init_or_whatever(VAR pthread_cond_t);
>
>  MODULE PThread;
>  VAR
>    a: pthread_t;
>    b: pthread_cond_t;
>    c: pthread_mutex_t;
>
>  PROCEDURE Foo() =
>  BEGIN
>    Upthread.pthread_thread_init_or_whatever(a);
>    Upthread.pthread_cond_init_or_whatever(b);
>    Upthread.pthread_mutex_init_or_whatever(c);
>  END Foo;
>
>  change to:
>
>  INTERFACE Upthread;
>
>  TYPE
>   pthread_t = RECORD END;   or whatever is correct for an opaque  
> preferably unique type
>   pthread_cond_t = RECORD END;  ditto
>   pthread_mutex_t = RECORD END;  ditto
>
>  PROCEDURE pthread_thread_init_or_whatever(VAR pthread_t);
>  PROCEDURE pthread_mutex_init_or_whatever(VAR pthread_mutex_t);
>  PROCEDURE pthread_cond_init_or_whatever(VAR pthread_cond_t);
>
>
>  INTERFACE PThreadC.i3
>
>  PROCEDURE GetA(): UNTRACED REF Upthread.pthread_t;
>  PROCEDURE GetB(): UNTRACED REF Upthread.pthread_cond_t;
>  PROCEDURE GetC(): UNTRACED REF Upthread.pthread_mutex_t;
>
> or possibly extern VAR
>
>  PThreadC.c
>
>  static pthread_t a;  /* pthread_t has no static initializer */
>  static pthread_cond_t b = PTHREAD_COND_INITIALIZER;
>  static pthread_mutex_t c = PTHREAD_MUTEX_INITIALIZER;
>
>  pthread_t* GetA() { return &a; }
>
>  pthread_cond_t* GetB() { return &b; }
>
>  pthread_mutex_t* GetC() { return &c; }
>
>  MODULE PThread;
>  VAR
>    a := PThreadC.GetA();
>    b := PThreadC.GetB();
>    c := PThreadC.GetC();
>
>  PROCEDURE Foo() =
>  BEGIN
>    Upthread.pthread_thread_init_or_whatever(a^);
>    Upthread.pthread_cond_init_or_whatever(b^);
>    Upthread.pthread_mutex_init_or_whatever(c^);
>  END Foo;
>
>  or, again, possibly they are variables and it goes a little smaller/ 
> quicker:
>
>  FROM PThreadC IMPORT a, b, c;
>
>
>  PROCEDURE Foo() =
>  BEGIN
>    Upthread.pthread_thread_init_or_whatever(a);
>    Upthread.pthread_cond_init_or_whatever(b);
>    Upthread.pthread_mutex_init_or_whatever(c);
>  END Foo;
>
>  I think that is pretty cut and dried, no controversy.
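A minimal C sketch of the accessor idea for the statically allocated case, assuming the Modula-3 side only ever sees addresses. The names PThreadC_GetB, PThreadC_GetC, PThreadC_Lock and PThreadC_Unlock are hypothetical and chosen for illustration; they are not taken from the existing tree.

```c
#include <pthread.h>
#include <stddef.h>

/* The real, platform-specific objects live only in C. */
static pthread_cond_t  b_cond  = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t c_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Accessors hand out addresses; callers never see the size or layout. */
void *PThreadC_GetB(void) { return &b_cond; }
void *PThreadC_GetC(void) { return &c_mutex; }

/* Hypothetical wrappers that take the opaque address back. */
int PThreadC_Lock(void *m)
{
    return pthread_mutex_lock((pthread_mutex_t *)m);
}

int PThreadC_Unlock(void *m)
{
    return pthread_mutex_unlock((pthread_mutex_t *)m);
}
```

With this shape the interface needs no per-platform record layouts for these variables, at the cost of one call to fetch each address.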
>
>  What is less clear is what to do with non-statically allocated  
> variables.
>
>  Let's say:
>
>  MODULE PThread;
>
>  TYPE T = RECORD
>    a:int;
>    b:pthread_t;
>  END;
>
>  PROCEDURE CreateT():T=
>  VAR
>    t := NEW(T);
>  BEGIN
>    Upthread.init_or_whatever(t.b);
>    RETURN t;
>  END;
>
>  PROCEDURE DisposeT(t:T)=
>  BEGIN
>    IF t = NIL THEN RETURN END;
>    Upthread.pthread_cleanup_or_whatever(t.b);
>    DISPOSE(t);
>  END;
>
>  The desire is something that does not know the size of pthread_t,  
> something like:
>
>  TYPE T = RECORD
>    a:int;
>    b:UNTRACED REF pthread_t;
>  END;
>
>
>  PROCEDURE CreateT():T=
>  VAR
>    t := NEW(T);
>  BEGIN
>    t.b := LOOPHOLE(UNTRACED REF pthread_t, NEW(UNTRACED REF ARRAY OF  
> CHAR, Upthread.pthread_t_size));
>    (* Though I really wanted t.b :=  
> RTAllocator.MallocZeroed(Upthread.pthread_t_size); *)
>    Upthread.init_or_whatever(t.b^);
>    RETURN t;
>  END;
>
>  PROCEDURE DisposeT(t:T)=
>  BEGIN
>    IF t = NIL THEN RETURN END;
>    Upthread.pthread_cleanup_or_whatever(t.b^);
>    DISPOSE(t.b);
>    DISPOSE(t);
>  END;
>
>
>  However that incurs an extra heap allocation, which is not great.
>  In at least one place, the pointer-indirection-and-heap-allocation  
> is already there
>  so this isn't a deoptimization. However "reoptimizing" it might be  
> nice.
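The same shape written in C, as a rough sketch: the size is exported as a runtime constant and the opaque object gets its own heap block. Upthread_pthread_mutex_t_size is a hypothetical name standing in for Upthread.pthread_t_size, and a mutex is used instead of pthread_t only because it has a real init/destroy pair.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical exported constant: the only thing the client knows
   about the opaque type is its size, read at runtime. */
const size_t Upthread_pthread_mutex_t_size = sizeof(pthread_mutex_t);

typedef struct {
    int   a;
    void *b;   /* really a pthread_mutex_t*, but the client need not know */
} T;

T *CreateT(void)
{
    T *t = calloc(1, sizeof *t);
    if (t == NULL) return NULL;

    /* This is the extra heap allocation discussed above. */
    t->b = calloc(1, Upthread_pthread_mutex_t_size);
    if (t->b == NULL) { free(t); return NULL; }

    if (pthread_mutex_init(t->b, NULL) != 0) {
        free(t->b);
        free(t);
        return NULL;
    }
    return t;
}

void DisposeT(T *t)
{
    if (t == NULL) return;
    pthread_mutex_destroy(t->b);
    free(t->b);
    free(t);
}
```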
>
>
>   What I would prefer is a pattern I often use in C -- merging  
> allocations, something like,
>   /assuming/ t is untraced, which I grant it might not be.
>
>
>   And ensuring that BYTESIZE(T) is properly aligned:
>
>
>   PROCEDURE CreateT():UNTRACED REF T=
>   VAR
>     p : ADDRESS;
>     t : UNTRACED REF T;
>   BEGIN
>     (* Again I would prefer RTAllocator.MallocZeroed *)
>     p := NEW(UNTRACED REF ARRAY OF CHAR, Upthread.pthread_t_size +  
> BYTESIZE(T));
>     t := LOOPHOLE(UNTRACED REF T, p);
>     t.b := LOOPHOLE(UNTRACED REF Upthread.pthread_t, p + BYTESIZE(T));
>     Upthread.init_or_whatever(t.b^);
>     RETURN t;
>   END;
>
>
>  That is -- opaque types, size not known at compile-time, but size  
> known at runtime, and
>  do not incur an extra heap allocation for lack of knowing sizes at  
> compile-time.
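For comparison, a sketch of the merged-allocation pattern in C itself, with the alignment step made explicit. It assumes C11 (for max_align_t and _Alignof) and rounds the header up to the strictest fundamental alignment, since the real alignment of the opaque type is taken to be unknown to the client. round_up is a hypothetical helper, and a mutex again stands in for the opaque type.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int   a;
    void *b;   /* points into the same block, just past the header */
} T;

/* Round n up to a multiple of align; align must be a power of two. */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

T *CreateT(void)
{
    /* Pad the header so the payload starts at a maximally aligned offset. */
    size_t header = round_up(sizeof(T), _Alignof(max_align_t));
    char *p = calloc(1, header + sizeof(pthread_mutex_t));
    T *t;

    if (p == NULL) return NULL;
    t = (T *)p;
    t->b = p + header;
    if (pthread_mutex_init(t->b, NULL) != 0) { free(p); return NULL; }
    return t;
}

void DisposeT(T *t)
{
    if (t == NULL) return;
    pthread_mutex_destroy(t->b);
    free(t);   /* one free releases header and payload together */
}
```

The block is zeroed by calloc, which plays the role of the MallocZeroed wished for above. In a version where the size is only known at runtime, sizeof(pthread_mutex_t) would be replaced by the exported size constant.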
>
>
>  For the statically allocated variables I think there is no  
> controversy.
>    There might be a tiny bit of overhead in the use, but it'd be very  
> small, and possibly
>    even removable in the future. I'd rather avoid the variables, as  
> all writable
>    data is to be avoided. Read only pages are better and all that,  
> but ok..
>
>
>  However the value is mainly realized only if statically and  
> dynamically allocated variables are handled.
>
>  The result of this would be further reduction in platform- 
> specificity when cloning
>  C headers into Modula-3 interfaces. i.e. less work to bring up new  
> platforms.
>
>
>  - Jay
>
>
> ----------------------------------------
>> From: hosking at cs.purdue.edu
>> To: jay.krell at cornell.edu
>> Date: Wed, 4 Feb 2009 09:54:01 +1100
>> CC: m3devel at elegosoft.com
>> Subject: Re: [M3devel] further reducing cloned headers wrt pthread?
>>
>> I suggest you come up with a proposal for us to look over before you
>> change the code base for this.
>>
>> On 4 Feb 2009, at 09:05, Jay wrote:
>>
>>>
>>>> Hmm, yes, you are right that there is a possible alignment issue. I
>>>> am used to pthread_mutex_t being a simple reference. But surely  
>>>> in C
>>>> the type of the pthread_mutex_t struct would have appropriate
>>>> alignment padding anyway so as to allow allocation using
>>>> malloc(sizeof(pthread_mutex_t))? So, it all should just work, right?
>>>
>>>
>>> I think "the other way around" and same conclusion.
>>> malloc should return something "maximally aligned" so that
>>>
>>> pthread_mutex_t* x = (pthread_mutex_t*)
>>> malloc(sizeof(pthread_mutex_t));
>>>
>>>
>>> works. pthread_mutex_t doesn't need the padding, malloc does, so to
>>> speak.
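A small C sketch of that claim: a block returned by malloc is aligned suitably for any object that fits in it, so a pthread_mutex_t can be initialized in place. is_aligned and new_mutex are hypothetical helper names, and _Alignof assumes a C11 compiler.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if p is a multiple of the given alignment, else 0. */
int is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}

/* Allocate and initialize a mutex on the heap, as in the malloc call above. */
pthread_mutex_t *new_mutex(void)
{
    pthread_mutex_t *x = malloc(sizeof(pthread_mutex_t));
    if (x != NULL && pthread_mutex_init(x, NULL) != 0) {
        free(x);
        x = NULL;
    }
    return x;
}
```

Checking is_aligned(x, _Alignof(pthread_mutex_t)) on the result is a cheap sanity test when bringing up a new platform.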
>>>
>>>
>>> Just as long as we don't have
>>>
>>>
>>> TYPE Foo = RECORD
>>> a: pthread_mutex_t;
>>> b: pthread_mutex_t;
>>> c: pthread_t;
>>> d: pthread_t;
>>> e: pthread_cond_t;
>>> f: pthread_cond_t;
>>> END;
>>>
>>>
>>> and such, ok.
>>>
>>>
>>> malloc on NT returns something with 2 * sizeof(void*) alignment.
>>> I think on Win9x only 4-byte alignment, thus there is _aligned_malloc for
>>> dealing with SSE stuff.
>>> Something like that.
>>>
>>>
>>> I didn't realize untraced allocations were basically just malloc but
>>> indeed they are.
>>>
>>>
>>> I'm still mulling over the possible deoptimizations here.
>>> I'm reluctant to increase heap allocations.
>>>
>>>
>>>
>>> - Jay
>>