[M3devel] further reducing cloned headers wrt pthread?

Wed Feb 4 01:06:24 CET 2009

 There are a few possibilities: 

 Roughly: 

 Where there is 

 INTERFACE Upthread; 

  TYPE  
   pthread_t = ... system specific ...  
   pthread_cond_t = ... system specific ...  
   pthread_mutex_t = ... system specific ...  

  PROCEDURE pthread_thread_init_or_whatever(VAR pthread_t);  
  PROCEDURE pthread_mutex_init_or_whatever(VAR pthread_mutex_t);  
  PROCEDURE pthread_cond_init_or_whatever(VAR pthread_cond_t);  

  MODULE PThread;  
  VAR 
    a: pthread_t; 
    b: pthread_cond_t; 
    c: pthread_mutex_t;  

  PROCEDURE Foo() =
  BEGIN
    Upthread.pthread_thread_init_or_whatever(a);  
    Upthread.pthread_cond_init_or_whatever(b);  
    Upthread.pthread_mutex_init_or_whatever(c);  
  END Foo;  

  change to:  

  INTERFACE Upthread;  

  TYPE  
   pthread_t = RECORD END;        (* or whatever is correct for an opaque, preferably unique, type *)
   pthread_cond_t = RECORD END;   (* ditto *)
   pthread_mutex_t = RECORD END;  (* ditto *)

  PROCEDURE pthread_thread_init_or_whatever(VAR pthread_t);  
  PROCEDURE pthread_mutex_init_or_whatever(VAR pthread_mutex_t);  
  PROCEDURE pthread_cond_init_or_whatever(VAR pthread_cond_t);  

  INTERFACE PThreadC.i3  

  PROCEDURE GetA(): UNTRACED REF Upthread.pthread_t;  
  PROCEDURE GetB(): UNTRACED REF Upthread.pthread_cond_t;  
  PROCEDURE GetC(): UNTRACED REF Upthread.pthread_mutex_t;  

 or possibly extern VAR 

  PThreadC.c  

  static pthread_t a;  /* pthread_t has no static initializer */
  static pthread_cond_t b = PTHREAD_COND_INITIALIZER;  
  static pthread_mutex_t c = PTHREAD_MUTEX_INITIALIZER;  

  pthread_t* GetA() { return &a; }  

  pthread_cond_t* GetB() { return &b; }  

  pthread_mutex_t* GetC() { return &c; }   

  MODULE PThread; 
  VAR  
    a := PThreadC.GetA();  
    b := PThreadC.GetB();  
    c := PThreadC.GetC();   

  PROCEDURE Foo() =  
  BEGIN  
    Upthread.pthread_thread_init_or_whatever(a^);  
    Upthread.pthread_cond_init_or_whatever(b^);  
    Upthread.pthread_mutex_init_or_whatever(c^);  
  END Foo;  

  or, again, possibly they are variables and it goes a little smaller/quicker:  

  FROM PThreadC IMPORT a, b, c;  

  PROCEDURE Foo() =  
  BEGIN  
    Upthread.pthread_thread_init_or_whatever(a);  
    Upthread.pthread_cond_init_or_whatever(b);  
    Upthread.pthread_mutex_init_or_whatever(c);  
  END Foo;  

  I think that is pretty cut and dried, no controversy. 
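
  A minimal C sketch of the accessor variant, for illustration only. The
  names PThreadC_GetB, PThreadC_GetC and IncrementUnderLock are made up
  here and are not existing CM3 functions. The point is that only this one
  C file sees the real layout of the pthread types; every caller just
  holds an address.

```c
#include <assert.h>
#include <pthread.h>

/* File-scope objects; only this C file needs their real size and layout. */
static pthread_cond_t b = PTHREAD_COND_INITIALIZER;
static pthread_mutex_t c = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical accessors: callers receive opaque addresses only. */
void *PThreadC_GetB(void) { return &b; }
void *PThreadC_GetC(void) { return &c; }

static int counter;

/* Example caller: uses the mutex purely through the accessor. */
int IncrementUnderLock(void)
{
    pthread_mutex_t *m = PThreadC_GetC();
    int rc = pthread_mutex_lock(m);
    assert(rc == 0);
    int value = ++counter;
    rc = pthread_mutex_unlock(m);
    assert(rc == 0);
    (void)rc;
    return value;
}
```

  The Modula-3 side would declare the accessors as external procedures
  returning an untraced reference (or ADDRESS) and would never need the
  platform-specific record definitions.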

  What is less clear is what to do with non-statically allocated variables.  

  Let's say:  

  MODULE PThread;  

  TYPE T = RECORD  
    a:int; 
    b:pthread_t; 
  END;  

  PROCEDURE CreateT():T=  
  VAR  
    t := NEW(T);  
  BEGIN  
    Upthread.init_or_whatever(t.b);  
    RETURN t;  
  END;  

  PROCEDURE DisposeT(t:T)=  
  BEGIN  
    IF t = NIL THEN RETURN END;  
    Upthread.pthread_cleanup_or_whatever(t.b);  
    DISPOSE(t);  
  END;  

  The desire is something that does not know the size of pthread_t, something like:  

  TYPE T = RECORD  
    a:int;
    b:UNTRACED REF pthread_t;  
  END;  

  PROCEDURE CreateT():T=  
  VAR  
    t := NEW(T);  
  BEGIN  
    t.b := LOOPHOLE(UNTRACED REF pthread_t, NEW(UNTRACED REF ARRAY OF CHAR, Upthread.pthread_t_size));  
    (* Though I really wanted t.b := RTAllocator.MallocZeroed(Upthread.pthread_t_size); *)  
    Upthread.init_or_whatever(t.b^);  
    RETURN t;  
  END;  

  PROCEDURE DisposeT(t:T)=  
  BEGIN  
    IF t = NIL THEN RETURN END;  
    Upthread.pthread_cleanup_or_whatever(t.b^);  
    DISPOSE(t.b);  
    DISPOSE(t);  
  END;  

  However that incurs an extra heap allocation, which is not great.   
  In at least one place, the pointer-indirection-and-heap-allocation is already there  
  so this isn't a deoptimization. However "reoptimizing" it might be nice. 
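
  For comparison, the two-allocation version could look roughly like this
  in C. This is a sketch under assumptions: Upthread_pthread_mutex_t_size
  is a made-up stand-in for the runtime size query, and a mutex is used in
  place of pthread_t because it has a plain init/destroy pair.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical export: the only code that needs the real type definition. */
size_t Upthread_pthread_mutex_t_size(void) { return sizeof(pthread_mutex_t); }

/* Stand-in for the Modula-3 record T; it holds only a pointer,
   so its layout does not depend on the size of the mutex. */
typedef struct {
    int a;
    void *b;
} T;

T *CreateT(void)
{
    T *t = calloc(1, sizeof *t);
    if (t == NULL) return NULL;
    /* Second heap allocation, sized at runtime. */
    t->b = calloc(1, Upthread_pthread_mutex_t_size());
    if (t->b == NULL || pthread_mutex_init(t->b, NULL) != 0) {
        free(t->b);
        free(t);
        return NULL;
    }
    return t;
}

void DisposeT(T *t)
{
    if (t == NULL) return;
    int rc = pthread_mutex_destroy(t->b);
    assert(rc == 0);   /* destroying an unlocked mutex should succeed */
    (void)rc;
    free(t->b);
    free(t);
}
```

  Each object costs two calls to the allocator and two to free, which is
  the overhead in question.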

   What I would prefer is a pattern I often use in C -- merging allocations, something like,  
   /assuming/ t is untraced, which I grant it might not be.   

   And ensuring that BYTESIZE(T) is properly aligned:   

   PROCEDURE CreateT():UNTRACED REF T=  
   VAR
     p : ADDRESS;  
     t : UNTRACED REF T;  
   BEGIN  
     (* Again I would prefer RTAllocator.MallocZeroed *)   
     p := NEW(UNTRACED REF ARRAY OF CHAR, Upthread.pthread_t_size + BYTESIZE(T));  
     t := LOOPHOLE(UNTRACED REF T, p);   
     t.b := LOOPHOLE(UNTRACED REF Upthread.pthread_t, p + BYTESIZE(T));  
     Upthread.init_or_whatever(t.b^);  
     RETURN t;  
   END;  

  That is -- opaque types, size not known at compile-time, but size known at runtime, and  
  do not incur an extra heap allocation for lack of knowing sizes at compile-time.  
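
  The same merged allocation in C, as a sketch only. mutex_size and
  round_up are hypothetical helpers, a mutex stands in for pthread_t, and
  the code assumes that rounding the header up to the alignment of
  max_align_t (C11) is enough for any pthread type.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for the Modula-3 record T: a fixed header plus a pointer
   to the opaque object that lives in the same heap block. */
typedef struct {
    int a;
    void *b;
} T;

/* Hypothetical runtime size query, known only to the C side. */
size_t mutex_size(void) { return sizeof(pthread_mutex_t); }

/* Round n up to a multiple of align, which must be a power of two. */
static size_t round_up(size_t n, size_t align)
{
    assert((align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

T *CreateT(void)
{
    /* Pad the header so the trailing object starts maximally aligned. */
    size_t offset = round_up(sizeof(T), _Alignof(max_align_t));
    char *p = calloc(1, offset + mutex_size());   /* one allocation */
    if (p == NULL) return NULL;
    T *t = (T *)p;
    t->b = p + offset;
    if (pthread_mutex_init(t->b, NULL) != 0) {
        free(p);
        return NULL;
    }
    return t;
}

void DisposeT(T *t)
{
    if (t == NULL) return;
    pthread_mutex_destroy(t->b);
    free(t);   /* a single free releases header and mutex together */
}
```

  The round-up step is what "ensuring that BYTESIZE(T) is properly
  aligned" amounts to: without it, the trailing object could start at an
  offset that is not suitably aligned for its type.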

  For the statically allocated variables I think there is no controversy.   
    There might be a tiny bit of overhead in the use, but it'd be very small, and possibly  
    even removable in the future. I'd rather avoid the variables, as all writable
    data is to be avoided. Read only pages are better and all that, but ok..

  However, the value is mainly realized only if both statically and dynamically allocated variables are handled.   

  The result of this would be further reduction in platform-specificity when cloning  
  C headers into Modula-3 interfaces. i.e. less work to bring up new platforms.

  - Jay  

----------------------------------------
> From: hosking at cs.purdue.edu
> To: jay.krell at cornell.edu
> Date: Wed, 4 Feb 2009 09:54:01 +1100
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] further reducing cloned headers wrt pthread?
>
> I suggest you come up with a proposal for us to look over before you
> change the code base for this.
>
> On 4 Feb 2009, at 09:05, Jay wrote:
>
>>
>>> Hmm, yes, you are right that there is a possible alignment issue. I
>>> am used to pthread_mutex_t being a simple reference. But surely in C
>>> the type of the pthread_mutex_t struct would have appropriate
>>> alignment padding anyway so as to allow allocation using
>>> malloc(sizeof(pthread_mutex_t))? So, it all should just work, right?
>>
>>
>> I think "the other way around" and same conclusion.
>> malloc should return something "maximally aligned" so that
>>
>> pthread_mutex_t* x = (pthread_mutex_t*)
>> malloc(sizeof(pthread_mutex_t));
>>
>>
>> works. pthread_mutex_t doesn't need the padding, malloc does, so to
>> speak.
>>
>>
>> Just as long as we don't have
>>
>>
>> TYPE Foo = RECORD
>> a: pthread_mutex_t;
>> b: pthread_mutex_t;
>> c: pthread_t;
>> d: pthread_t;
>> e: pthread_cond_t;
>> f: pthread_cond_t;
>> END;
>>
>>
>> and such, ok.
>>
>>
>> malloc on NT returns something with 2 * sizeof(void*) alignment.
>> I think on Win9x only 4-byte alignment, thus there is _aligned_malloc
>> for dealing with SSE stuff.
>> Something like that.
>>
>>
>> I didn't realize untraced allocations were basically just malloc but
>> indeed they are.
>>
>>
>> I'm still mulling over the possible deoptimizations here.
>> I'm reluctant to increase heap allocations.
>>
>>
>>
>> - Jay
>