[M3devel] Performance issues with CM3
Tony Hosking
hosking at cs.purdue.edu
Mon Apr 27 02:26:06 CEST 2009
>>> Tony, all right, but the code is still an abomination! The program
> only gets loaded once (even if dynamically). The TYPECASE might
> be executed millions, billions, trillions of times... if you
> execute this code a billion times, those IF statements are only
> "effective" once and just return the same value 999,999,999 times.
> Surely this information can be pre-computed at load time and stored
> somewhere so the IFs don't have to be re-interpreted?
I haven't looked closely at this code but it's been there since the
Critical Mass days. The dynamic loading I refer to is for explicit
opening and linking of .so files -- not the usual dynamic linking that
takes place with .so files resolved at link time. Anyway, perhaps the
code can be cleaned up.
> I don't know what the best solution is here. Is it really impossible
> to construct and re-construct the data structures that PM3 uses,
> or something equivalent?
I don't know. Possibly. Again, I don't know enough about this corner
of the world to answer all of the following questions:
> For instance, Cardelli's method of keeping the supertypes of a given
> type in an array would work fine with dynamic loading. It's almost
> as fast as the PM3 method but takes a bit more memory. One could
> limit these arrays to a certain size and then skip to the next
> array, to make a hybrid between what we have now in CM3 and a full
> array for every type.
>
> Do you think my hash table approach is safe? It's not nearly as
> elegant as the Cardelli method, but it works really well for a quick
> and dirty. It's ugly, but it's very fast. Or can the arrays used
> by the TYPECASEs move during program execution?
>
> In any case, with the standard CM3 implementation, my program spends
> more time in TYPECASE than the whole thing does under PM3.
>
> Do we have any code in CM3 that loads new types dynamically? I can
> see loading code dynamically, but types...? I'm very curious, is
> there an interface for it that "mortals" can use? If we have the
> mechanism, might as well put it to work...
AFAIK you need to invoke RTLinker.AddUnit after using the native
dlopen/dlsym facilities to get the symbol corresponding to the unit
being loaded. I've never tried using it but I did have some plans to
do so back in my persistence days (when retrieving an instance of an
unknown subtype we'd need to also dynamically load its defining module).
> Another thing that bothers me a bit is that PushEFrame (sp?) seems to
> take a lot of cycles. Is the use of this routine due to the absence
> of a "stack walker"? I thought FreeBSD had a working stack walker
> in the Modula-3 runtime...?
The stack walker is not usable for exception handling, only for
printing a stack dump (IIRC). Same was true for PM3. We should do
something about using the gcc-backend native stack walking support.
Most of the overhead for PushEFrame (as compared to user-threaded PM3)
is obtaining the thread-local stack. For user-threaded PM3 it was a
compiled in global variable (which obviously breaks for native threads).
> The thing about PushEFrame is that it makes the programmer want to
> avoid procedure calls, which is a real pity. Sussman and Steele's
> railings against the inefficient procedure calls in PL/I circa 1975
> are legendary. (It's true, procedural abstraction is really
> wonderful.)
I agree. I've maintained the SOLgnu stack walking code for years so
that I don't have to pay the overhead of the explicit exception stack.
>
>
> Mika
>
> P.S. My hash table code re-attached in case someone wants to look at
> it
> again (instrumenting code commented out):
>
> TYPE
> TypeCaseResult = RECORD
> x : ADDRESS;
> tc : Typecode;
> arm : INTEGER;
> END;
>
> CONST
> TCCachePow = 6;
> TCCacheSize = Word.Shift(1,TCCachePow);
> TCMask = TCCacheSize-1;
>
> VAR TCCache := ARRAY [0..TCCacheSize-1] OF TypeCaseResult {
> TypeCaseResult { LOOPHOLE(0,ADDRESS), 0, -1 } ,
> ..
> };
>
> (*
> VAR tcScans := 0; tcHits := 0;
> *)
>
> PROCEDURE ScanTypecase (ref: REFANY;
> x: ADDRESS(*ARRAY [0..] OF Cell*)): INTEGER =
> VAR p: UNTRACED REF TypecaseCell; i: INTEGER; tc, xc: Typecode;
> BEGIN
> tc := TYPECODE (ref);
> IF (ref = NIL) THEN RETURN 0; END;
>
> WITH hash = Word.And(Word.Times(tc,
>
> Word.RightShift(LOOPHOLE(x,Word.T),2)),
> TCMask),
> entry = TCCache[hash] DO
> (*INC(tcScans);*)
> IF entry.x = x AND entry.tc = tc THEN
> (*INC(tcHits);*)
> RETURN entry.arm
> END;
>
> p := x; i := 0;
> LOOP
> IF (p.uid = 0) THEN entry.x := x; entry.tc := tc;
> entry.arm := i; RETURN i; END;
> IF (p.defn = NIL) THEN
> p.defn := FindType (p.uid);
> IF (p.defn = NIL) THEN
> Fail (RTE.MissingType, RTModule.FromDataAddress(x),
> LOOPHOLE (p.uid, ADDRESS), NIL);
> END;
> END;
> xc := LOOPHOLE (p.defn, RT0.TypeDefn).typecode;
> IF (tc = xc) OR IsSubtype (tc, xc) THEN entry.x := x;
> entry.tc := tc; entry.arm := i; RETURN i; END;
> INC (p, ADRSIZE (p^)); INC (i);
> END;
> END;
> END ScanTypecase;
>
More information about the M3devel
mailing list