[M3devel] TEXT & etc.

Olaf Wagner wagner at elegosoft.com
Fri Jan 2 11:42:48 CET 2009


Quoting "Rodney M. Bates" <rodney.bates at wichita.edu>:

> I started working on a more elaborate package of performance improvements
> to TEXT operations.  They are algorithm-only (no changes to declarations or
> invariants of data structure), thus no effect on pickles, stubs, etc., as
> is your change.
>
> I am flattening concatenations only when the result has finite bounded
> length, keeping Text.Cat O(1), although with considerably larger constant
> factor.  I am also doing two-level rebalancing of trees, also O(1).
> Without knowing depths (which would require either an O(log N) algorithm
> to compute then or caching them, requiring data structure changes), I am
> using lengths as a crude estimate of depth.
>
> Meanwhile I am recoding get_chars/get_wide_chars to recurse on only
> the shorter side and iterate on the other.  This, in combination with
> better balance, should cut down greatly on stack depth and execution
> time.
>
> I am also avoiding cases where a number of other operations in the current
> implementation unnecessarily convert concatenations containing only 8-bit
> representations to 16, which requires a character-by-character loop
> to expand them.
>
> I also have found a couple of bugs.

All this sounds good. Have you already committed some of your work?
(I'm not following m3commit closely these days.)
Have you also some test cases that will validate the TEXT behaviour
for concatenations etc.?

Olaf

> Hopefully, these will help.
>
> Mika Nystrom wrote:
>> Hello everyone,
>>
>> I mentioned a week or two ago that I had run into horrific performance
>> issues with CM3's TEXTs.  It wasn't my intention to start a debate
>> about Unicode and Mahjong characters, just to point out that there's
>> a serious performance problem with CM3 in this area.  I run into
>> performance problems with CM3 from time to time because I am trying
>> to maintain a fairly large amount of software written in Modula-3
>> and it appears to me that the SRC/PM3 compilers often pay better
>> attention to performance issues than the newer CM3 compiler.
>> Generally speaking I think Modula-3 needs to avoid getting into the
>> Java situation: Java is squeezed between "fast enough" Python and
>> "much more efficient" C++... Modula-3 I see as the only hope for
>> getting the performance of C++ with the safety of Java.  The problems
>> I have seen with CM3 are in the following areas:
>>
>> 1. Thread mutex acquisition involving kernel calls (pthreads
>>   implementation)
>>
>> 2. ISTYPE and TYPECASE (appear much faster on PM3, don't know why,
>>   could just be a profiling issue).  In any case TYPECASE is very slow
>>   because of having to search the type tree.  This could be improved
>>   but I can't work on it because I always run into some kind of fatal
>>   problem trying to bootstrap the compiler.
>>
>> 3. The TEXT issue.  I mentioned performance problems before.  It   
>> turns   out that it was showing up in another area.  I upgraded my   
>> PowerBook
>>   a while back (to OS X 10.4) and of course everything stopped working,
>>   including at least half my Modula-3 programs.  I thought it was a  
>>    versioning problem with C libraries I was importing, but it   
>> turned out
>>   that in at least one case I was running out of stack space because of
>>   the calls to get_char on concatenated TEXTs.  (Not very long ones,
>>   at that.)
>>
>> To get to the point.  What follows is my new version of TextCat.m3,
>> which I believe works, but I can't really say because I'm perennially
>> incapable of bootstrapping CM3.  The basic idea is that you can   
>> have *one* TextCat.T, at the "root" of your TEXT.  It will flatten
>> anything else.  I don't see a simple way of modifying the CM3 TEXT   
>> implementation to do much better than this.  Of course throwing out  
>>  TextCat.T completely
>> is an option.
>>
>> An advantage of what I've done is that I haven't changed any interfaces
>> at all.  So I don't get problems with version stamps, etc.
>>
>>     Mika
>>
>>



-- 
Olaf Wagner -- elego Software Solutions GmbH
                Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23 45 86 96  mobile: +49 177 2345 869  fax: +49 30 23 45 86 95
    http://www.elegosoft.com | Geschäftsführer: Olaf Wagner | Sitz: Berlin
Handelregister: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194




More information about the M3devel mailing list