[M3devel] TEXT & etc.
Rodney M. Bates
rodney.bates at wichita.edu
Thu Jan 1 20:02:17 CET 2009
I started working on a more elaborate package of performance improvements
to TEXT operations. They are algorithm-only (no changes to declarations or
invariants of data structure), thus no effect on pickles, stubs, etc., as
is your change.
I am flattening concatenations only when the result has finite bounded
length, keeping Text.Cat O(1), although with considerably larger constant
factor. I am also doing two-level rebalancing of trees, also O(1).
Without knowing depths (which would require either an O(log N) algorithm
to compute then or caching them, requiring data structure changes), I am
using lengths as a crude estimate of depth.
Meanwhile I am recoding get_chars/get_wide_chars to recurse on only
the shorter side and iterate on the other. This, in combination with
better balance, should cut down greatly on stack depth and execution
time.
I am also avoiding cases where a number of other operations in the current
implementation unnecessarily convert concatenations containing only 8-bit
representations to 16, which requires a character-by-character loop
to expand them.
I also have found a couple of bugs.
Hopefully, these will help.
Mika Nystrom wrote:
> Hello everyone,
>
> I mentioned a week or two ago that I had run into horrific performance
> issues with CM3's TEXTs. It wasn't my intention to start a debate
> about Unicode and Mahjong characters, just to point out that there's
> a serious performance problem with CM3 in this area. I run into
> performance problems with CM3 from time to time because I am trying
> to maintain a fairly large amount of software written in Modula-3
> and it appears to me that the SRC/PM3 compilers often pay better
> attention to performance issues than the newer CM3 compiler.
> Generally speaking I think Modula-3 needs to avoid getting into the
> Java situation: Java is squeezed between "fast enough" Python and
> "much more efficient" C++... Modula-3 I see as the only hope for
> getting the performance of C++ with the safety of Java. The problems
> I have seen with CM3 are in the following areas:
>
> 1. Thread mutex acquisition involving kernel calls (pthreads
> implementation)
>
> 2. ISTYPE and TYPECASE (appear much faster on PM3, don't know why,
> could just be a profiling issue). In any case TYPECASE is very slow
> because of having to search the type tree. This could be improved
> but I can't work on it because I always run into some kind of fatal
> problem trying to bootstrap the compiler.
>
> 3. The TEXT issue. I mentioned performance problems before. It turns
> out that it was showing up in another area. I upgraded my PowerBook
> a while back (to OS X 10.4) and of course everything stopped working,
> including at least half my Modula-3 programs. I thought it was a
> versioning problem with C libraries I was importing, but it turned out
> that in at least one case I was running out of stack space because of
> the calls to get_char on concatenated TEXTs. (Not very long ones,
> at that.)
>
> To get to the point. What follows is my new version of TextCat.m3,
> which I believe works, but I can't really say because I'm perennially
> incapable of bootstrapping CM3. The basic idea is that you can
> have *one* TextCat.T, at the "root" of your TEXT. It will flatten
> anything else.
>
> I don't see a simple way of modifying the CM3 TEXT implementation to
> do much better than this. Of course throwing out TextCat.T completely
> is an option.
>
> An advantage of what I've done is that I haven't changed any interfaces
> at all. So I don't get problems with version stamps, etc.
>
> Mika
>
>
More information about the M3devel
mailing list