[M3devel] TEXT & etc.

Rodney M. Bates rodney.bates at wichita.edu
Fri Jan 2 18:29:26 CET 2009


No, I have not committed anything yet.  I have it temporarily coded so the
the operations can be switched between the old and new versions by setting a
(gasp!) global variable.  I plan to write a test program that generates many
(10s of thousands) of random operations on texts, does them using old and
new algorithms, and compares results.  I have used this testing approach on
several projects in the past, with very good success.

My preferred development style is to get a group of related changes complete
and tested before committing anything.  For one thing, that saves me from
having to retest after every little change.

Just since my previous post, I have already had to do some rethinking about
how much and what kind of rebalancing to do.

Olaf Wagner wrote:
> Quoting "Rodney M. Bates" <rodney.bates at wichita.edu>:
> 
>> I started working on a more elaborate package of performance improvements
>> to TEXT operations.  They are algorithm-only (no changes to 
>> declarations or
>> invariants of data structure), thus no effect on pickles, stubs, etc., as
>> is your change.
>>
>> I am flattening concatenations only when the result has finite bounded
>> length, keeping Text.Cat O(1), although with considerably larger constant
>> factor.  I am also doing two-level rebalancing of trees, also O(1).
>> Without knowing depths (which would require either an O(log N) algorithm
>> to compute then or caching them, requiring data structure changes), I am
>> using lengths as a crude estimate of depth.
>>
>> Meanwhile I am recoding get_chars/get_wide_chars to recurse on only
>> the shorter side and iterate on the other.  This, in combination with
>> better balance, should cut down greatly on stack depth and execution
>> time.
>>
>> I am also avoiding cases where a number of other operations in the 
>> current
>> implementation unnecessarily convert concatenations containing only 8-bit
>> representations to 16, which requires a character-by-character loop
>> to expand them.
>>
>> I also have found a couple of bugs.
> 
> All this sounds good. Have you already committed some of your work?
> (I'm not following m3commit closely these days.)
> Have you also some test cases that will validate the TEXT behaviour
> for concatenations etc.?
> 
> Olaf
> 
>



More information about the M3devel mailing list