[M3announce] New TEXT algorithms checked in

Olaf Wagner wagner at elegosoft.com
Fri May 24 11:42:38 CEST 2013


On Thu, 23 May 2013 16:34:02 -0500
"Rodney M. Bates" <rodney_bates at lcwb.coop> wrote:

> I have checked in changes to the Text implementation in the head.
> These make no change to the cm3 data structure or its invariants, only
> different choices of alternate representations for a given abstract
> character string.  The major changes are in Cat, which, while O(1),
> created representations inefficient to access.  This was particularly
> bad (O(n)) after building a TEXT value by a linear series of concatenations,
> either left-to-right or right-to-left.  These were also very extravagant in
> heap space usage.
> 
> To get the changes built and running on your machine, just build and ship
> m3core, using cm3 or scripts/do-cm3-min.sh.
> 
> As you might expect with any purely functional style abstraction, there
> are gains and losses in space and time performance, depending greatly on
> the usage pattern.  Overall, there are mostly small to large net gains
> on balanced mixtures of creating and accessing strings, both in time
> and space.  When strings are built linearly, net speed gains around 4-to-1
> are common.
> 
> The new algorithms are extensively tested on LINUXLIBC6 and AMD64_LINUX.
> Beyond native word size, there is little reason to expect platform-dependent
> bugs.  Of course, anything can happen.  If you know or suspect there are bugs,
> you can disable the new algorithms by importing and setting TextClass.Old:=TRUE.
> This will use all the old algorithms instead.  This will suffer small,
> constant-factor time losses relative to the unmodified implementation because
> of runtime testing of TextClass.Old and also a good bit of gathering of raw
> performance statistics.
> 
> You can turn TextClass.Old on and off at will, whenever no thread is executing
> a Text operation.  The results of old and new algorithms are fully interchangeable
> as operands of any Text operation.
> 
> You can use other global variables in TextClass to tune the algorithms.
> Setting TextClass.Flatten:=FALSE disables partial flattening of concatenation
> trees.  Otherwise, TextClass.MaxFlat8, and TextClass.MaxFlatWide set the
> maximum lengths of internal open arrays of CHAR and WIDECHAR, respectively.
> These latter are limits on how much flattening is done.
> 
> You can approximately simulate the behavior of older pm3 Text implementation
> by setting these to LAST(INTEGER).  This will always fully flatten every
> concatenated string.  Differences from pm3 are that the cm3 data structures
> require that a separate heap object be in front of the open array, and that
> this will handle WIDECHAR elements in a TEXT.
> 
> There is also an extensive test program in m3-libs/m3core/tests/newtext/src.
> build it and run with -h to see its options.  It does large numbers of
> random string operations, running the old and new algorithms side-by-side
> and comparing the abstract values of their results  It also reports a lot of
> statistics on time and space usage.  Retained heap storage numbers are now
> running high, apparently due to the inability to force garbage collection
> to complete.

Great!

Thanks for that commit,

Olaf
-- 
Olaf Wagner -- elego Software Solutions GmbH -- http://www.elegosoft.com
               Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23 45 86 96  mobile: +49 177 2345 869  fax: +49 30 23 45 86 95
Geschäftsführer: Michael Diers, Olaf Wagner | Sitz: Berlin
Handelregister: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194




More information about the M3announce mailing list