[M3announce] New TEXT algorithms checked in
Olaf Wagner
wagner at elegosoft.com
Fri May 24 11:42:38 CEST 2013
On Thu, 23 May 2013 16:34:02 -0500
"Rodney M. Bates" <rodney_bates at lcwb.coop> wrote:
> I have checked in changes to the Text implementation in the head.
> These make no change to the cm3 data structure or its invariants, only
> different choices of alternate representations for a given abstract
> character string. The major changes are in Cat, which, while O(1),
> created representations inefficient to access. This was particularly
> bad (O(n)) after building a TEXT value by a linear series of concatenations,
> either left-to-right or right-to-left. These were also very extravagant in
> heap space usage.
>
> To get the changes built and running on your machine, just build and ship
> m3core, using cm3 or scripts/do-cm3-min.sh.
>
> As you might expect with any purely functional style abstraction, there
> are gains and losses in space and time performance, depending greatly on
> the usage pattern. Overall, there are mostly small to large net gains
> on balanced mixtures of creating and accessing strings, both in time
> and space. When strings are built linearly, net speed gains around 4-to-1
> are common.
>
> The new algorithms are extensively tested on LINUXLIBC6 and AMD64_LINUX.
> Beyond native word size, there is little reason to expect platform-dependent
> bugs. Of course, anything can happen. If you know or suspect there are bugs,
> you can disable the new algorithms by importing and setting TextClass.Old:=TRUE.
> This will use all the old algorithms instead. This will suffer small,
> constant-factor time losses relative to the unmodified implementation because
> of runtime testing of TextClass.Old and also a good bit of gathering of raw
> performance statistics.
>
> You can turn TextClass.Old on and off at will, whenever no thread is executing
> a Text operation. The results of old and new algorithms are fully interchangeable
> as operands of any Text operation.
>
> You can use other global variables in TextClass to tune the algorithms.
> Setting TextClass.Flatten:=FALSE disables partial flattening of concatenation
> trees. Otherwise, TextClass.MaxFlat8, and TextClass.MaxFlatWide set the
> maximum lengths of internal open arrays of CHAR and WIDECHAR, respectively.
> These latter are limits on how much flattening is done.
>
> You can approximately simulate the behavior of older pm3 Text implementation
> by setting these to LAST(INTEGER). This will always fully flatten every
> concatenated string. Differences from pm3 are that the cm3 data structures
> require that a separate heap object be in front of the open array, and that
> this will handle WIDECHAR elements in a TEXT.
>
> There is also an extensive test program in m3-libs/m3core/tests/newtext/src.
> build it and run with -h to see its options. It does large numbers of
> random string operations, running the old and new algorithms side-by-side
> and comparing the abstract values of their results It also reports a lot of
> statistics on time and space usage. Retained heap storage numbers are now
> running high, apparently due to the inability to force garbage collection
> to complete.
Great!
Thanks for that commit,
Olaf
--
Olaf Wagner -- elego Software Solutions GmbH -- http://www.elegosoft.com
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23 45 86 96 mobile: +49 177 2345 869 fax: +49 30 23 45 86 95
Geschäftsführer: Michael Diers, Olaf Wagner | Sitz: Berlin
Handelregister: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194
More information about the M3announce
mailing list