From rodney_bates at lcwb.coop  Thu May 23 23:34:02 2013
From: rodney_bates at lcwb.coop (Rodney M. Bates)
Date: Thu, 23 May 2013 16:34:02 -0500
Subject: [M3announce] New TEXT algorithms checked in
Message-ID: <519E8B4A.8030107@lcwb.coop>

I have checked in changes to the Text implementation in the head.
These make no change to the cm3 data structure or its invariants, only
different choices of alternate representations for a given abstract
character string.  The major changes are in Cat, which, while O(1),
created representations inefficient to access.  This was particularly
bad (O(n)) after building a TEXT value by a linear series of concatenations,
either left-to-right or right-to-left.  These were also very extravagant in
heap space usage.

To get the changes built and running on your machine, just build and ship
m3core, using cm3 or scripts/do-cm3-min.sh.

As you might expect with any purely functional style abstraction, there
are gains and losses in space and time performance, depending greatly on
the usage pattern.  Overall, there are mostly small to large net gains
on balanced mixtures of creating and accessing strings, both in time
and space.  When strings are built linearly, net speed gains around 4-to-1
are common.

The new algorithms are extensively tested on LINUXLIBC6 and AMD64_LINUX.
Beyond native word size, there is little reason to expect platform-dependent
bugs.  Of course, anything can happen.  If you know or suspect there are bugs,
you can disable the new algorithms by importing and setting TextClass.Old:=TRUE.
This will use all the old algorithms instead.  This will suffer small,
constant-factor time losses relative to the unmodified implementation because
of runtime testing of TextClass.Old and also a good bit of gathering of raw
performance statistics.

You can turn TextClass.Old on and off at will, whenever no thread is executing
a Text operation.  The results of old and new algorithms are fully interchangeable
as operands of any Text operation.

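A minimal Python sketch, not the m3core code, of the access problem described
above: an O(1) concatenation that only allocates a tree node yields a tree
whose depth grows linearly when a string is built one piece at a time, so
reaching a character can take O(n) steps.  Everything named here (Flat, Concat,
cat, max_flat, build) is hypothetical and chosen for illustration.  The
max_flat cap is an assumed, simplified stand-in for the idea of copying short
pieces into one flat array; it handles only left-to-right appends and is not
the algorithm that was checked in.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Flat:
    """Leaf that stores its characters contiguously."""
    chars: str


@dataclass(frozen=True)
class Concat:
    """Concatenation node; the cached length lets indexing pick a side."""
    left: "Node"
    right: "Node"
    length: int


Node = Union[Flat, Concat]


def length(t: Node) -> int:
    return len(t.chars) if isinstance(t, Flat) else t.length


def cat(a: Node, b: Node, max_flat: int = 0) -> Node:
    """Concatenate a and b, copying into a flat leaf when the result is short.

    max_flat is a hypothetical cap on leaf length; 0 disables all copying.
    """
    total = length(a) + length(b)
    if isinstance(a, Flat) and isinstance(b, Flat) and total <= max_flat:
        return Flat(a.chars + b.chars)
    # Appending a short leaf: merge it into the existing right-hand leaf
    # instead of adding another level to the tree.
    if (isinstance(a, Concat) and isinstance(a.right, Flat)
            and isinstance(b, Flat)
            and len(a.right.chars) + len(b.chars) <= max_flat):
        return Concat(a.left, Flat(a.right.chars + b.chars), total)
    return Concat(a, b, total)


def char_at(t: Node, i: int) -> str:
    """Walk from the root to the leaf holding index i (cost = tree depth)."""
    while isinstance(t, Concat):
        left_len = length(t.left)
        if i < left_len:
            t = t.left
        else:
            i -= left_len
            t = t.right
    return t.chars[i]


def depth(t: Node) -> int:
    """Number of nodes on the longest root-to-leaf path (iterative)."""
    deepest = 0
    stack = [(t, 1)]
    while stack:
        node, level = stack.pop()
        deepest = max(deepest, level)
        if isinstance(node, Concat):
            stack.append((node.left, level + 1))
            stack.append((node.right, level + 1))
    return deepest


def to_str(t: Node) -> str:
    """Flatten the whole tree into one Python string (iterative)."""
    parts = []
    stack = [t]
    while stack:
        node = stack.pop()
        if isinstance(node, Flat):
            parts.append(node.chars)
        else:
            stack.append(node.right)  # pushed first so left is visited first
            stack.append(node.left)
    return "".join(parts)


def build(n: int, max_flat: int) -> Node:
    """Build an n-character string by left-to-right single-char appends."""
    t: Node = Flat("")
    for k in range(n):
        t = cat(t, Flat(chr(ord("a") + k % 26)), max_flat)
    return t


if __name__ == "__main__":
    print("depth without copying:", depth(build(1000, 0)))
    print("depth with max_flat=64:", depth(build(1000, 64)))
```

In this sketch both trees represent the same abstract string; only the
representation differs, which is the sense in which the message says the
invariants are unchanged.  The cap reduces the depth by roughly a factor of
max_flat but does not rebalance the tree.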
You can use other global variables in TextClass to tune the algorithms.
Setting TextClass.Flatten:=FALSE disables partial flattening of concatenation
trees.  Otherwise, TextClass.MaxFlat8 and TextClass.MaxFlatWide set the
maximum lengths of internal open arrays of CHAR and WIDECHAR, respectively.
These latter are limits on how much flattening is done.

You can approximately simulate the behavior of the older pm3 Text implementation
by setting these to LAST(INTEGER).  This will always fully flatten every
concatenated string.  Differences from pm3 are that the cm3 data structures
require that a separate heap object be in front of the open array, and that
this will handle WIDECHAR elements in a TEXT.

There is also an extensive test program in m3-libs/m3core/tests/newtext/src.
Build it and run with -h to see its options.  It does large numbers of
random string operations, running the old and new algorithms side-by-side
and comparing the abstract values of their results.  It also reports a lot of
statistics on time and space usage.  Retained heap storage numbers are now
running high, apparently due to the inability to force garbage collection
to complete.

From rodney_bates at lcwb.coop  Fri May 24 02:24:58 2013
From: rodney_bates at lcwb.coop (Rodney M. Bates)
Date: Thu, 23 May 2013 19:24:58 -0500
Subject: [M3announce] New VarArray package
Message-ID: <519EB35A.5000208@lcwb.coop>

There is now a new generic package named VarArray that provides heavyweight
but very flexible self-expanding arrays.  They can be indexed by any ordinal
type and have any element type except open arrays.  As side-effects of various
operations, they keep track of the range of subscripts that have been stored
into.  Occasionally, they reallocate the underlying array provided by the
implementation, also as a side-effect.  Subscript values can have ORD values
in the entire range of INTEGER, although obviously not this many elements can
be occupied at one time.
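A rough Python sketch of the behavior just described, under stated
assumptions: the VarArray interface itself is not shown in this message, so
the class and method names below (VarArraySketch, assign, fetch, touched) are
hypothetical and do not reflect the package's real API.  Python integers are
unbounded, so the INTEGER range limit is not modeled, and the sketch grows
storage exactly to fit rather than following any particular expansion policy.

```python
class VarArraySketch:
    """Self-expanding array over arbitrary integer subscripts (illustrative)."""

    def __init__(self, init_value=None):
        self._init = init_value   # value reported for never-stored subscripts
        self._base = 0            # subscript held in self._data[0]
        self._data = []           # currently allocated backing storage
        self.touched = None       # (lo, hi) of subscripts stored into, or None

    def _ensure(self, ss):
        """Reallocate, if needed, so that subscript ss has backing storage."""
        if not self._data:
            self._base = ss
            self._data = [self._init]
            return
        cur_lo = self._base
        cur_hi = self._base + len(self._data) - 1
        if cur_lo <= ss <= cur_hi:
            return
        new_lo, new_hi = min(ss, cur_lo), max(ss, cur_hi)
        new_data = [self._init] * (new_hi - new_lo + 1)
        offset = cur_lo - new_lo
        new_data[offset:offset + len(self._data)] = self._data
        self._base, self._data = new_lo, new_data

    def assign(self, ss, value):
        """Store value at subscript ss, expanding storage as a side effect."""
        self._ensure(ss)
        self._data[ss - self._base] = value
        if self.touched is None:
            self.touched = (ss, ss)
        else:
            self.touched = (min(self.touched[0], ss), max(self.touched[1], ss))

    def fetch(self, ss):
        """Return the stored value, or the initial value if none was stored."""
        i = ss - self._base
        if 0 <= i < len(self._data):
            return self._data[i]
        return self._init


if __name__ == "__main__":
    v = VarArraySketch(init_value=0)
    v.assign(-5, 7)
    v.assign(10, 9)
    print(v.fetch(-5), v.fetch(10), v.fetch(0), v.touched)
```

Fetching outside the allocated range does not allocate in this sketch; only
stores expand the array and widen the touched range.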
There are also some ways of exerting manual control over space allocation
and a less safe but low-level, more efficient method of accessing.  A
companion generic package Ranges is used by VarArray, but could possibly
have other uses on its own.  There is a test program for exercising them.
All is located in m3-libs/vararray.

From wagner at elegosoft.com  Fri May 24 11:42:38 2013
From: wagner at elegosoft.com (Olaf Wagner)
Date: Fri, 24 May 2013 11:42:38 +0200
Subject: [M3announce] New TEXT algorithms checked in
In-Reply-To: <519E8B4A.8030107@lcwb.coop>
References: <519E8B4A.8030107@lcwb.coop>
Message-ID: <20130524114238.031ceea23aff4bd74f34c5ab@elegosoft.com>

On Thu, 23 May 2013 16:34:02 -0500
"Rodney M. Bates" wrote:

> I have checked in changes to the Text implementation in the head.
> These make no change to the cm3 data structure or its invariants, only
> different choices of alternate representations for a given abstract
> character string.  The major changes are in Cat, which, while O(1),
> created representations inefficient to access.  This was particularly
> bad (O(n)) after building a TEXT value by a linear series of concatenations,
> either left-to-right or right-to-left.  These were also very extravagant in
> heap space usage.
>
> To get the changes built and running on your machine, just build and ship
> m3core, using cm3 or scripts/do-cm3-min.sh.
>
> As you might expect with any purely functional style abstraction, there
> are gains and losses in space and time performance, depending greatly on
> the usage pattern.  Overall, there are mostly small to large net gains
> on balanced mixtures of creating and accessing strings, both in time
> and space.  When strings are built linearly, net speed gains around 4-to-1
> are common.
>
> The new algorithms are extensively tested on LINUXLIBC6 and AMD64_LINUX.
> Beyond native word size, there is little reason to expect platform-dependent
> bugs.  Of course, anything can happen.
> If you know or suspect there are bugs,
> you can disable the new algorithms by importing and setting TextClass.Old:=TRUE.
> This will use all the old algorithms instead.  This will suffer small,
> constant-factor time losses relative to the unmodified implementation because
> of runtime testing of TextClass.Old and also a good bit of gathering of raw
> performance statistics.
>
> You can turn TextClass.Old on and off at will, whenever no thread is executing
> a Text operation.  The results of old and new algorithms are fully interchangeable
> as operands of any Text operation.
>
> You can use other global variables in TextClass to tune the algorithms.
> Setting TextClass.Flatten:=FALSE disables partial flattening of concatenation
> trees.  Otherwise, TextClass.MaxFlat8, and TextClass.MaxFlatWide set the
> maximum lengths of internal open arrays of CHAR and WIDECHAR, respectively.
> These latter are limits on how much flattening is done.
>
> You can approximately simulate the behavior of older pm3 Text implementation
> by setting these to LAST(INTEGER).  This will always fully flatten every
> concatenated string.  Differences from pm3 are that the cm3 data structures
> require that a separate heap object be in front of the open array, and that
> this will handle WIDECHAR elements in a TEXT.
>
> There is also an extensive test program in m3-libs/m3core/tests/newtext/src.
> build it and run with -h to see its options.  It does large numbers of
> random string operations, running the old and new algorithms side-by-side
> and comparing the abstract values of their results It also reports a lot of
> statistics on time and space usage.  Retained heap storage numbers are now
> running high, apparently due to the inability to force garbage collection
> to complete.

Great!
Thanks for that commit,

Olaf
--
Olaf Wagner -- elego Software Solutions GmbH -- http://www.elegosoft.com
Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
phone: +49 30 23 45 86 96  mobile: +49 177 2345 869  fax: +49 30 23 45 86 95
Geschäftsführer: Michael Diers, Olaf Wagner | Sitz: Berlin
Handelregister: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr: DE163214194