[M3devel] This disgusting TEXT business

Mika Nystrom mika at async.caltech.edu
Sat Dec 20 08:19:02 CET 2008


Hello Modula-3ers,

I have gone on the record before as not being very impressed by the
Critical Mass implementation of TEXT.  

If you bother to read this email to the end, I will show you a
simple program that runs 100 times slower when compiled with 
CM3 than the older PM3.

TEXT is a very convenient built-in type, especially I think because
it allows literals to be written with a lot less work than 

literal := ARRAY OF CHAR { 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '!', '\n' }

Hence a lot of programs use TEXT liberally.  

Modula-3 also has a very syntactically convenient TEXT concatenation
operator, &.

I think it was Guy Steele who observed in one of the Lambda Papers that
programmers tend to think syntactically concise notation is associated
with efficient implementation.

CM3 fails here, and sometimes horribly!

I remember about five years ago we were writing a large number of
VLSI CAD tools in Modula-3 at Caltech.  We were going to "upgrade"
to CM3---I think one of us wanted to run the tools on a Macintosh.
Most VLSI CAD tools really do nothing but process strings in various
ways.  Concatenation, splitting, those sorts of things.  Think a
large program written in a language without recursion but lots of
hierarchical names.  Compile unit x containing subunit y containing
subunit z containing node n, giving name x.y.z.n .

Well to make a long story short, the programmer with the Mac said
that CM3 produced horribly slow-running binaries.  Something with
TEXTs, but we didn't investigate further.  We gave up on migrating
our group to CM3 and instead use PM3 to this day.  The programmer
with the Mac has since switched all his work to coding in C++ "in
the style of Modula-3."

And after seeing what I have seen now I cannot say I will voluntarily
use CM3 for anything until we've got a solution.  I would personally
be happy to rip out everything that says Critical Mass TEXT on it and
put back the SRC code.

Here's the program I promised:

MODULE Main;
IMPORT Text, Fmt, IO, Time;

CONST 
  Count = 1000*1000;

VAR
  checks := 0;
  start := Time.Now();
  a : TEXT := "";
BEGIN

  FOR i := 0 TO 100 DO
    a := a & ":" & Fmt.Int(i)
  END;

  LOOP
    EVAL Text.Equal(a,a);
    INC(checks);
    IF checks > Count THEN
      IO.Put("rate = " & Fmt.LongReal(FLOAT(Count,LONGREAL) / 
                                      (Time.Now() - start)) & "\n");
      start := Time.Now();
      checks := 0
    END
  END

END Main.

On the same machine, running FreeBSD 5.5-RELEASE with a Core2 Duo
E8400 @ 3.00 GHz and 2 GB of RAM, I get the following performance:

  virtual  resident  Text.Equals            machine cycles
  size      size      per second  compiler   per char (approx)
---------------------------------------------------------------
   11476K     2396K       13,274    CM3         770
    1888K      960K    1,634,000    PM3           6.24

The culprit is TextCat.MyGetWideChars, which calls itself recursively.
Here's a representative traceback:

Starting program: /big/home/mika/test_text/FreeBSD4/test_text 
^C
Program received signal SIGINT, Interrupt.
0x681f51f1 in MyGetWideChars (t=16_68b188a8, a=
    [W'8',W':',W'8',W'9',W':',W'9',W'0',W':',W'9',W'1',W':',W'9',W'2',W':',W'9',W'3',W':',W'9',W'4',W':',W'9',W'5',W':',W
'9',W'6',W'5',W':',W'7',W'6',W':',W'7',W'7',W':',W'7',W'8',W':',W'7',W'9',W':',W'8',W'0',W':',W'8',W'1',W':',W'8',W'2',W'
:',W'8',W'3',W':',W'8',W'4',W':',W'8',W'5',W':',W'8',W'6',W':',W'8',W'7',W':',W'8'], start=256) at TextCat.m3:111
111     TextCat.m3: No such file or directory.
        in TextCat.m3
(m3gdb) where
#0  0x681f51f1 in MyGetWideChars (t=16_68b188a8, a=
    [W'8',W':',W'8',W'9',W':',W'9',W'0',W':',W'9',W'1',W':',W'9',W'2',W':',W'9',W'3',W':',W'9',W'4',W':',W'9',W'5',W':',W
'9',W'6',W'5',W':',W'7',W'6',W':',W'7',W'7',W':',W'7',W'8',W':',W'7',W'9',W':',W'8',W'0',W':',W'8',W'1',W':',W'8',W'2',W'
:',W'8',W'3',W':',W'8',W'4',W':',W'8',W'5',W':',W'8',W'6',W':',W'8',W'7',W':',W'8'], start=256) at TextCat.m3:111
#1  0x681f51b3 in MyGetWideChars (t=16_68b188c4, a=
    [W'8',W':',W'8',W'9',W':',W'9',W'0',W':',W'9',W'1',W':',W'9',W'2',W':',W'9',W'3',W':',W'9',W'4',W':',W'9',W'5',W':',W
'9',W'6',W'5',W':',W'7',W'6',W':',W'7',W'7',W':',W'7',W'8',W':',W'7',W'9',W':',W'8',W'0',W':',W'8',W'1',W':',W'8',W'2',W'
:',W'8',W'3',W':',W'8',W'4',W':',W'8',W'5',W':',W'8',W'6',W':',W'8',W'7',W':',W'8'], start=256) at TextCat.m3:109
#2  0x681f51b3 in MyGetWideChars (t=16_68b188e0, a=
    [W'8',W':',W'8',W'9',W':',W'9',W'0',W':',W'9',W'1',W':',W'9',W'2',W':',W'9',W'3',W':',W'9',W'4',W':',W'9',W'5',W':',W
'9',W'6',W'5',W':',W'7',W'6',W':',W'7',W'7',W':',W'7',W'8',W':',W'7',W'9',W':',W'8',W'0',W':',W'8',W'1',W':',W'8',W'2',W'
:',W'8',W'3',W':',W'8',W'4',W':',W'8',W'5',W':',W'8',W'6',W':',W'8',W'7',W':',W'8'], start=256) at TextCat.m3:109
#3  0x681f51b3 in MyGetWideChars (t=16_68b188fc, a=
    [W'8',W':',W'8',W'9',W':',W'9',W'0',W':',W'9',W'1',W':',W'9',W'2',W':',W'9',W'3',W':',W'9',W'4',W':',W'9',W'5',W':',W
'9',W'6',W'5',W':',W'7',W'6',W':',W'7',W'7',W':',W'7',W'8',W':',W'7',W'9',W':',W'8',W'0',W':',W'8',W'1',W':',W'8',W'2',W'
:',W'8',W'3',W':',W'8',W'4',W':',W'8',W'5',W':',W'8',W'6',W':',W'8',W'7',W':',W'8'], start=256) at TextCat.m3:109
#4  0x681f51b3 in MyGetWideChars (t=16_68b18918, a=
    [W'8',W':',W'8',W'9',W':',W'9',W'0',W':',W'9',W'1',W':',W'9',W'2',W':',W'9',W'3',W':',W'9',W'4',W':',W'9',W'5',W':',W
'9',W'6',W'5',W':',W'7',W'6',W':',W'7',W'7',W':',W'7',W'8',W':',W'7',W'9',W':',W'8',W'0',W':',W'8',W'1',W':',W'8',W'2',W'
:',W'8',W'3',W':',W'8',W'4',W':',W'8',W'5',W':',W'8',W'6',W':',W'8',W'7',W':',W'8'], start=256) at TextCat.m3:109
#5  0x681f51b3 in MyGetWideChars (t=16_68b18934, a=
    [W'8',W':',W'8',W'9',W':',W'9',W'0',W':',W'9',W'1',W':',W'9',W'2',W':',W'9',W'3',W':',W'9',W'4',W':',W'9',W'5',W':',W
'9',W'6',W'5',W':',W'7',W'6',W':',W'7',W'7',W':',W'7',W'8',W':',W'7',W'9',W':',W'8',W'0',W':',W'8',W'1',W':',W'8',W'2',W'
:',W'8',W'3',W':',W'8',W'4',W':',W'8',W'5',W':',W'8',W'6',W':',W'8',W'7',W':',W'8'], start=256) at TextCat.m3:109
#6  0x681f51b3 in MyGetWideChars (t=16_68b18950, a=
    [W'8',W':',W'8',W'9',W':',W'9',W'0',W':',W'9',W'1',W':',W'9',W'2',W':',W'9',W'3',W':',W'9',W'4',W':',W'9',W'5',W':',W
'9',W'6',W'5',W':',W'7',W'6',W':',W'7',W'7',W':',W'7',W'8',W':',W'7',W'9',W':',W'8',W'0',W':',W'8',W'1',W':',W'8',W'2',W'
:',W'8',W'3',W':',W'8',W'4',W':',W'8',W'5',W':',W'8',W'6',W':',W'8',W'7',W':',W'8'], start=256) at TextCat.m3:109
#7  0x681f51b3 in MyGetWideChars (t=16_68b1896c, a=
    [W'8',W':',W'8',W'9',W':',W'9',W'0',W':',W'9',W'1',W':',W'9',W'2',W':',W'9',W'3',W':',W'9',W'4',W':',W'9',W'5',W':',W
'9',W'6',W'5',W':',W'7',W'6',W':',W'7',W'7',W':',W'7',W'8',W':',W'7',W'9',W':',W'8',W'0',W':',W'8',W'1',W':',W'8',W'2',W'
:',W'8',W'3',W':',W'8',W'4',W':',W'8',W'5',W':',W'8',W'6',W':',W'8',W'7',W':',W'8'], start=256) at TextCat.m3:109
#8  0x681f0d23 in EqualBuf (t=16_68b1896c, u=16_68b1896c, len=294) at Text.m3:66
#9  0x681f0b05 in Equal (t=16_68b1896c, u=16_68b1896c) at Text.m3:39
#10 0x08048b0d in Main (mode=1) at Main.m3:20
#11 0x681d24e2 in RunMainBody (m=16_08049d60) at RTLinker.m3:399
#12 0x681d1a0d in AddUnitI (m=16_08049d60) at RTLinker.m3:113
#13 0x681d1a94 in AddUnit (b={"Main_M3", Declared at: Main.m3:13}) at RTLinker.m3:122
#14 0x08048a41 in main (argc=1, argv=0xbfbfe2a8, envp=0xbfbfe2b0) at _m3main.mc:4
(m3gdb) 

Now you may object that perhaps people don't normally make TEXTs
by concatenating 100 small sub-TEXTs.  Maybe not.. but sometimes
they do.  There's nothing to warn the user that that is a terribly
bad thing to do.  Well we know... we think.. it will lead to a lot
of memory allocation overhead when you *make* the TEXT.  But that
it will infect the generated TEXT forevermore with this long chain
of pointers is, to say the least, surprising.

The reason I noticed this now is that I wrote a program that "watches"
a directory of files (using FS.Iterate).  It even uses a hash table
to store the TEXTs.  *Still* it is an incredible CPU hog (on Linux
wth CM3).  I think I just did

  path :=  dir & "/" & filename

and that's enough to make it run 10-30 times slower than a PM3-compiled
binary from the same source code (for suitably pathological choices
of dir and filename).

Does anyone have a better idea than just throwing away the CM3 TEXT 
implementation?  

    Mika



More information about the M3devel mailing list