[M3devel] small array in modula-3?

Fri Nov 16 01:39:18 CET 2007

Jay wrote:

 > What is the right way to have a variably sized but always small array in Modula-3?
 > My array will only ever have 1 or 2 elements.
 > I'd like to always allocate room for 2 elements, and have there be a size.
 >
 > It seems I have a choice of
 >
 > a) an "open" array

Heap allocated, I presume?  If the max size is only 2 elements,
this is pretty extravagant, as a heap allocated open array will
have 4 extra behind-the-scenes words of space overhead, plus
maybe heap fragmentation, and time overhead of allocation,
collection, and maybe reduced locality of reference.

 > b) wrap it up in a record
 >
 > I'd like so have, like:
 >
 > TYPE A = ARRAY [0..1] OF FOO;
 >
 > And be able to say:
 >
 > VAR
 >  a : A;
 >
 >  ..
 >  a.size
 >  FOR i := 0 TO a.size DO
 >    do something with a[i]
 >
 > It seems I have no option but, like:
 >
 > TYPE A = RECORD
 >   a: ARRAY[0..1] OF FOO;
 >   size := 1; (* usually of size 1, sometimes 2 *)
 > END
 >
 > and then
 >  FOR i := 0 TO a.size DO
 >   do something with a.a[i];
 >
 > That is "ok". Not great -- what to call the inner a?
 > But then furthermore, I'd like to construct constants.
 > It seems I am stuck with either the verbose:
 >
 > PROCEDURE F(a:A);
 >
 > F( A { ARRAY[0..1] OF FOO { .. } );
 >
 > OR I have to come up with a name for the embedded array.
 > But of course, its type is really "the same" as the outer type.

Ignoring the fact that the types are different linguistically
(one a record type, the other an array), the types are different
at a deep semantic level too.  The inner array has static
size, the outer one (call it a "variable array", perhaps)
has dynamically changeable size, even more easily changed
after it is created than a heap-allocated open array.  This
is a significant semantic difference, so it also makes program
design sense to view them as different types.

 >
 > To wit:
 >
 > TYPE FooArray = ARRAY[0..1] OF Foo;
 >
 > TYPE FooArray = RECORD
 >   a : FooArray;
 >   size := 1;
 > END;
 >
 > F( FooArray { FooArray { .. } );
 >
 > But I have to come up with different names.
 >
 > And I don't think I can leave out the name for the constructor, like:
 >
 > F( FooArray { { .. } );

Yes, this is a limitation in Modula-3.  Ada allows value constructors
to omit inner type names in cases like this, and it is convenient.  But
it also crosses a really major line on complexity of the language semantics,
since type analysis information now flows not only upward, but also downward
in expression typing.  I am mostly content to live with this as one bit of
the price of avoiding a horribly over complex language.

Ordinary procedures can do pretty much all of the things done by exotic
and complex language stuff like C++ constructors.  If you don't want to write
the ponderous nested value constructor, write a constructor procedure and
call it.  As Antony suggested, you can make it accept an open array formal,
which then codes just like a simple array value constructor.  You could even
make it elaborate and general, e.g.:

PROCEDURE MakeF ( Val : ARRAY OF FOO ) : F

= VAR LSize : CARDINAL
; VAR LResult : A

; BEGIN
     LSize := NUMBER ( Val )
     <* ASSERT LSize <= NUMBER ( FooArray ) *>
   ; LResult . size := LSize
   ; SUBARRAY ( LResult . a , 0 , LSize )
       := SUBARRAY ( Val , FIRST ( Val ) , LSize )
   ; RETURN LResult
   END MakeF

Alternatively, since your maximum element count is so small, you could
write a constructor procedure that took two formals of type FOO, with
at least the second one optional.  (This would require a distinguished
value of type FOO that would be used to mean "omitted".)

And, of course, if you want to use abstraction, you can put the types,
constructor procedures, and various other procedures for manipulating
the variable sized array in a module, behind an interface.

An additional limitation of Modula-3 in this regard, is that you can't
make the type F opaque, unless you heap allocate it, which we were trying
to avoid.  Ada would allow you to do this, but it's another example of
a convenience with high cost.  It forces the equivalent of the revelation
to be located in the equivalent of the interface, but makes it illegal
to write client code that depends on what the revelation is.

This in turn creates a source code control nightmare in a large project,
because now someone who wants to make what is really a purely internal,
implementation change, nevertheless has to check out the interface, and
if you aren't in denial mode, that means implementers of all the client
code have to go to extra trouble to somehow find out that what appears
to be an interface change actually doesn't affect them after all.

 >
 > Though that might seem nice.
 >
 > Am I understanding everything?
 >
 > Have folks hit this before and there's a set of names that don't seem too lame
 > that folks use?
 >
 > Also -- language documentation?
 > Nelson's green book is excellent.
 > The stuff in the doc directory is very dry and scientific.
 > The tutorial seems more like a reference. Or maybe I didn't read it enough.
 > Is there better, in case I need a refresher?
 > I think I got it, via Nelson's book from memory (wonder if I can find mine..) and the reference,
 > but I don't think anyone could learn from the current online docs in the source tree.
 >
 > Maybe it's me, some combination of laziness and short attention span
 >  as I age..
 >
 > Btw, C doesn't offer a great solution here, though it offers leaving out some type names.
 > In C++ I could have both .size and operator[].
 > Modula-3 seems to be missing operator overloading.
 > It's got some builtin stuff, even array assignment and equality and I think
 >   even record assignment and equality, but it is still a bit limiting.

I have raved on this before, but, having had lots of painful experience with
user-defined overloading in Ada and C++, I can say with authority, that it is
a programming language disaster.  It interacts with just about everything
else in the language, in very complicated ways, and all it buys you is saving
time/energy thinking up distinct names for procedures/functions.  Even this,
aside from the effects on the language, is a net loss, by the time you consider
readability along with writability.

User-defined overloaded operators provide a slight readability benefit, _if_
used with great restraint and discipline, something you can rely on not happening.
Meanwhile, the language complexity can easily double or worse.  And, with the
exception of compiler writers and language lawyers who spend hundreds of hours
on just this, programmers don't understand the rules, not even close.

 >
 > Also, if I understand things correctly, this has long bothered me about Modula-3,
 >  though I'm more tolerant now -- Modula-3 doesn't seem to allow for lighter wieght
 >  objects. Like, small stack allocated structs with member functions. You seem to have
 >  to chose between heap allocated garbage collected virtual member functions full
 >  featured objects, or featureless dumb structs. It's nice how C++ allow hybrids --
 >  objects don't have to be heap allocated and member functions don't have be virtual.
 >  Or am I missing something?

C++'s supposedly lighter weight forms of classes/structs with their special
member functions buy nothing that plain records, plain procedures, and
interfaces/modules don't already provide, again, at significant and gratuitous
language complexity.  Except when methods/member functions actually dispatch
dynamically, there is nothing that the above won't do, and when you create a
class instance as a local variable (i.e., on the stack), it is necessarily
not polymorphic, that is, it can't change its "allocated" or dynamic type
among various subtypes at runtime.  This in turn means it can't dispatch.

So, just use plain procedures, an interface and a module, and you will get
everything a lighter-weight C++ class would give you.

 >
 > Anyway, I've gotten to accept C a bit more vs. C++ so I can deal with
 > t: Type;
 > Type_DoSomething(t);
 >
 > in place of:
 > t.DoSomething();

Exactly.  The special "receiver object" in a true method call has significant
semantic differences from an ordinary parameter and introduces new complexities
and non-orthogonalities.  It has some very valuable uses too.

But to then create degenerate forms of it that still carry a lot of these
complexities, but are equivalent in programing power to plain old
parameters is just bad program design and bad language design.  Objects and
methods are indeed cool for situations that utilize their semantic complexity.
But it's deeply uncool to try to look superficially cool by using an
inappropriately sophisticated construct when the problem doesn't justify it.
Some OO proponents have gone way over the deep end here.

 >
 >  - Jay
 >
 >
 > ------------------------------------------------------------------------
 > Boo! Scare away worms, viruses and so much more! Try Windows Live OneCare! Try now! 
<http://onecare.live.com/standard/en-us/purchase/trial.aspx?s_cid=wl_hotmailnews>

-- 
-------------------------------------------------------------
Rodney M. Bates, retired assistant professor
Dept. of Computer Science, Wichita State University
Wichita, KS 67260-0083
316-978-3922
rodney.bates at wichita.edu

-- 
-------------------------------------------------------------
Rodney M. Bates, retired assistant professor
Dept. of Computer Science, Wichita State University
Wichita, KS 67260-0083
316-978-3922
rodney.bates at wichita.edu