[M3devel] Enumeration or subrange value out of range

Wed Dec 1 17:00:10 CET 2010

Jay K wrote:
>>    generics are good, but not as good as C++ templates  

On the contrary, Modula-3 generics are, overall, *far* superior to
C++ templates, which are a language complexity nightmare, about
as bad as user-defined overloading.  Modula-3 generics are vastly
simpler, yet more flexible.  (Modula-3 generics do achieve this at
the cost of a small inconvenience.)  Here is a comparison I wrote
some time ago of the generic/template facilities of Ada, C++, and
Modula-3:

-----------------------------------------------------------------------

Some comparisons of generics/templates in Ada, C++, and Modula-3

C++'s template facility is *vastly* more complicated than Modula-3's.
C++(2003) has a 58-page chapter on templates, more than the entire
Modula-3 reference.  Additionally, C++ has a lot of additional
material on templates woven throughout the rest of the language.
Modula-3 devotes 2 pages to its generics and defines them with a
translation to equivalent, generic-free code, so that they need not be
mentioned elsewhere.

Ada's generics are a bit less complicated than C++'s, but not a whole
lot.  Here are some of the sources of the gratuitous complexity:

Confused identifier resolution

Probably the worst source of C++'s template complexity derives from
interaction with the identifier resolution rules it inherits from C.
They were already cobbled-up in C and got worse in C++.

In a single scope, the same identifier can be declared with different
meanings, most importantly, two different ways of naming types
(typedef names, class names), neither of which is sufficient by
itself, and ordinary identifiers (variables and functions).  The rules
for deciding what an identifier reference actually means are highly
dependent on the context of the reference.  Moreover, the very syntax
depends on the surrounding declarations.
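
To make the last two points concrete, here is a minimal C++ sketch.
The names Item, T, p, and the function names are hypothetical and
chosen only for illustration.  It shows one identifier carrying two
meanings in a single scope, and the same token sequence "T * p"
parsing as a declaration or as an expression depending on what T was
declared to be.

```cpp
#include <cassert>

// One scope, one identifier, two meanings: a class name and a function.
struct Item { int n; };
int Item(int x) { return x + 1; }   // legal: hides the class name 'Item'

int use_both_meanings() {
    struct Item it;      // the class key is needed to reach the type
    it.n = Item(2);      // a plain 'Item' now means the function
    return it.n;
}

// The token sequence "T * p" parses differently depending on what T is.
int parsed_as_declaration() {
    typedef int T;
    int x = 5;
    T * p = &x;          // T is a type: declares p as a pointer to int
    return *p;
}

int parsed_as_expression() {
    int T = 6, p = 7;
    return T * p;        // T is a variable: this is a multiplication
}

void demo_identifiers() {
    assert(use_both_meanings() == 3);
    assert(parsed_as_declaration() == 5);
    assert(parsed_as_expression() == 42);
}
```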

The template facility makes this immeasurably worse, because, in a
template, the needed context is not there.  So this is fixed by new
and different syntactic and semantic rules for making the distinction.
In contexts where the syntax does not determine whether a type or
ordinary meaning is needed, the presence or absence of 'typename'
(or a class key) makes the needed distinction.

But there are several contexts where the syntax does imply that a type
is needed, and these are full of inconsistency.  In some, 'typename'
is required, in others, it is optional, and in yet others, it is not
allowed.  So a programmer cannot just establish a habit of always
doing it a certain way.  None of this adds any useful function to
the language.
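
A small sketch of the inconsistency, assuming the rules as they stand
in C++03 through C++17 (C++20 relaxed some of them).  Wrap, Derived,
sum_of, and Element are hypothetical names used only for this
illustration.

```cpp
#include <cassert>
#include <vector>

template <typename T>
struct Wrap {
    typedef T value_type;
    struct Inner { T value; };
};

// Base-specifier: the syntax implies a type, and 'typename' is not
// allowed here.
template <typename T>
struct Derived : Wrap<T>::Inner {
    // Member declaration: 'typename' is required (C++03 through C++17).
    typename Wrap<T>::value_type extra;
};

// Return type and function body: 'typename' is required for each
// dependent type name (the return type became optional in C++20).
template <typename Container>
typename Container::value_type sum_of(const Container& c) {
    typename Container::value_type total = typename Container::value_type();
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        total += *it;
    return total;
}

// Outside any template: 'typename' is forbidden in C++03 and optional
// from C++11 on, so it is omitted here.
typedef std::vector<int>::value_type Element;

void demo_typename() {
    Derived<int> d;
    d.value = 40;            // member inherited from Wrap<int>::Inner
    d.extra = 2;
    std::vector<Element> v;
    v.push_back(d.value);
    v.push_back(d.extra);
    assert(sum_of(v) == 42);
}
```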

In Modula-3 and Ada, all identifiers in a scope are really in the same
scope, and both the syntax and rules for looking up identifiers are
independent of the referencing context.  A drastic simplification.

Many kinds of template/generic units

Another source is the ability to have several kinds of template units
(functions, classes, member functions, member templates, etc.) and
template parameters.  Moreover, these are not necessarily tied to
separate compilation units.  Like C++, Ada also allows several kinds
of units, not necessarily separately compiled, to be generic,
instantiated, and in the case of procedures, generic parameters.
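
For a rough feel of how many distinct kinds there are on the C++ side,
the sketch below puts four of them next to each other.  The names
twice, FixedArray, fill_from, and Stack are made up for the example,
and the list is not exhaustive.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// 1. Function template.
template <typename T>
T twice(T x) { return x + x; }

// 2. Class template, here with a type and a non-type parameter.
template <typename T, std::size_t N>
struct FixedArray {
    T data[N];
    std::size_t size() const { return N; }

    // 3. Member template nested inside a class template.
    template <typename U>
    void fill_from(U v) {
        for (std::size_t i = 0; i < N; ++i)
            data[i] = static_cast<T>(v);
    }
};

// 4. Template template parameter: the actual is itself a template.
template <template <typename, typename> class Seq, typename T>
struct Stack {
    Seq<T, std::allocator<T> > items;
    void push(const T& v) { items.push_back(v); }
    std::size_t depth() const { return items.size(); }
};

void demo_kinds() {
    assert(twice(21) == 42);

    FixedArray<int, 3> a;
    a.fill_from(2.9);                 // U = double, converted to int
    assert(a.size() == 3 && a.data[2] == 2);

    Stack<std::vector, int> s;        // std::vector passed as a template
    s.push(1);
    s.push(2);
    assert(s.depth() == 2);
}
```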

In Modula-3, only interfaces and modules can be generic, and only
interfaces can be generic parameters.  But since an interface or
module can contain any declarable entity, the result is every bit as
flexible, probably even more so.  Yet this greatly simplifies
generics.  It does come at a price of minor inconvenience by requiring
more very small, separately compiled interfaces and modules as
instantiations and more small interfaces as generic actuals.

As an aside, the ability to create instantiations using the build
system is only an added convenience, not mandated by the language.
You can always write Modula-3 code for all instantiations and generic
actuals, with no more burden added to the build system or its use
than by ordinary, non-generic interfaces and modules.  The Quake commands
we have in the implementations of Modula-3 merely generate common
cases of these almost trivial compilation units, as a quicker
alternative.

All instantiations are anonymous

Yet another source of complexity in C++ is the lack of named
instantiations.  Everywhere you want to refer to an instantiation, you
have to repeat its full structural definition, complete with repeats of
the template actuals.  This makes some code awfully pedantic.  It gets
worse when there are nested instantiations.  And, of course, it
complicates the language itself.
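
A short sketch of what this looks like in C++03-style code.  The
function names and the Index alias are hypothetical.  Note that the
typedef at the end only shortens the spelling: it is an alias for the
same structurally identified type, not a named instantiation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Every mention spells the instantiation out in full, actuals included,
// and the nested instantiation std::vector<int> is repeated with it.
std::map<std::string, std::vector<int> > make_index() {
    std::map<std::string, std::vector<int> > index;
    index["a"].push_back(1);
    index["a"].push_back(2);
    index["b"].push_back(3);
    return index;
}

std::size_t count_entries(const std::map<std::string, std::vector<int> >& index) {
    std::size_t n = 0;
    for (std::map<std::string, std::vector<int> >::const_iterator it = index.begin();
         it != index.end(); ++it)
        n += it->second.size();
    return n;
}

// An alias for the same type; it does not declare a new instantiation.
typedef std::map<std::string, std::vector<int> > Index;

void demo_anonymous() {
    Index idx = make_index();          // same type, so no conversion
    assert(count_entries(idx) == 3);
}
```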

Aside from language complexity, this also creates a very difficult
problem for an implementation, because it must locate and combine all
the structurally-equivalent instantiations, even from separate
compilations.  This is the worst source of very highly mangled linker
names.

In both Ada and Modula-3, you always declare a name for an
instantiation, attached to its definition, providing the generic
actuals exactly once here.  Then you use the name (a single
identifier) everywhere else.  If you don't want multiple copies of
structurally-equivalent instantiations to be compiled, you just take
care to code it this way, as is the case in virtually all other
constructs in all languages.  The implementation doesn't need to find
the duplicates.

Static checkability of template/generic units

One source of complexity in Ada's generics is Ada's policy that a
generic can be fully statically checked in the absence of any
instantiation.  The instantiations have some semantic rules that must
still be applied, but they don't involve anything but the normal kinds
of interface properties of the generic unit.  An instantiation cannot
cause internal semantic errors to pop up inside the generic.  (In Ada
83, there was one language bug in this property.  It was fixed in the
next version, Ada 95.)

This is a very nice property.  Unfortunately, it complicates the
language greatly.  On the one hand, there has to be an elaborate set
of rules about what kinds of generic parameters are allowed at all and
how different generic parameters of a single generic unit can be
interrelated.  The programmer then has to specify these relationships.
Inevitably, while this system of rules is complicated, it is also
incomplete.

C++ and Modula-3 postpone semantic checking until instantiation time,
which removes these limitations, at the cost that an instantiation
could mysteriously fail to compile if the actuals don't have the right
properties.  The compile errors will likely be deep inside the body of
the generic--where the instantiator would ideally not need to venture.
Modula-3 at least informally encourages the writers of generics to
write comments spelling out the necessary properties of the generic
actuals.  AFAIK, C++ does not.
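
Here is a minimal C++ sketch of the failure mode.  The names largest
and Point are hypothetical.  The requirement on T lives only in a
comment, the template compiles cleanly on its own, and the offending
instantiation is left commented out so that the example still builds.

```cpp
#include <cassert>
#include <vector>

// Requirement on T, stated only in this comment and not checked where
// the template is defined: T must be copyable and support operator<.
template <typename T>
T largest(const std::vector<T>& items) {
    T best = items.front();            // caller must pass a non-empty vector
    for (typename std::vector<T>::size_type i = 1; i < items.size(); ++i)
        if (best < items[i])           // checked only at instantiation
            best = items[i];
    return best;
}

struct Point { int x, y; };            // note: no operator< is defined

void demo_largest() {
    std::vector<int> v;
    v.push_back(3);
    v.push_back(9);
    v.push_back(4);
    assert(largest(v) == 9);           // int satisfies the requirement

    std::vector<Point> pts(1);
    // largest(pts);   // uncommenting this fails to compile, and the
    //                 // error is reported inside the body of largest()
    (void)pts;
}
```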

This postponement sacrifices a very nice property for static checking.
In Modula-3, the sacrifice buys a lot of simplicity and some increased
flexibility.  C++ gives the worst of both worlds, as the same static
checking is lost, with no simplicity benefit.

Interaction with overloading

Both C++ and Ada allow user-defined overloading of both operators and
procedure/function identifiers.  This creates massive language complexity
in many ways.  A template/generic facility provides more things to
interact with overloading, heaping on more complexity.
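
As one small illustration in C++, the sketch below overloads a
hypothetical function, describe, with two templates and one ordinary
function.  Which one a call selects depends on the interplay of
template argument deduction, conversion ranking, and partial ordering
of templates.

```cpp
#include <cassert>
#include <string>

template <typename T> std::string describe(T)  { return "generic template"; }
template <typename T> std::string describe(T*) { return "pointer template"; }
std::string describe(int)                      { return "plain int overload"; }

void demo_overloads() {
    int n = 0;
    // Both the template (T = int) and the ordinary function match
    // exactly; the non-template wins the tie.
    assert(describe(1) == "plain int overload");
    // T = char is an exact match; describe(int) would need a promotion.
    assert(describe('a') == "generic template");
    // T = double is an exact match; describe(int) would need a conversion.
    assert(describe(2.5) == "generic template");
    // Both templates match exactly; partial ordering selects the more
    // specialized describe(T*).
    assert(describe(&n) == "pointer template");
}
```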

-----------------------------------------------------------------------