[M3devel] cm3 llvm backend?
Jay K
jay.krell at cornell.edu
Tue Jun 28 20:12:53 CEST 2016
> The savings of keystrokes in not making the operations explicit is penny-wise and million-pound foolish.
I hear this a lot.
There is a flip side.
There is such a thing as too verbose; explicitly saying everything all the time is inefficient.
In fact, if we had to say everything, all the time, we'd instantly run out of space.
Requiring too much verbosity can be tedious and hide the important details in the noise of obvious repeated stuff.
Some balance must be found.
Consider:
C:
a = b
vs.
hypothetical assembly:
mov ra, rb
"=" has become "mov r r" -- a large multiplication of noise that the human reader
has to sift through to see the meaning
or worse
mov r1, rsp[b]
mov rsp[a], r1
Is the assembly clearer because it is more explicit?
In C++, operator= can be overloaded.
It means assignment, in whatever way the author of the types of a and b deemed appropriate.
It might be on the order of 1-3 instructions,
or it might be a function call.
Ignoring performance, it doesn't matter. It is the right notation for "assignment".
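To make that concrete, here is a rough C++ sketch. Point, Buffer, buffer_assignments and assign_both are names I made up for illustration; they are not from cm3 or LLVM. The two assignments read the same, but one is a member-wise copy and the other is a call into a user-written operator=.

```cpp
#include <cassert>
#include <vector>

// Hypothetical types, invented for illustration only.

// Assignment here is the compiler-generated member-wise copy,
// typically a couple of moves.
struct Point {
    int x;
    int y;
};

// Counts calls to Buffer::operator= so the hidden function call is observable.
int buffer_assignments = 0;

// The author of this type decided that assignment means a deep copy.
struct Buffer {
    std::vector<int> data;

    Buffer& operator=(const Buffer& other) {
        if (this != &other) {
            data = other.data;      // may allocate and copy every element
            ++buffer_assignments;
        }
        return *this;
    }
};

// Same notation at both assignments; only the types decide the cost.
int assign_both() {
    Point p1{1, 2}, p2{0, 0};
    p2 = p1;                        // member-wise copy

    Buffer b1{{1, 2, 3}}, b2{};
    b2 = b1;                        // calls Buffer::operator=

    assert(p2.x == 1 && p2.y == 2);
    assert(b2.data == b1.data);
    return buffer_assignments;
}
```

The reader of assign_both sees "assignment" twice and does not need to care which form it takes, unless performance matters.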
We must pick local vocabularies or "context" and communicate within it.
It must be easy for a newcomer to step back and learn the vocabulary -- that
is probably the unsolved part.
And then I often read code that reads like:
// foo the bar
foo(bar)
I don't know what a "bar" is, I don't know what it means
to "foo" it, or why/when you would want to.
The pattern works in Modula-3, C, C++, etc.
even assembly
; foo the bar
mov rcx, rsp[bar]
call foo
all equally gibberish to the newcomer.
Though to the initiated, assuming foo is at least a few lines of code, this is probably the right level to code at.
- Jay
> Date: Tue, 28 Jun 2016 11:40:06 -0500
> From: rodney_bates at lcwb.coop
> To: jay.krell at cornell.edu; m3devel at elegosoft.com
> Subject: Re: [M3devel] cm3 llvm backend?
>
>
>
> On 06/27/2016 08:03 PM, Jay K wrote:
> > So..rambling a bit..but I have discussed some with people
> > and considered it.
> >
> > "the interface to the compiler backend"
> >
> >
> > and my half serious answer:
> >
> >
> > All of the compilers have a well documented very stable
> > interface to their backends, and it is in fact the same,
> > roughly, interface to all the backends: source code.
> >
> >
> > It is true it isn't the most efficient representation.
> > Maybe the least efficient.
> >
> >
> > It might not expose all the internals, at least not portably (e.g. tail call).
> > But it works, is heavily used, is well known, documented,
> > has high compatibility requirements, somewhat readable with
> > standard tools, etc.
> >
> >
> > I would advocate that C and C++ be evolved a little bit
> > for these purposes. In particular, C needs exception handling.
> > C and C++ need a tail call notion.
> > _alloca should maybe be standardized.
> > I should be able to generate image-relative pointers/offsets.
> > (trivial in Microsoft assembly with "imagerel"), to help
> > me lay out position-independent data structures.
> >
> >
> > C-- is kind of this, and there was some C-resembling assembly,
> > but I think really C should be the starting point, as it is pretty close.
> >
> >
> > Probably need some extensions like non-int bitfields, and
> > rotate, and shift right with both sign copying and zero fill.
> >
> >
> > > A teacher of mine called this behavior "version junkie".
> >
> >
> > There are at least two big reasons for this.
> >
> > - The language really is improving. Programs
> > written in the newer language are easier to read
> > and often easier to optimize and sometimes easier
> > to implement a compiler for.
>
> Sometimes so, sometimes not. Sadly, many language "features" reflect
> an implicit but very misguided belief that fewer keystrokes/characters
> means increased readability. Or at least that writability is more
> important than readability. So often, this means actual code is less
> explicit. But this makes maintenance far worse.
>
> E.g., Ada decided to use parentheses for both actual parameter lists to
> function calls and subscript lists to arrays. Along with optional
> implicit operations like dereferencing, there are somewhere in the
> teens of possible meanings for the innocent looking "f(x)". I have forgotten
> the exact number, but once had to do the semantic analysis. That was
> Ada 83. Maybe more have been added since. For the poor schmuck who
> gets called at 3:00 AM to fix a bug in half a million lines of code
> she didn't write, this is a readability disaster. The savings of
> keystrokes in not making the operations explicit is penny-wise and
> million-pound foolish.
>
> >
> > - Dogfooding -- using the language helps inform the
> > language implementer where they have done things right,
> > or wrong, and what to improve.
> >
> >
> > I believe in fact that under-dogfooding of C++ led
> > to some early omissions. The need for auto for example.
> > Granted, Stroustrup put it in in the 1980s and had to remove it.
> > But with more dogfooding by more implementers, it would
> > have been added earlier. Similarly "template typedefs", which
> > are obviously needed once you use templates for about a day.
> >
> >
> > Modula-3 has similar failings imho.
> > For example, the fact that with/var imply
> > a nesting that "needs" indentation and needs "end".
> > This is something that C++ and much later even C fixed.
> >
> >
> > Perhaps though, perhaps the Modula-3 designers were
> > balancing the specification and implementation against
> > user convenience, as the current design is obviously
> > simpler to specify and implement. But tedious to use.
> >
> > So the binary form of LLVM bit code is more stable than the text form?
> >
> >
> > - Jay
> >
> >
> >
> >
> > > Date: Mon, 27 Jun 2016 19:31:53 -0500
> > > From: rodney_bates at lcwb.coop
> > > To: m3devel at elegosoft.com
> > > Subject: Re: [M3devel] cm3 llvm backend?
> > >
> > >
> > >
> > > On 06/27/2016 03:29 PM, Henning Thielemann wrote:
> > > >
> > > > On Mon, 27 Jun 2016, Rodney M. Bates wrote:
> > > >
> > > >> And no, the names and operator spellings are not close to adequate
> > > >> to clue you in. They have gone to every length possible to use
> > > >> every clever new C++ "feature" that comes out in the latest
> > > >> C++-<n> standard, which always just increases the complexity
> > > >> of the search to a declaration. So I don't fancy doing any of
> > > >> this. (BTW, <n>=17 in recent discussions.)
> > > >
> > > > A teacher of mine called this behavior "version junkie".
> > > >
> > >
> > > Yes, yes.
> > >
> > > >
> > > >> Actually, keeping their bitcode stable across llvm releases is
> > > >> one place they do talk about compatibility. But m3llvm uses calls
> > > >> to llvm APIs to construct llvm IR as in-memory data, then another
> > > >> call to get llvm to convert it to bitcode. So bitcode's stability
> > > >> is irrelevant to us. I once thought about producing llvm bitcode
> > > >> directly, but that seems like a pretty big job. It would, however,
> > > >> obviate creating most of those wretched bindings.
> > > >
> > > > An alternative would be to create .ll text files. But its format changes, too.
> > >
> > > Yes. But, according to the list talk, they don't have the intention to
> > > make it as stable as the bitcode format.
> > >
> > >
> > > > _______________________________________________
> > > > M3devel mailing list
> > > > M3devel at elegosoft.com
> > > > https://m3lists.elegosoft.com/mailman/listinfo/m3devel
> > > >
> > >
> > > --
> > > Rodney Bates
> > > rodney.m.bates at acm.org
>
> --
> Rodney Bates
> rodney.m.bates at acm.org