[M3devel] How to integrate llvm into cm3

Fri May 22 18:15:06 CEST 2015

On 05/21/2015 07:57 PM, Jay K wrote:
> Imho "all" options should be implemented, for purposes of convenient debugging/development of the backends.
>

Yes, I like that idea.  We should be able to do it without a huge coding burden.

>
> "external" is good for developing backends. You can "snapshot" the state of things
> slightly into the pipeline and then just iterate on later parts.
>

Yes.

>
> At the cost of having all the serialization code.
>

Yes, although it is already there now, for cm3-IR.  It exists in
llvm for llvm-IR, but has to be linked in.

>
> "integrated" is usually preferable for performance, for users.
>

Yes.

>
> E.g. NTx86 backend has been sitting in there for decades unused by half the users.
>
>
> Having extra backends sitting in there unused is ok.
> Ideally, agreed, they'd be .dll/.sos if we can construct it that way, but ok either way imho.

Thank you.

> Ideally also cm3 would dynamically link to libm3/m3core, but it doesn't.
>
>
> Everything is demand paged so there is cost to distribute over the network
> and copy around, but at runtime, the pages just sit mostly cold on disk.
>

Hadn't thought of that.  It pretty well addresses my concerns about how much
to link in.  We can just be extravagant and let the paging take care of it.

>
> One difficulty though is the need to have and build the LLVM code.
> For that reason, delayload-dynamically-linked might be preferable.
> It depends on how small/easy-to-build LLVM is.
>

All of llvm is huge and takes a long time to build.  Both Peter and I
are currently following the approach of linking in only a subset of
llvm, containing code to build its internal trees and serialize them.
I haven't checked actual sizes of compiled code, but in library count,
it's perhaps 1/4 or so.  Of course, you still need the rest for the
separate executable that optimizes/codegens.

>
>   I guess LLVM provides more choices than before.
>   In order of efficiency and inverse order of debuggability:
>    1 We could construct LLVM IR in memory and run LLVM in-proc and write .o.
>    2 We could write out LLVM bytes and run an executable.
>    3 We could write out LLVM text and run an executable.
>
>

One consideration is that there is a *lot* of command-line processing, etc.
in driving llvm opt/codegen.  llc, a separate executable that comes as part
of llvm, already provides this.  Doing 1 above would require duplicating this,
or maybe ripping out the C++ code and somehow repackaging it.  We would still
have to pass all these options through cm3.  The way I have Builder working
now, whatever subset of these options is handled comes from Quake code, etc.,
bypasses cm3, and gets passed in to llc.
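
To illustrate options 2 and 3 above, here is a minimal sketch of a driver
that hands an already-written IR file to llc as a separate process and
forwards backend options without interpreting them.  It is in Python for
brevity only; it is not the actual Builder or Quake code, and the function
names and file names are hypothetical.  The only llc options assumed are
-filetype=obj and -o.

```python
import subprocess


def build_llc_command(ir_path, obj_path, passthrough=()):
    """Return the argv for one llc run.

    Options in `passthrough` are forwarded unchanged, so the driver
    does not need to understand or re-parse them.
    """
    return ["llc", *passthrough, ir_path, "-filetype=obj", "-o", obj_path]


def run_backend(ir_path, obj_path, passthrough=()):
    """Run llc as a separate executable (hypothetical helper).

    Raises subprocess.CalledProcessError if llc exits with an error.
    """
    subprocess.run(build_llc_command(ir_path, obj_path, passthrough), check=True)
```

Keeping the argv construction separate from the process launch makes it
easy to log or snapshot the exact command, which is part of what makes
the "external" arrangement convenient for debugging.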

>   > My personal preference would be to only have one default target statically compiled in
>
>   It has never worked that way. Granted, we didn't really have backends before, just writing mainly IR.
>   And I don't think LLVM works that way, does it?
>
>   I like one compiler to have all targets and just select with a command line switch.
>
>   I don't like how hard it is to acquire various cross-toolchains.
>   Granted, we cheat and are incomplete -- you still need the next piece of the pipeline,
>   be it LLVM or m3cc (which has one target), or a C compiler or assembler or linker or "libc.a".
>
>
>   binutils at least has this "all" notion reasonably well working now I believe.
>
>
>
> There are tradeoffs though. If only one backend has a bug, and they are all statically linked together, you have to update them all.
> And the largely wasted bloat.
>
>
> Ultimately really, I'd like the C backend to output portable C and then just one C backend, one distribution .tar.gz for all targets.
> There is work to do there..not easy..and no progress lately.
> Things like INTEGER preserving flexibility in the output, and using sizeof(INTEGER) in expressions instead of using 4 or 8 and folding...
>
>
>
>   - Jay
>
>
>
>
>  > Date: Thu, 21 May 2015 20:13:18 +0200
>  > From: estellnb at elstel.org
>  > To: rodney.m.bates at acm.org; m3devel at elegosoft.com
>  > Subject: Re: [M3devel] How to integrate llvm into cm3
>  >
>  > Am 21.05.15 um 19:24 schrieb Rodney M. Bates:
>  > >
>  > > There are pros and cons. Integrating Peter's cm3-to-llvm conversion into
>  > > the cm3 executable would be faster compiling--one fewer time per
>  > > interface
>  > > or module for the OS to create a process and run an executable. But it
>  > > would also entail linking in this code, along with some of llvm's
>  > > infrastructure,
>  > > into cm3, making its executable bigger, with code that might not be
>  > > executed
>  > > at all, when a different backend is used. We already have the x86
>  > > integrated
>  > > backend and the C backend linked in to cm3, whether used or not.
>  > >
>  > > Anybody have thoughts on this? I suppose it could be set up to be fairly
>  > > easily changed either way too.
>  > >
>  >
>  > Why not put each backend into a shared library and load it dynamically?
>  > Are there still problems with shared libraries for some build targets?
>  > On the other hand having cm3-IR handy and being able to translate
>  > cm3-IR by an executable like m3cc into any desired target has proven
>  > to be very handy for debugging as well as chocking the Modula-3
>  > compiler on a new platform.
>  > My personal preference would be to only have one default target
>  > statically compiled in, namely the one for cm3-IR, and load all other
>  > targets from a shared library dynamically. If that should fail for some
>  > reason one can still use m3cc or one of its counterparts to
>  > accomplish the translation process.
>  >
>  > Elmar
>  >
>  >
>  >
>  >
>  >
>  >

-- 
Rodney Bates
rodney.m.bates at acm.org