[M3devel] How to integrate llvm into cm3

Elmar Stellnberger estellnb at elstel.org
Fri May 22 12:53:14 CEST 2015


On 22.05.2015 at 12:16, dirk muysers wrote:

> >> What about the said platform dependencies you have discovered?
>  
> Not me (I never seriously considered using it), but many people on the llvm
> forums pointed to the fact. One example among many:
> 
> Does your C code ever use the 'long' type? If so, the LLVM IR will be
> different depending on whether it's targeting linux-32 or linux-64. Do
> you ever use size_t? Same problem. Do you ever use a union containing
> both pointers and integers? See above. In principle, it's possible to
> write platform-independent IR, or even C code that compiles to
> platform-independent IR. In practice, especially if you include any
> system headers, it's remarkably hard.
> (Jeffrey Yasskin jyasskin at google.com)

For my part, I am very careful to distinguish between int, long and long
long. I use long only where my code requires a data item to be exactly as
large as a pointer, or in special cases to exploit 64-bit machines, e.g. as
a base type for arbitrary-length integers that may be either 32 or 64 bits
wide, and then never without special provisions for the difference in data
size. Aligning the pointers for the next structure at its beginning would
usually also solve the issue when reusing existing code whose data sizes
cannot be changed from long to int or long long without special
consideration. Those who use glib additionally have g[u]int32 and g[u]int64,
which they can use instead of int and long long, though in the end that
should never make a difference on Intel x86 based systems. So when I use
int or long long, I mostly rely on them being 32 and 64 bits respectively.
I know that most programmers do not care and simply always use long, which
I consider particularly bad practice. Even the Linux kernel declares
"typedef long time_t" instead of "typedef long long time_t", which will
create a Y2K-style mess in 2038 on all 32-bit machines still in use by then.
A rather bad decision that needs to be changed sooner or later, even
without llvm.
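
To make the difference concrete, here is a minimal C sketch, assuming a
C11 compiler and a 64-bit time_t on the machine that runs it. The function
names are made up for illustration. It prints the sizes that vary by target
and computes the year in which a signed 32-bit count of seconds since 1970
runs out.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Fixed-width types keep their size on every target that provides them. */
_Static_assert(sizeof(int32_t) * CHAR_BIT == 32, "int32_t must be 32 bits");
_Static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t must be 64 bits");

/* Print the sizes that vary by target: long is 4 bytes on 32-bit Linux
   and 8 bytes on 64-bit Linux, while long long is 8 bytes on both. */
void print_type_sizes(void)
{
    printf("int=%zu long=%zu long long=%zu void*=%zu\n",
           sizeof(int), sizeof(long), sizeof(long long), sizeof(void *));
}

/* Calendar year (UTC) of the last second that a signed 32-bit counter of
   seconds since 1970 can represent. */
int year_of_last_32bit_second(void)
{
    time_t last = (time_t)INT32_MAX;
    struct tm *utc = gmtime(&last);
    assert(utc != NULL); /* gmtime only fails for unrepresentable years */
    return utc->tm_year + 1900;
}
```

Calling print_type_sizes() on a 32-bit and a 64-bit build shows the long
column change while the others stay put, and year_of_last_32bit_second()
gives the year in which a 32-bit time_t overflows.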

Now let us think of Modula-3. I believe cm3 had a long type the last time
I looked at it. However, an equivalent of long long that also exists on
32-bit platforms would be an absolute requirement so as not to break
things for llvm! Many thanks for notifying us about this issue, Dirk.

As far as I can see, a Modula-3 programmer will need a good core for
portable programming anyway, since we do not even guarantee whether
WIDECHAR is 16 or 32 bits wide.
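
As a rough sketch of what such a portable core could pin down, here is how
the guarantees might look on the C side. The type names are hypothetical
and not taken from the cm3 sources, and the chosen widths (a 64-bit long
integer, a 16-bit wide character) are assumptions for illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical type names for illustration; not taken from the cm3 sources. */
typedef intptr_t m3_integer;   /* word-sized: varies between 32- and 64-bit targets */
typedef int64_t  m3_longint;   /* assumed to be 64 bits on every target */
typedef uint16_t m3_widechar;  /* assumed 16 bits; a 32-bit variant would use uint32_t */

/* Turn the assumptions into compile-time checks instead of conventions. */
_Static_assert(sizeof(m3_integer) == sizeof(void *), "integer must be pointer-sized");
_Static_assert(sizeof(m3_longint) * CHAR_BIT == 64, "long integer must be 64 bits");
_Static_assert(sizeof(m3_widechar) * CHAR_BIT == 16, "wide char must be 16 bits");

/* Seconds in a number of 365-day years. The cast forces 64-bit arithmetic,
   so the result does not wrap on targets where int and long are 32 bits. */
m3_longint seconds_in_years(int years)
{
    assert(years >= 0);
    return (m3_longint)years * 365 * 24 * 60 * 60;
}
```

For example, seconds_in_years(100) exceeds the range of a 32-bit integer,
so it only comes out right because the 64-bit type exists on every target.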

 




>  
> And then, besides the IR proper, there is that steadily increasing
> legion of intrinsics.
>  
> Unless you translate C-like code and build upon the existing technical
> LLVM heritage, je vous souhaite bien du plaisir as the French say...
>  
> From: Elmar Stellnberger
> Sent: Friday, May 22, 2015 11:49 AM
> To: dirk muysers
> Subject: Re: [M3devel] How to integrate llvm into cm3
>  
>  
> On 22.05.2015 at 10:48, dirk muysers wrote:
> 
>> Personally I have a strong dislike towards LLVM.
>> 1. You first have to compile the whole tool chain.
>> 2. It is a monstrous blob of code, mainly on Windows.
>> 3. Contrary to widespread belief, it is definitely NOT platform independent.
>> 4. It changes at every release.
>> 5. Having built your objects, you still have to run them through a platform assembler-linker.
>>  
>  
> Is it really that bad? What about the said platform dependencies you have discovered?
> I believe llvm could indeed be beneficial when it comes to debugging and/or analyzing Modula-3 programs,
> as there are tools like SAFECode and to my knowledge we never had a fully featured m3gdb.
> Besides this, I can hardly believe that llvm is still that volatile when it comes to changes.
> I know it had some issues in its early days, but I can hardly believe that qt5 on MacOS would rely on clang/llvm
> if it were not a ready-to-use technology nowadays. I would hope the main changes to llvm had already
> been made when Apple started to adopt llvm for its own needs.
> Concerning the code size of llvm, that should not be a problem as long as it remains a separate module
> compiling into its own executable or a shared library loaded in addition to other backends at runtime.
>  
> 
>> If I still had the energy of my younger years I would try to pack the platform
>> dependent part of the libraries into a dynamic load library together with a JIT
>> translator (e.g. libjit) for the portable application code and have a single byte-code
>> producing compiler backend.
>>  
>> From: Jay K
>> Sent: Friday, May 22, 2015 2:57 AM
>> To: Elmar Stellnberger ; rodney.m.bates at acm.org ; m3devel
>> Subject: Re: [M3devel] How to integrate llvm into cm3
>>  
>> Imho "all" options should be implemented, for purposes of convenient debugging/development of the backends.
>>  
>> 
>> "external" is good for developing backends. You can "snapshot" the state of things
>> slightly into the pipeline and then just iterate on later parts.
>>  
>> 
>> At the cost of having all the serialization code.
>>  
>> 
>> "integrated" is usually preferable for performance, for users.
>>  
>> 
>> E.g. NTx86 backend has been sitting in there for decades unused by half the users.
>>  
>> 
>> Having extra backends sitting in there unused is ok.
>> Ideally, agreed, they'd be .dll/.sos if we can construct it that way, but ok either way imho.
>> Ideally also cm3 would dynamically link to libm3/m3core, but it doesn't.
>>  
>> 
>> Everything is demand paged so there is cost to distribute over the network
>> and copy around, but at runtime, the pages just sit mostly cold on disk.
>>  
>> 
>> One difficulty though is the need to have and build the LLVM code.
>> For that reason, delayload-dynamically-linked might be preferable.
>> It depends on how small/easy-to-build LLVM is.
>>  
>> 
>> I guess LLVM provides more choices than before. 
>> In order of efficiency and inverse order of debuggability: 
>>   1 We could construct LLVM IR in memory and run LLVM in-proc and write .o. 
>>   2 We could write out LLVM bytes and run an executable. 
>>   3 We could write out LLVM text and run an executable.
>>  
>> 
>> > My personal preference would be to only have one default target statically compiled in
>>  
>> It has never worked that way. Granted, we didn't really have backends before, just writing mainly IR.
>> And I don't think LLVM works that way, does it?
>>  
>> I like one compiler to have all targets and just select with a command line switch.
>>  
>> I don't like how hard it is to acquire various cross-toolchains.
>> Granted, we cheat and are incomplete -- you still need the next piece of the pipeline,
>> be it LLVM or m3cc (which has one target), or a C compiler or assembler or linker or "libc.a".
>>  
>> 
>> binutils at least has this "all" notion reasonably well working now I believe.
>>  
>>  
>>  
>> There are tradeoffs though. If only one backend has a bug, and they are all statically linked together, you have to update them all.
>> And the largely wasted bloat.
>>  
>>  
>> Ultimately really, I'd like the C backend to output portable C and then just one C backend, one distribution .tar.gz for all targets.
>> There is work to do there..not easy..and no progress lately.
>> Things like INTEGER preserving flexibility in the output, and using sizeof(INTEGER) in expressions instead of using 4 or 8 and folding...
>>  
>>  
>> 
>> - Jay
>> 
>> 
>> 
>> 
>> > Date: Thu, 21 May 2015 20:13:18 +0200
>> > From: estellnb at elstel.org
>> > To: rodney.m.bates at acm.org; m3devel at elegosoft.com
>> > Subject: Re: [M3devel] How to integrate llvm into cm3
>> > 
>> > On 21.05.15 at 19:24, Rodney M. Bates wrote:
>> > >
>> > > There are pros and cons. Integrating Peter's cm3-to-llvm conversion into
>> > > the cm3 executable would be faster compiling--one fewer time per interface
>> > > or module for the OS to create a process and run an executable. But it
>> > > would also entail linking in this code, along with some of llvm's
>> > > infrastructure, into cm3, making its executable bigger, with code that
>> > > might not be executed at all, when a different backend is used. We already
>> > > have the x86 integrated backend and the C backend linked in to cm3,
>> > > whether used or not.
>> > >
>> > > Anybody have thoughts on this? I suppose it could be set up to be fairly
>> > > easily changed either way too.
>> > >
>> > 
>> > Why not put each backend into a shared library and load it dynamically?
>> > Are there still problems with shared libraries for some build targets?
>> > On the other hand, having cm3-IR at hand and being able to translate
>> > cm3-IR with an executable like m3cc into any desired target has proven
>> > to be very handy for debugging as well as checking the Modula-3
>> > compiler on a new platform.
>> > My personal preference would be to have only one default target
>> > statically compiled in, namely the one for cm3-IR, and to load all other
>> > targets dynamically from shared libraries. If that should fail for some
>> > reason, one can still use m3cc or one of its counterparts to
>> > accomplish the translation process.
>> > 
>> > Elmar
>> > 
>> > 
>> > 
>> > 
>> > 
>> >
> 
>  
