[M3devel] multi-threaded m3front?

Darko Volaric lists at darko.org
Tue Aug 11 21:32:37 CEST 2015


Jay, I can see the complexity of the task, but can we have an option for
users who cannot invest time in this?

What would be great is if you could set up the scripts in a static
environment available to all that builds various platforms and back ends
that are requested or that you feel like maintaining. Having it so it could
auto build based on a particular version from github would be optimal.

Surely this would help with developing and debugging your scripts and
backend and would solve the problem of other people getting the environment
right.

I have dozens of servers (x86 and ARM) with ample memory and storage plus
PDUs that allow powering the servers up and down over the net. If you want
to set up the environments and scripts I can set up the access and install
Mac OS and whatever Unixes on some of them. You can have carte blanche as
to what you do with them, I don't care as long as working compilers pop out
somewhere.

Maybe others are interested in helping with this too.



On Tue, Aug 11, 2015 at 11:56 AM, Jay K <jay.krell at cornell.edu> wrote:

>  The problem with cross compiling is you need:
>   a working cross-compiler for C/C++
>   a working cross-assembler
>   a working cross-linker
>
>
>    If you have those, it is fairly easy.  *
>    Getting those is often not easy.
>    Esp. the C/C++ compiler, as you need headers/libraries/startup code;
>    this is often called a "sysroot" or "buildroot".
>
>
>  * fairly easy:
>   for NT386 -- nothing extra to do, the backend is in cm3
>   There are endianness bugs in mklib, but it works on the majority of
> systems -- little endian hosts. I can fix it.
>
>   For the C backend -- nothing extra to do, the backend is in cm3.
>
>   For the gcc-based backend -- fairly easy -- you need to rebuild a
> different cm3cg
>   and point the config at it. It kind of is already set up to work.
>
>
>   For the LLVM backend, probably also easy.
>
>
>  I think on Debian and Gentoo, cross C compilers at least to every Linux
> architecture are readily available.
>  But I don't know about otherwise.
>
>
>  In as much as the compiler is gcc, the assembler GNU as, and the linker
> GNU ld, that gets you far,
>  but you still need the headers/libraries/startup code.
>
>
>   If your C compiler is LLVM, that might also help for
> compiler/assembler/linker.
>   But again, the headers/libraries/startup code.
>
>
>   There are various systems out there that try to automate this but I
> don't have much experience
>   with them, and they often are slightly different than what we want.
>
>
>  Otherwise, what we have setup is we cross compile to assembly files, copy
> them to the target,
>  which typically does have the compiler/assembler/linker, and finish there.
>
>
>  What remains there is that I only have this automated to build cm3.
>  It should at least also build m3core and libm3 statically, and
> thereby reach
>  approximate parity with the 3.6 "boot" archives and install process.
>
>
>  Has anyone here installed 3.6? And consider their method a decent goal
> and stopping point?
>  Today is slightly different since quake has been rewritten from C to
> Modula-3 in the 4.0 timeframe.
>
>
>  And *possibly* boot archives should contain everything -- going beyond
> what 3.6 did.
>
>
>
>  This is just a matter of work in scripts/python/pylib.py.
>  Copying stuff into sub directories and generating a hierarchical recursive
> or not make system in there.
>  My make skills aren't great, and I've been hung up on details like
> depending on GNU make or trying to
>  be more portable.
>
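
As a rough illustration of the "not recursive" option mentioned above, a generator for a flat Makefile could look like the Python below. This is a sketch under assumptions, not what scripts/python/pylib.py actually does: the function name flat_makefile, the file names in the usage note, and the idea that the target's cc driver accepts .s files with -c and -o are all hypothetical. It emits only macro definitions and explicit rules, so it avoids GNU make extensions.

```python
def flat_makefile(program, asm_files, cc="cc"):
    """Return the text of a non-recursive Makefile with explicit rules only.

    program   -- name of the executable to link (hypothetical)
    asm_files -- assembly files, possibly in sub directories, ending in ".s"
    cc        -- compiler driver on the target, assumed to assemble .s files
    """
    for name in asm_files:
        if not name.endswith(".s"):
            raise ValueError(f"not an assembly file: {name}")
    objs = [name[:-2] + ".o" for name in asm_files]  # foo.s -> foo.o
    lines = [
        f"CC = {cc}",
        f"OBJS = {' '.join(objs)}",
        "",
        f"{program}: $(OBJS)",
        f"\t$(CC) -o {program} $(OBJS)",  # recipe lines must start with a tab
        "",
    ]
    # One explicit rule per object, so no pattern or suffix rules are needed.
    for src, obj in zip(asm_files, objs):
        lines += [f"{obj}: {src}", f"\t$(CC) -c -o {obj} {src}", ""]
    return "\n".join(lines)
```

Calling flat_makefile("cm3", ["Main.s", "sub/Util.s"]) returns a Makefile that builds everything from the top directory. Linking libraries and target-specific flags are left out on purpose.
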
>
> On the threading area, I believe there is a simple almost ideal design.
>
>  - parse and mostly "execute" all the quake code almost unchanged, single
> thread
>
>  - difference starts at about the last line of quake where it says library
> or program
>
>  - possibly though queueing every file to a "prefetcher" thread, it just
> reads
>    every file into a reused buffer and throws out the data, populating the
> operating
>    system's file system cache; or possibly mmaping every file and keeping
> it mapped,
>    and touching every page; doing this in a somewhat thread aware/safe way
> for the
>    upcoming actual accesses
>
>  - once quake is done, one of two choices:
>
>  simple but not optimal: do a single threaded parse of every interface
>  less simple: parsing here can be in parallel, depending on the dependency
> graph;
>  you'd just start up a thread per interface, but block on parsing its
> dependencies;
>  as you get away from the root (RT0.i3), the tree should get ever wider
>
>
>  You would either manually throttle the number of threads, or rely on an
> underlying threadpool.
>
>
>  NOTE: Modula-3 used to have a thread pool on Win32 but I removed it to be
> simpler.
>  Maybe that wasn't good. Simpler as in, including, thread id is now just
> the underlying
>  Win32 thread id. It wasn't before.
>
>  Win32 has had a thread pool since circa Windows 2000. That might be
> profitable to use.
>
>  - once parsing interfaces is done, I believe codegen of every interface,
> and parsing and codegen
>   of every module can proceed with arbitrary parallelism
>
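
The "less simple" choice above (parse interfaces in parallel, but never before their dependencies) can be sketched in Python as follows. Instead of starting one thread per interface and blocking inside it, this variant only submits an interface to a bounded thread pool once everything it imports is parsed. It is a different but equivalent scheduling, chosen so that blocked tasks cannot fill up the pool. The names parse_interfaces, deps and parse are hypothetical; parse stands in for the real front end.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter  # Python 3.9+


def parse_interfaces(deps, parse, max_workers=4):
    """Parse every interface once, never before the interfaces it imports.

    deps  -- maps an interface name to the set of interfaces it imports
    parse -- callable taking one interface name (stand-in for the parser)
    Returns the names in the order their parses completed.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular imports
    finished = []
    pending = {}  # future -> interface name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Everything returned here already has all its imports parsed.
            for name in sorter.get_ready():
                pending[pool.submit(parse, name)] = name
            done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any error from parse
                name = pending.pop(future)
                finished.append(name)
                sorter.done(name)  # may make dependents ready
    return finished
```

With deps = {"RT0": set(), "Alpha": {"RT0"}, "Beta": {"RT0"}, "Gamma": {"Alpha", "Beta"}} (all names except RT0 invented), RT0 is parsed alone first, Alpha and Beta may then run concurrently, and Gamma runs last. max_workers is the manual throttle.
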
>
>  Fetching interface/module contents might synchronize with the prefetcher
> depending
>  on which of the two approaches -- if the prefetcher is just prefetch and
> discard,
>  populating the file system cache, then later compilation just refetches
> obliviously/unchanged.
>  If the prefetcher thread mmaps and keeps everything, then serialize with
> it.
>
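
The prefetch-and-discard variant could be sketched like this in Python. The names prefetch_file and start_prefetcher, the 1 MiB buffer size, and the None sentinel are assumptions for illustration; they are not taken from cm3. The mmap-and-keep variant is not shown.

```python
import queue
import threading

CHUNK = 1 << 20  # size of the reused buffer; an arbitrary choice


def prefetch_file(path, buf):
    """Read the whole file into buf, chunk by chunk, discarding the data.

    The only purpose is to populate the operating system's file cache.
    Returns the number of bytes read.
    """
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            n = f.readinto(buf)
            if not n:  # 0 means end of file
                return total
            total += n


def prefetcher(paths):
    """Thread body: prefetch queued paths until the None sentinel arrives."""
    buf = bytearray(CHUNK)  # one buffer, reused for every file
    while True:
        path = paths.get()
        if path is None:
            return
        try:
            prefetch_file(path, buf)
        except OSError:
            pass  # leave error reporting to the real compile step


def start_prefetcher():
    """Start the background thread; returns (queue, thread)."""
    paths = queue.Queue()
    thread = threading.Thread(target=prefetcher, args=(paths,), daemon=True)
    thread.start()
    return paths, thread
```

The quake-driving code would put each source path on the queue as it is discovered and put None once it is finished. Compilation then opens the files in the normal way and needs no synchronization with this thread.
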
>
>  Also, you might have multiple prefetcher threads. What you really want
> is..difficult.
>  You want a prefetcher thread per spindle. How to count them up?
>  If you have an SSD, then prefetching might not buy much and just forget
> about it.
>  The thing is, if you have spindles, this might be the most gain, and if
> you have an SSD,
>  the cost might be negligible or slightly beneficial.
>
>
>  I/O is so often the bottleneck, at least in the days of rotating storage.
>
>
>  In the event that the file system cache is small, or the source for a
> package
>  really large, prefetching can be counterproductive -- moving stuff
> through the cache
>  only to flush it before it is used and fetching it a second time.
>
>
>  I expect on most systems this is not a concern though.
>
>
>  One of the pieces of work here, I believe, is to move most globals in
> m3front
>  to be in Module.T. This would actually make things cleaner imho.
>  Yes, it consolidates knowledge that you might want distributed.
>  And accessing context'ed data is slightly slower than globals.
>  But imho globals are bad and everything should be placed in *some* context
>  other than the process or thread.
>
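
A small illustration of the globals-to-context idea, in Python rather than Modula-3. ModuleContext plays the role of Module.T here; its fields (errors, next_uid) and the compile_module function are invented for the example and do not correspond to anything in m3front.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ModuleContext:
    """Per-module compile state; every field name here is hypothetical."""
    name: str
    errors: list = field(default_factory=list)
    next_uid: int = 0

    def new_uid(self):
        self.next_uid += 1
        return self.next_uid


def compile_module(ctx, n_decls):
    """Stand-in for compilation: touches only state reachable from ctx."""
    for _ in range(n_decls):
        ctx.new_uid()
    if n_decls == 0:
        ctx.errors.append(f"{ctx.name}: empty module")
    return ctx


def compile_all(sizes):
    """Compile each module on its own thread; sizes maps name -> decl count."""
    contexts = {name: ModuleContext(name) for name in sizes}
    threads = [
        threading.Thread(target=compile_module, args=(contexts[name], count))
        for name, count in sizes.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return contexts
```

Because no thread touches another module's context, no locking is needed for this state. With process-wide globals, the counter and the error list would be shared and would need serialization.
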
>
>  I would not advocate compiling functions within a module in parallel,
> only modules.
>
>
>  Similarly, code generators must have no globals, and put all the state in
> the cg object.
>
>
>  - Jay
>
>
>
>
> ------------------------------
> Date: Tue, 11 Aug 2015 10:20:26 -0700
> From: lists at darko.org
> To: wagner at elegosoft.com
> CC: m3devel at elegosoft.com; jay.krell at cornell.edu
> Subject: Re: [M3devel] multi-threaded m3front?
>
> I couldn't agree with you more. I think being able to compile or cross
> compile the system without spending hours (or days) hacking
> scripts/environments would be a huge step forward for the project. Or
> having compiler binaries for more than two platforms (four if you count
> different word sizes). Meanwhile I've never heard anyone complain that the
> compiler is too slow.
>
> But of course everyone has different priorities, different interests and
> different opinions in a volunteer project so any discussion on this subject
> will inevitably boil down to "he who writes the code determines the
> priorities."
>
>
>
>
> On Tue, Aug 11, 2015 at 12:13 AM, Olaf Wagner <wagner at elegosoft.com>
> wrote:
>
> On Mon, 10 Aug 2015 22:00:17 -0700
> Darko Volaric <lists at darko.org> wrote:
>
> > A more fruitful approach might be pipelining the compiler:
> >
> > - files enter the pipeline in reverse dependency order
> > - have a thread (stage) to read those files into memory
> > - have a thread to tokenize and parse the files and form the data
> >   structure
> > - have a thread to do intermediary processing
> > - have a thread to generate the output representation for the back end
> >
> > You'll only get a maximum 4x speedup and you'll only be as fast as the
> > slowest thread, but you can reliably tune the divisions based on simple
> > benchmarking of each thread. You're very likely to get somewhat close to
> > the 4x speedup. This works best when there is a 1:1 between pipeline
> > stages and cores. If you were ambitious you could attempt an 8 stage
> > pipeline for 8 core processors.
> >
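
The pipeline described above can be sketched with one thread per stage and a bounded queue between neighbours. This is only a structural sketch in Python: run_pipeline, stage and ideal_speedup are hypothetical names, and in standard CPython builds threads running pure-Python CPU work do not execute in parallel, so it shows the shape of the design rather than the speedup. ideal_speedup spells out the "only as fast as the slowest thread" arithmetic: in steady state one item leaves the pipeline every max(stage time), against sum(stage times) for a single thread.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream


def stage(work, inbox, outbox):
    """Apply work to each item from inbox and pass the result to outbox."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # propagate shutdown downstream
            return
        outbox.put(work(item))


def run_pipeline(items, stages, depth=8):
    """Push items through the stage functions, one thread per stage."""
    queues = [queue.Queue(maxsize=depth) for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]

    def feed():
        for item in items:
            queues[0].put(item)
        queues[0].put(DONE)

    threads.append(threading.Thread(target=feed))
    for t in threads:
        t.start()
    results = []
    while True:
        out = queues[-1].get()
        if out is DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


def ideal_speedup(stage_times):
    """Steady-state speedup over one thread doing all stages in sequence."""
    return sum(stage_times) / max(stage_times)
```

Four equal stages give ideal_speedup([1, 1, 1, 1]) == 4.0, while one stage taking twice as long drops it to ideal_speedup([1, 2, 1, 1]) == 2.5, which is why tuning the stage divisions matters.
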
> >
> > On Sat, Aug 8, 2015 at 8:05 PM, Jay K <jay.krell at cornell.edu> wrote:
> >
> > > Is anyone interested in updating m3front to be multi-threaded?
> > >
> > > I haven't seen a single core system in a while.
> > >
> > > Surely each module can be compiled separately, possibly with some
> > > serialization around compiling interfaces?
>
> While this is all correct, I'd like to remark that in my experience
> optimizing for performance has always got in the way of clear,
> understandable and maintainable code.
>
> So while there are semantic refactorings and feature additions in the
> pipeline I would not like to see anybody rework the whole compiler
> with the intention of making it faster. I would rather see the few
> active programmers interested in M3 concentrate on more backends, better
> intermediate code, better code optimization, better maintainability
> etc.
>
> Olaf
> --
> Olaf Wagner -- elego Software Solutions GmbH -- http://www.elegosoft.com
>                Gustav-Meyer-Allee 25 / Gebäude 12, 13355 Berlin, Germany
> phone: +49 30 23 45 86 96  mobile: +49 177 2345 869  fax: +49 30 23 45 86
> 95
> Geschäftsführer: Olaf Wagner | Sitz: Berlin
> Handelregister: Amtsgericht Charlottenburg HRB 77719 | USt-IdNr:
> DE163214194
>
>
>
> _______________________________________________ M3devel mailing list
> M3devel at elegosoft.com
> https://mail.elegosoft.com/cgi-bin/mailman/listinfo/m3devel
>