[M3devel] additional CVS repositories for additional gcc forks?

Jay K jay.krell at cornell.edu
Sat Aug 28 10:40:29 CEST 2010



>
> There is no way a debugger that has no Modula-3 awareness is going to provide
> a Modula-3-like view. The operators will have C spellings and C semantics,


How many operators do people use in a debugger?

I use very few. Partly because for a long time I used a debugger
with a great gui and an awful expression evaluator.


Still, I use basically only "+" "->" "*" (dereference) and "=" for assignment.
 Sometimes multiplication and subtraction.
I agree it would be nice if all the C debuggers would be lenient about "->" vs. ".".
  That would unify Modula-3, Java, C#, C, C++.
   Except where C++ has an operator-> overload. But operator overload
   is an area where.. tangent... C++ is a great language..my compiler implements
   it well..but my debugger, my editor, plain text search.. can't cope with it.
   Modula-3, C#, Java run afoul of plain text search too -- anything with prevalent "scoped names".
   In C you get Window_Init, File_Open, etc. never just Init or Open.
   How do you search for calls to operator+ in C++? For a certain type?
   In C, except for the builtin types, they'd be unique function names.
  Anyway, tangent over.


+ is the same in the various languages.

I think "=", ":=", "==" are the main problem.
You might try a compare and accidentally do an assignment.
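
To make the "=" hazard concrete, here is a minimal sketch in C. The function name and values are made up for illustration. It shows the expression one meant to type next to the one a Modula-3 habit produces.

```c
#include <assert.h>

/* Illustrative only: shows how "=" and "==" differ as C expressions. */
void compare_vs_assign_demo(void)
{
    int count = 3;

    /* What was intended: a comparison. count is left untouched. */
    int is_five = (count == 5);
    assert(is_five == 0 && count == 3);

    /* What a Modula-3 habit produces: in Modula-3 "=" compares,
       in C it assigns. count is now 5 and the expression is nonzero. */
    int oops = (count = 5);
    assert(oops == 5 && count == 5);
}
```

Typed at a C debugger prompt, the second form would silently change the variable in the stopped program rather than test it.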


> The syntax will be strictly C.

Almost the same.


 > The display of values will be C.

Almost the same.

Also if you have a particularly good C compiler/debugger, we could do
  #define AND &&  
  #define OR ||  


getting you back those two operators, which I rarely use in a debugger.
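
A small sketch of what that could look like in generated C. The predicate names are hypothetical. As far as I know, gcc only records macro definitions in the debug information at -g3, and gdb can then expand them in expressions; whether other debuggers do the same is an assumption to check. Standard C also ships lowercase `and` and `or` in <iso646.h>.

```c
#include <assert.h>

/* Modula-3 spellings for the C short-circuit operators, as in the post. */
#define AND &&
#define OR  ||

/* Hypothetical predicate written with the Modula-3-style spelling. */
int in_range(int value, int low, int high)
{
    assert(low <= high);            /* caller must pass a sane range */
    return value >= low AND value <= high;
}

/* Hypothetical predicate showing OR. */
int out_of_range(int value, int low, int high)
{
    assert(low <= high);
    return value < low OR value > high;
}
```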


 > TEXT won't work in any reasonable way at all.

Sure it might.
In Visual Studio you can write little addins to help the debugger display stuff.
I believe there is a small builtin "language" or I believe you can write actual code.
In Windbg you can write little plugins. You could provide like !m3.text.
I don't know if you can tell the debugger ahead of time how to custom display types.
I don't know if gdb has a story here.
Still, one might imagine a *small* patch to gdb.


 > Demangling names in the compiler's debug output would make them look nice, but then the Modula-3
 > type info would be lost, and output formats would lose.


Um, you think maybe this stuff was done the wrong way in the first place?
 That the names shouldn't be mangled in the first place?
  I strongly suspect so. Other systems don't depend on this.
   (Yes, I know about C++ name mangling, and even though it does something similar,
   that's a trick for the linker and not how debug information works. It is for use
   in the absence of debug information, among other reasons.)

There is "naturally" type information you get just by building up decent gcc trees.
Ditto for intermediate C code.
For a while now, you know, every record is a void* or just has a size, and all the type information
is buried in the names. This is questionable. I'm sure it has some advantages.
You can describe things maybe not easily described in C.
  e.g. Subranges?
And then our code in m3gdb is probably very portable, in that, I think, we just ferry along
some strings, from our code to our code, and we can decipher them the same in all systems.
I think, I'm not sure, there is like no dependence on the vagaries of coff, dwarf, etc., and
what they can or cannot represent. However there is a dependency on stabs being available.
It is not for example available on HP-UX.


Furthermore the lack of correct type information, apart from stabs, causes problems.
For some targets the backend wants accurate type information to pass records by value.
I again/still think we should probably not rely on the backend for this anyway.
We should probably make a copy and pass a pointer to it, kind of like m3x86 does.
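
A minimal sketch of the "make a copy and pass a pointer" idea, written as C. The record type and function names are hypothetical, and this is not a description of what m3x86 actually emits. It only shows that by-value semantics survive when the caller copies and the callee receives an address.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical record type standing in for a Modula-3 RECORD. */
typedef struct {
    int x;
    int y;
} Point;

/* Callee as generated code might declare it: the record arrives by
   address, but it is the caller-made copy, so it may be modified freely. */
int sum_and_clobber(Point *p)
{
    int sum = p->x + p->y;
    p->x = 0;                   /* only the copy changes */
    return sum;
}

/* Caller side of a by-value call: copy first, then pass the copy's address. */
int call_by_value(const Point *original)
{
    Point copy;
    assert(original != NULL);
    memcpy(&copy, original, sizeof copy);
    return sum_and_clobber(&copy);
}
```

The backend then only ever sees a pointer parameter, so it needs no accurate record layout to get the calling convention right.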


 > Things that use pointers at the machine level can never know whether the pointers
 > point to a single value or an array, and if the latter, with what bounds.


C programmers can cope with that. Can't we?
And..I admit.. I don't know what our machine level mapping looks like.
Do we pass a pointer and a size as two parameters? Or a small record with pointer/size by value?
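
The two possibilities in that question can be sketched in C as follows. All names are hypothetical and neither is claimed to be the actual Modula-3 mapping. Either way a C debugger has both the pointer and the count in view; in gdb, something like `print *elems@count` should show the elements.

```c
#include <assert.h>
#include <stddef.h>

/* Option 1: pointer and element count as two separate parameters. */
long sum_two_params(const int *elems, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += elems[i];
    return total;
}

/* Option 2: a small descriptor record carrying both, passed by value. */
typedef struct {
    const int *elems;   /* first element */
    size_t     count;   /* number of elements */
} IntOpenArray;         /* hypothetical name */

long sum_descriptor(IntOpenArray a)
{
    assert(a.elems != NULL || a.count == 0);
    return sum_two_params(a.elems, a.count);
}
```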


The debugger need not be a full blown Modula-3 interpreter.


> Probably the worst thing will be calls. They just don't work without the debugger
> having knowledge of a lot of stuff. There are extra hidden parameters, method
> calls, passing procedure-typed parameters with environments, calling the same,
> the three modes of Modula-3, etc. I consider calls in debugger commands very
> valuable.


I use calls very rarely.
I'm not super keen on running some of my code when otherwise my code is all frozen
and some of it is misbehaving. I know this is partly me.


Even so, generally you only call certain functions that were put there for use from a debugger, right?
Like gcc's debug_node or such?
And they tend to not be fancy?
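
A sketch of the kind of plain helper meant here, in the spirit of gcc's debug_node. The struct is a simplified, hypothetical stand-in; the real Modula-3 TEXT representation is not assumed. The point is only that such a function takes one ordinary argument and has no hidden parameters.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a text value. */
typedef struct {
    const char *chars;  /* not necessarily NUL-terminated */
    int         length; /* number of valid characters */
} DebugText;

/* Plain helper meant to be called from a debugger prompt.
   It allocates nothing, tolerates NULL, and returns the length printed. */
int debug_text(const DebugText *t)
{
    if (t == NULL || t->chars == NULL) {
        fputs("<nil>\n", stderr);
        return 0;
    }
    assert(t->length >= 0);
    fprintf(stderr, "\"%.*s\"\n", t->length, t->chars);
    return t->length;
}
```

From gdb this would be invoked with something like `call debug_text(&some_text)`, where some_text is whatever variable is in scope.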

And the extra parameters..debugger would complain about missing them, programmer would figure it out?


I'm not saying there aren't drawbacks here.
But there are also major advantages.
There are major costs and drawbacks to our current approach.
  We have a ton of extra code.
  Which I don't think we are well equipped for.
  Maybe Tony is. Maybe someone else is. I'm not.
  

Partly, I'll admit, anything I write, I am much more able to maintain.
Or, another lazy angle, anything smaller is easier to maintain.


In gcc we have a large code base. It takes me a long time to get just slightly up to speed on it.
We have several nagging problems with it. Maybe I just need to look at the C front end more.
Or read tree.h. I don't know.


 4.5.1 doesn't work with SPARC32_SOLARIS/SOLgnu/SOLsun. 
 4.3.5 maybe not either. 
 A few optimizations I have turned off for 4.5.1 because they cause problems. Including inlining. 
 Maybe I just need to debug more. 
 Apple and OpenBSD each maintain their own forks. So that, *sort of but not really*, triples things.
   (now, they are all highly related, so it doesn't) So far we don't have the OpenBSD fork.
  But for example 4.5.1 doesn't quite have the OpenBSD/powerpc stuff. And there is a small OpenBSD/mips64
   problem I worked around. Minor, I guess. We could just drop these platforms, or OpenBSD entirely.
  Not a huge deal. But it is yet more stuff that C as an intermediate platform solves.
  Exception handling stinks on all platforms but SOLsun/SOLgnu.
   The ALPHA_OSF code no longer works. I tried.
  Generating C/C++ would significantly improve exception handling nearly across the board.
   It is possible otherwise, but much more difficult.


So, again, C as intermediate code isn't perfect or without drawbacks, but it promises:
  greatly increased portability 
  more efficient exception handling 
  better codegen by letting the optimizer be full on, even across modules with some C compilers (gcc 4.5, Visual C++, at least) 
  better debugging with stock debuggers (including Visual C++, windbg) 
  a portable distribution format -- no more having to distribute binaries, though they still have advantages 
     easier to get into the various "ports" systems I think as a result 
  a much smaller system overall (no GPL, if it matters) 


Again, there are drawbacks, but it just seems so very tempting.


I'm sure I'll plug away at m3cc a while longer, but I think more and more it is questionable.


I can try again to read the LLVM stuff.


A new backend I think is unavoidably a lot of work, be it C or LLVM.
That's my hangup on both of these. It requires knowing a lot about two big things -- M3CG and the underlying generator.
parse.c is "only" 6,000 lines but pretty dense in terms of information gone into it. Maybe I'm just feeling dumb.


The thing about C though, is it is a very well understood next layer down. Certainly compared to
the gcc trees or LLVM. I don't think it is just me, that I'm some C expert.



> This could probably be improved a lot by switching to a better debug info format,
> probably the latest Dwarf variant. But that is a big job.


I don't believe we have to do much of *anything* to switch debug formats.
We just have to provide gcc with decently formed typeful trees.
It should do the rest.
Currently I guess it is all custom.

I tried -g again, thinking maybe things were better now. It still crashes.
It seems related to the fact that _m3_fault is in an unknown location.
But that seems actually deliberate and reasonable, and I tried fiddling with it anyway.
No luck yet.


This debug format problem is also solved by using C intermediate code.
You just use -g or -ggdb or -Zi or whatever, whatever is normal for C, and it'd just work.


Anyway..
 - Jay

