[M3devel] additional CVS repositories for additional gcc forks?
Jay K
jay.krell at cornell.edu
Sun Aug 29 02:21:31 CEST 2010
PS: we don't even have m3gdb for all systems, e.g. Darwin, and the Windows debuggers are much better than anything I've seen on Unix. On these systems, intermediate C would improve debugging a lot. Though I've also been improving Darwin gdb.
Also you seem to confuse C name mangling with what Modula-3 does. They are quite different. C only mangles things with linkage, for linking reasons, not for debugging information. Locals, parameters, record fields: no mangling. C code analogous to what Modula-3 allows would survive with everything being extern C, no name mangling.
In both cases, as I understand it, it is an effective hack to tunnel information through systems not quite designed or extended to suit.
What we have is flawed. What I favor is flawed. But differently.
- Jay/phone
> Date: Sat, 28 Aug 2010 14:15:02 -0500
> From: rodney_bates at lcwb.coop
> To: m3devel at elegosoft.com
> Subject: Re: [M3devel] additional CVS repositories for additional gcc forks?
>
>
>
> Jay K wrote:
> >
> >> There is no way a debugger that has no Modula-3 awareness is going to provide
> >> a Modula-3-like view. The operators will have C spellings and C semantics,
> >
> >
> > How many operators do people use in a debugger?
> >
> > I use very few. Partly because for a long time I used a debugger
> > with a great gui and an awful expression evaluator.
> >
> >
> > Still, I use basically only "+" "->" "*" (dereference) and "=" for assignment.
> > Sometimes multiplication and subtraction.
> > I agree it would be nice if all the C debuggers would be lenient about "->" vs. ".".
> > That would unify Modula-3, Java, C#, C, C++.
> > Except where C++ has an operator-> overload. But operator overload
> > is an area where.. tangent... C++ is a great language..my compiler implements
> > it well..but my debugger, my editor, plain text search.. can't cope with it.
> > Modula-3, C#, Java run afoul of plain text search too -- anything with prevalent "scoped names".
> > In C you get Window_Init, File_Open, etc. never just Init or Open.
> > How do you search for calls to operator+ in C++? For a certain type?
> > In C, except for the builtin types, they'd be unique function names.
> > Anyway, tangent over.
> >
> >
> > + is the same in the various languages.
> >
> > I think "=", ":=", "==" are the main problem.
> > You might try a compare and accidentally do an assignment.
> >
> >
> >> The syntax will be strictly C.
> >
> > Almost the same.
> >
> >
> > > The display of values will be C.
> >
> > Almost the same.
>
> On the strength of your comments, I rest my case.
>
> >
> > Also if you have a particularly good C compiler/debugger, we could do
> > #define AND &&
> > #define OR ||
> >
> >
> > getting you back those two operators, which I rarely use in a debugger.
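A minimal sketch of the macro idea quoted above, assuming the two macros sit in a header that every generated C file includes. The helper names `in_range` and `out_of_range` are hypothetical, made up only for illustration. Whether a debugger accepts `AND` in an expression depends on the compiler recording macro definitions in the debug info; with gcc that generally means `-g3`.

```c
#include <assert.h>
#include <stdbool.h>

/* Modula-3-style spellings for the C short-circuit operators. */
#define AND &&
#define OR  ||

/* Hypothetical helper: true when lo <= x <= hi. */
static bool in_range(int x, int lo, int hi)
{
    assert(lo <= hi);              /* caller must pass a sane range */
    return lo <= x AND x <= hi;
}

/* Hypothetical helper: true when x lies outside [lo, hi]. */
static bool out_of_range(int x, int lo, int hi)
{
    assert(lo <= hi);
    return x < lo OR x > hi;
}
```

Standard C also ships lowercase `and` and `or` macros in `<iso646.h>`, which would avoid defining anything new.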
> >
> >
> > > TEXT won't work in any reasonable way at all.
> >
> > Sure it might.
> > In Visual Studio you can write little addins to help the debugger display stuff.
> > I believe there is a small builtin "language" or I believe you can write actual code.
> > In Windbg you can write little plugins. You could provide like !m3.text.
> > I don't know if you can tell the debugger ahead of time how to custom display types.
> > I don't know if gdb has a story here.
> > Still, one might imagine a *small* patch to gdb.
>
> All of which is just different ways of providing a debugger with proper Modula-3 support.
>
> >
> >
> > > Demangling names in the compiler's debug output would make them look nice, but then the Modula-3
> > > type info would be lost, and output formats would lose.
> >
> >
> > Um, you think maybe this stuff was done the wrong way in the first place?
> > That the names shouldn't be mangled in the first place?
> > I strongly suspect so. Other systems don't depend on this.
> > (Yes, I know about C++ name mangling, and even though it does something similar,
> > that's a trick for the linker and not how debug information works. It is for use
> > in the absence of debug information, among other reasons.)
> >
>
> I think it could be done a lot better if switching to a better debug info format.
> stabs may have been the best option around at the time it was done. And I don't
> know if Modula-3's structural type equivalence rules could be supported any better
> without the uid's.
>
> But there will still have to be name mangling to get a standard linker to work,
> with just about any language. Stock gdb does what it does in part because it
> has builtin demanglers for the various languages it supports. It chooses the
> appropriate demangler dynamically. It would be a pretty ugly user interface if
> it didn't.
>
> > There is "naturally" type information you get just by building up decent gcc trees.
> > Ditto for intermediate C code.
> > For a while you know, every record is a void* or just has a size, and all the type information
> > is buried in the names. This is questionable. I'm sure it has some advantages.
> > You can describe things maybe not easily described in C.
> > e.g. Subranges?
> > And then our code in m3gdb is probably very portable, in that, I think, we just ferry along
> > some strings, from our code to our code, and we can decipher them the same in all systems.
> > I think, I'm not sure, there is like no dependence on the vagaries of coff, dwarf, etc., and
> > what they can or cannot represent. However there is a dependency on stabs being available.
> > It is not for example available on HP-UX.
> >
> >
> > Furthermore the lack of correct type information, apart from stabs, causes problems.
> > For some targets the backend wants accurate type information to pass records by value.
> > I again/still think we should probably not rely on the backend for this anyway.
> > We should probably make a copy and pass a pointer to it, kind of like m3x86 does.
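A rough sketch of the "make a copy and pass a pointer to it" lowering suggested above, written as the C a front end might emit. The `Point` record and both function names are hypothetical; this is not taken from m3x86 or from parse.c, only an illustration of the idea that the back end then never sees a by-value record parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type standing in for a Modula-3 RECORD. */
typedef struct {
    int x;
    int y;
} Point;

/* Callee: receives a pointer, and may scribble on what it points to,
   because the caller always hands it a private copy. */
static int sum_and_clear(Point *p)
{
    assert(p != NULL);
    int s = p->x + p->y;
    p->x = 0;
    p->y = 0;
    return s;
}

/* Caller-side lowering of a by-value call:
   copy the actual, then pass the address of the copy. */
static int call_by_value(const Point *actual)
{
    Point copy = *actual;          /* the caller makes the copy */
    return sum_and_clear(&copy);   /* the back end only sees a pointer */
}
```

The caller's record is untouched after the call, which is the by-value guarantee, and the calling convention for structs no longer matters.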
> >
>
> Certainly, a back end needs type information for code generation. That won't do much to
> help a debugger that is oblivious to the source language.
>
> >
> > > Things that use pointers at the machine level can never know whether the pointers
> > > point to a single value or an array, and if the latter, with what bounds.
> >
> >
> > C programmers can cope with that. Can't we?
>
> Look at the security advisories. Buffer overflow, buffer overrun, buffer overflow, ...
> over and over. Almost all of them are buffer overruns. But that's a tangent too.
>
> > And..I admit.. I don't know what our machine level mapping looks like.
> > Do we pass a pointer and a size as two parameters? Or a small record with pointer/size by value?
> >
>
> If it's a fixed array, it's all in the static type, which the stabs (as extended for Modula-3)
> info conveys. If it's an open array, there is runtime dope: a pointer to the zero-th array
> element, followed by a shape, which is just a list of words giving the NUMBER of each dimension.
> The dimension count is statically known. The dope is generally passed by reference, although it is never
> altered. For heap-allocated arrays, it's located right in the heap object, at the beginning,
> making the pointer redundant. For formal parameters, if the actual doesn't already have the
> needed dope, it is constructed at runtime by code at the call site. This works for, e.g.,
> passing a fixed array or a SUBARRAY to a formal that is open.
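The dope layout described above can be sketched in C roughly as follows, for the one-dimensional case. The type and function names are hypothetical, and the field types are an assumption (the message only says "a pointer to the zero-th array element, followed by a shape"); the real compiler's layout may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dope for a one-dimensional open array of int:
   element pointer first, then one shape word per dimension. */
typedef struct {
    int   *elts;      /* address of element 0 */
    size_t shape[1];  /* NUMBER of each dimension; the count is static */
} OpenArray1;

/* Callee: the formal is an open array, received by reference to its dope. */
static long sum(const OpenArray1 *a)
{
    long total = 0;
    for (size_t i = 0; i < a->shape[0]; i++)
        total += a->elts[i];
    return total;
}

/* Call site: the actual is a SUBARRAY of a fixed array, which has no dope
   of its own, so the caller constructs it at runtime. */
static long sum_subarray(int *fixed, size_t fixed_len,
                         size_t from, size_t count)
{
    assert(from <= fixed_len && count <= fixed_len - from); /* range check */
    OpenArray1 dope = { fixed + from, { count } };
    return sum(&dope);
}
```

A debugger that sees only `int *elts` has no way to know that the word after it is a length, which is why this layout needs either language-aware support or debug info able to describe it.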
>
> BTW, this is another thing a debugger has to know about to either pass, display, or alter
> open array values. Perhaps Dwarf is sophisticated enough that it could just be encoded in
> Dwarf, but certainly not in stabs. (Well, we could probably cobble up yet another stabs
> extension, but that would still require specialized debugger support.)
>
> >
> > The debugger need not be a full blown Modula-3 interpreter.
> >
> >
> >> Probably the worst thing will be calls. They just don't work without the debugger
> >> having knowledge of a lot of stuff. There are extra hidden parameters, method
> >> calls, passing procedure-typed parameters with environments, calling the same,
> >> the three modes of Modula-3, etc. I consider calls in debugger commands very
> >> valuable.
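To make the "extra hidden parameters" and "procedure-typed parameters with environments" concrete, here is a small C sketch of one common way to lower a nested procedure: the enclosing frame is passed as a hidden first argument (a static link), and a procedure value becomes a code pointer paired with that environment. All names are hypothetical and the representation is an assumption for illustration, not a description of what cm3 or m3cc actually emits.

```c
#include <assert.h>
#include <stddef.h>

/* Locals of a hypothetical outer procedure that its nested
   procedure needs to reach. */
typedef struct {
    int base;
} OuterFrame;

/* Nested procedure: the first parameter is the hidden static link. */
static int add_base(OuterFrame *static_link, int x)
{
    return static_link->base + x;
}

/* A procedure-typed value with environment: code pointer plus static link. */
typedef struct {
    int (*code)(OuterFrame *, int);
    OuterFrame *env;
} Closure;

/* Takes a procedure-typed parameter and calls it twice,
   supplying the hidden argument on each call. */
static int apply_twice(Closure f, int x)
{
    assert(f.code != NULL);
    return f.code(f.env, f.code(f.env, x));
}

/* The outer procedure builds the closure over its own frame. */
static int outer(int base, int x)
{
    OuterFrame frame = { base };
    Closure f = { add_base, &frame };
    return apply_twice(f, x);
}
```

Typing `call add_base(1)` in a language-unaware debugger fails here: the user would have to find the right frame address by hand and pass it as the first argument.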
> >
> >
> > I use calls very rarely.
> > I'm not super keen on running some of my code when otherwise my code is all frozen
> > and some of it is misbehaving. I know this is partly me.
> >
> >
> > Even so, generally you only call certain functions that are put there for use from a debugger, right?
> > Like gcc's debug_node or such?
> > And they tend to not be fancy?
>
> I regularly type a debugger call to re-execute something that I just stepped over, not knowing whether
> the problem I am looking for would occur inside the procedure or not. Reverse debugging in the newest
> gdb could provide an alternative, but I am hearing that the necessary recording costs ~ n*10 slowdown.
>
> I also do it as an easy way to test some parameter combination. Kind of like having an interpreter for
> the language. And I do it with fancy procedures that format some elaborate data structure in a readable,
> high-level way, which is, I think, what you meant. Sometimes a *lot* of effort.
>
>
> >
> > And the extra parameters..debugger would complain about missing them, programmer would figure it out?
>
> Before I got call support working, I found there were many calls I could not make work at all. Either
> I couldn't figure out what was needed (This amounts, in part, to manually reading stabs), or there
> was no way to supply what was needed in a debugger command. Also, on method calls, there was no way
> to figure out where it would dispatch to, even when you could locate all the possible overrides in all
> the subtypes, in all the source files of the closure. Language-specific support can take care of this,
> when done completely. m3gdb is a long way from there now, but it helps a lot.
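The dispatch problem described above can be illustrated with a small C sketch of table-based method calls. The `Shape` type, its two method tables, and `call_area` are all hypothetical; the point is only that the target of a method call depends on a table reachable from the object at run time, so it cannot be found by reading source files alone.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Shape Shape;

/* Hypothetical method table: one slot per method. */
typedef struct {
    int (*area)(const Shape *self);
} ShapeMethods;

struct Shape {
    const ShapeMethods *methods;  /* set when the object is allocated */
    int w, h;
};

/* Two overrides of the same method, for two subtypes. */
static int rect_area(const Shape *s) { return s->w * s->h; }
static int tri_area(const Shape *s)  { return s->w * s->h / 2; }

static const ShapeMethods rect_methods = { rect_area };
static const ShapeMethods tri_methods  = { tri_area };

/* What evaluating "s.area()" involves: load the table from the object,
   pick the slot, and pass the object itself as a hidden first argument. */
static int call_area(const Shape *s)
{
    assert(s != NULL);            /* stands in for a NIL check */
    return s->methods->area(s);
}
```

A debugger with language support can do these steps itself; without it, the user has to inspect the object's table pointer and work out which override it names.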
>
> >
> >
> > I'm not saying there aren't drawbacks here.
> > But there are also major advantages.
> > There are major costs and drawbacks to our current approach.
> > We have a ton of extra code.
> > Which I don't think we are well equipped for.
> > Maybe Tony is. Maybe someone else is. I'm not.
> >
> >
> > Partly, I'll admit, anything I write, I am much more able to maintain.
> > Or, another lazy angle, anything smaller is easier to maintain.
> >
> >
> > In gcc we have a large code base. It takes me a long time to get just slightly up to speed on it.
> > We have several nagging problems with it. Maybe I just need to look at the C front end more.
> > Or read tree.h. I don't know.
> >
> >
> > 4.5.1 doesn't work with SPARC32_SOLARIS/SOLgnu/SOLsun.
> > 4.3.5 maybe not either.
> > A few optimizations I have turned off for 4.5.1 because they cause problems. Including inlining.
> > Maybe I just need to debug more.
> > Apple and OpenBSD each maintain their own forks. So that, *sort of but not really*, triples things.
> > (now, they are all highly related, so it doesn't) So far we don't have the OpenBSD fork.
> > But for example 4.5.1 doesn't have the OpenBSD/powerpc stuff quite. And there is a small OpenBSD/mips64
> > problem I worked around. Minor, I guess. We could just drop these platforms, or OpenBSD entirely.
> > Not a huge deal. But it is yet more stuff that C as an intermediate platform solves.
> > Exception handling stinks on all platforms but SOLsun/SOLgnu.
> > The ALPHA_OSF code no longer works. I tried.
> > Generating C/C++ would significantly improve exception handling nearly across the board.
> > It is possible otherwise, but much more difficult.
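The message does not spell out how exceptions are implemented on the platforms where they "stink". As an assumption, the sketch below shows the generic setjmp/longjmp scheme that portable runtimes often fall back on, with hypothetical names throughout. It pays for a `setjmp` and a handler push on every TRY entry even when nothing is raised, which is the cost that table-driven unwinding (such as a C++ compiler provides) avoids.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* One handler frame per active TRY; frames form a stack. */
typedef struct Handler {
    jmp_buf env;
    struct Handler *prev;
} Handler;

static Handler *handlers = NULL;   /* would be per-thread in a real runtime */

/* RAISE: pop the innermost handler and jump to it. */
static void raise_exception(int code)
{
    Handler *h = handlers;
    assert(h != NULL);             /* unhandled exception: abort */
    handlers = h->prev;
    longjmp(h->env, code);         /* code must be nonzero */
}

static int checked_div(int a, int b)
{
    if (b == 0)
        raise_exception(1);
    return a / b;
}

/* Roughly: TRY r := a DIV b EXCEPT DivByZero => r := -1 END */
static int safe_div(int a, int b)
{
    Handler h;
    int result;

    h.prev = handlers;             /* cost paid on every TRY entry */
    handlers = &h;
    if (setjmp(h.env) == 0) {
        result = checked_div(a, b);
        handlers = h.prev;         /* normal exit pops the handler */
    } else {
        result = -1;               /* handler body; raise already popped */
    }
    return result;
}
```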
> >
> >
> > So, again, C as intermediate code isn't perfect or without drawbacks, but it promises:
> > greatly increased portability
> > more efficient exception handling
> > better codegen by letting the optimizer be full on, even across modules with some C compilers (gcc 4.5, Visual C++, at least)
> > better debugging with stock debuggers (including Visual C++, windbg)
> > a portable distribution format -- no more having to distribute binaries, though they still have advantages
> > easier to get into the various "ports" systems I think as a result
> > a much smaller system overall (no GPL, if it matters)
> >
> >
> > Again, there are drawbacks, but it just seems so very tempting.
> >
> >
> > I'm sure I'll plug away at m3cc a while longer, but I think more and more it is questionable.
> >
> >
> > I can try again to read the LLVM stuff.
> >
> >
> > A new backend I think is unavoidably a lot of work, be it C or LLVM.
> > That's my hangup on both of these. It requires knowing a lot about two big things -- M3CG and the underlying generator.
> > parse.c is "only" 6,000 lines but pretty dense in terms of information gone into it. Maybe I'm just feeling dumb.
> >
> >
> > The thing about C though, is it is a very well understood next layer down. Certainly compared to
> > the gcc trees or LLVM. I don't think it is just me, that I'm some C expert.
> >
> >
> >
> >> This could probably be improved a lot by switching to a better debug info format,
> >> probably the latest Dwarf variant. But that is a big job.
> >
> >
> > I don't believe we have to do *anything* sort of to switch debug formats.
> > We just have to provide gcc with decently formed typeful trees.
> > It should do the rest.
> > Currently I guess it is all custom.
>
> It doesn't provide nearly enough information in stock form. The "stabs" it now produces has a lot
> of Modula-3-specific stuff crammed inside the fields of stabs entries. This had to be added by the
> original implementors of the gcc backend. A different debug format will need changes to gcc to
> emit what is needed. However, it might well be entirely within the "language" of, say, Dwarf, which
> is very general. It certainly would be a lot cleaner, and it could easily be completed in places
> that are now hard. So there would still be a lot of work.
>
> I don't completely understand where the current back end emits all the stabs stuff, but I believe
> all or almost all of it comes through code in parse.c calling utility code (dbxout.c, e.g.), and
> is not much taken from the trees gcc uses. This is why I have yet to figure out how to write
> correct debug info describing the locations of static links, since gcc develops this information
> by transforming trees, after parse.c has done its thing.
>
> >
> > I tried -g again, thinking maybe things were better now. It still crashes.
> > It seems related to the fact that _m3_fault is in an unknown location.
> > But that seems actually deliberate and reasonable, and I tried fiddling with it anyway.
> > No luck yet.
> >
> >
> > This debug format problem is also solved by using C intermediate code.
> > You just use -g or -gdb or -Zi or whatever, whatever is normal for C, and it'd just work.
> >
>
> Lots of things, TEXT being probably the worst, won't display in a Modula-3 form this way.
> And things in expressions/statements won't work either. A debugger user will have to understand
> a lot of low-level stuff about how the C back end translates Modula-3 code to C, to use it at all,
> and it will still be far less convenient.
>
> >
> > Anyway..
> > - Jay
> >