[M3devel] CM3 crashing

Tony Hosking hosking at cs.purdue.edu
Mon Jan 13 17:03:38 CET 2014


Let us assume that the user-level threads are functioning properly w.r.t. GC (can someone confirm?).
In which case, it would be good to have as many eyes as possible take a look at the differences between ThreadPosixC.c (ProcessContext) and ThreadPThreadC.c (ProcessLive) to see if we can spot a problem.

As I understand it, the crash occurs even for non-concurrent (@M3noincremental) and non-generational (@M3nogenerational) GC.  Combined with my assumption that user threads work fine, it would seem to point the finger at stack scanning.  Can someone confirm?

If the failure is *only* with concurrent or generational collection then we might suspect unsafe code (perhaps newly introduced?) messing with heap references without keeping the collector informed.
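
For anyone comparing the two files, here is a minimal sketch of what conservative scanning of a [start, stop) range amounts to. It is not the CM3 runtime code: HeapRange, scan_range and demo_scan are hypothetical names made up for illustration, and a real collector would consult its page map rather than a single address range. The point is only that every aligned word in the range is treated as a possible heap reference, so a root is missed if the range handed to the scanner does not cover the word that holds it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical heap description: one contiguous range [lo, hi). */
typedef struct { uintptr_t lo, hi; } HeapRange;

/* Hypothetical callback invoked for every word that looks like a heap ref. */
typedef void (*RootVisitor)(uintptr_t candidate, void *arg);

/* Conservatively scan [start, stop): treat each aligned word as a possible
   heap reference and report those that fall inside the heap range.
   Returns the number of candidate roots found. */
size_t scan_range(const void *start, const void *stop,
                  const HeapRange *heap, RootVisitor visit, void *arg)
{
    const size_t w = sizeof(uintptr_t);
    uintptr_t lo = (uintptr_t)start;
    uintptr_t hi = (uintptr_t)stop;
    size_t found = 0;

    assert(lo <= hi);
    lo = (lo + w - 1) & ~(uintptr_t)(w - 1);   /* round up to word alignment */

    for (; lo + w <= hi; lo += w) {
        uintptr_t word;
        memcpy(&word, (const void *)lo, w);    /* read one word of the range */
        if (word >= heap->lo && word < heap->hi) {
            found++;
            if (visit != NULL) visit(word, arg);
        }
    }
    return found;
}

/* Demo: a fake "stack" of four words, two of which point into a fake heap. */
size_t demo_scan(void)
{
    static char fake_heap[256];
    HeapRange heap = { (uintptr_t)fake_heap,
                       (uintptr_t)fake_heap + sizeof fake_heap };
    uintptr_t fake_stack[4];

    fake_stack[0] = 0;                           /* not a reference */
    fake_stack[1] = (uintptr_t)&fake_heap[16];   /* looks like a root */
    fake_stack[2] = (uintptr_t)&heap;            /* address outside the heap */
    fake_stack[3] = (uintptr_t)&fake_heap[200];  /* looks like a root */

    return scan_range(fake_stack, fake_stack + 4, &heap, NULL, NULL);
}
```

Calling demo_scan() reports the two words that point into fake_heap. Shrinking the scanned range by one word at either end would silently drop a root, which is the kind of discrepancy worth looking for between ProcessContext and ProcessLive.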

On Jan 12, 2014, at 10:54 PM, Peter McKinna <peter.mckinna at gmail.com> wrote:

> Hey,
> 
>   I was trying to get a handle on that problem last year. The threadtest program is really a stress tester of the collector/allocator with pthreads. If you run it with just the tests read and alloc you pretty much always get a crash. If you run them with paranoidgc it will crash in the heap checker. Tony thought it was a clear-cut problem of the roots of some ref not being found on a thread stack. I mucked around with code to get some output and the stacks looked OK to me, but I could be wrong. All stacks are checked whilst the threads are blocked in a signal handler, the design of which looks fine as far as I can tell. This test is characterised by some slow threads (the read threads) and a bunch of fast threads (the alloc threads). Even if you modify the test to have only one read thread the problem occurs. I have had misgivings about mixing signals and threads, having been bitten many years ago, but really this is the only way the collector can get its raw refs to check.
> My guess is it's a subtle timing or lock problem, or maybe a lurking bug in the collector itself. One thing I noticed is that in ThreadPThread__ProcessStopped the second call, p(context, ((char *)context) + sizeof(ucontext_t)), is there to process the registers according to the comment. But the registers should already be on the stack, and anyway this call is a partial duplicate of the previous one.
> It would be good to raise the priority on this problem. Trust in the collector has always been at the heart of m3 programs. 
> 
> Regards Peter
> 
> 
> 
> On Mon, Jan 13, 2014 at 11:25 AM, <mika at async.caltech.edu> wrote:
> Yes it works in PM3 (still, since I use PM3 on FreeBSD4, never saw a reason to switch to CM3).
> 
> I figured there aren't actually implementations of Word.Xor, Word.Or,
> and Word.And as procedures but they get inlined somehow?
> 
> BTW, any idea about what's wrong with pthreads?  Do you think the issue
> is with the garbage collector or with the threads, off the top of your
> head?
> 
> Tony Hosking writes:
> >Are you saying passing these used to work in PM3?
> >Sounds like a front-end bug.  I'm curious what changed to break it.
> >
> >On Jan 12, 2014, at 12:58 PM, mika at async.caltech.edu wrote:
> >
> >>
> >> The code is:
> >>
> >>              HIntExpr.Xor => RETURN NewConst(CBitwise(av, bv, Word.Xor), ab)
> >>            |
> >>              HIntExpr.Bor => RETURN NewConst(CBitwise(av, bv, Word.Or), ab)
> >>            |
> >>              HIntExpr.Band => RETURN NewConst(CBitwise(av, bv, Word.And), ab)
> >>
> >> I guess it doesn't like passing Word.Xor, Word.Or, and Word.And ...
> >
> >
> >
> >Antony Hosking | Associate Professor | Computer Science | Purdue University
> >305 N. University Street | West Lafayette | IN 47907 | USA
> >Mobile +1 765 427 5484
> >
> >
> >
> >
> 
