[M3devel] CM3 crashing
Tony Hosking
hosking at cs.purdue.edu
Mon Jan 13 17:03:38 CET 2014
Let us assume that the user-level threads are functioning properly w.r.t. GC (can someone confirm?).
In which case, it would be good to have as many eyes as possible take a look at the differences between ThreadPosixC.c (ProcessContext) and ThreadPThreadC.c (ProcessLive) to see if we can spot a problem.
As I understand it, the crash occurs even for non-concurrent (@M3noincremental) and non-generational (@M3nogenerational) GC. Combined with my assumption that user threads work fine, it would seem to point the finger at stack scanning. Can someone confirm?
If the failure is *only* with concurrent or generational collection then we might suspect unsafe code (perhaps newly introduced?) messing with heap references without keeping the collector informed.
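
To make the stack-scanning suspicion concrete, here is a minimal sketch in C of conservative root scanning over an address range. It is a toy model, not the code in ThreadPThreadC.c or ThreadPosixC.c: the names toy_heap, page_marked, note_range, marked_pages and clear_marks are all hypothetical, and the "heap" is just a static array. It only illustrates why both the stack range and the saved-register block of a stopped thread have to be presented to the collector, and why presenting a range twice is harmless in this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy heap: 16 "pages" of 64 bytes, one mark flag per page. */
enum { PAGE_SIZE = 64, PAGE_COUNT = 16 };
static unsigned char toy_heap[PAGE_SIZE * PAGE_COUNT];
static unsigned char page_marked[PAGE_COUNT];

/* Conservatively scan [start, limit): step through it one word at a time
   and treat every word whose value lies inside the toy heap as a possible
   reference, marking the page it points into. */
static void note_range(const void *start, const void *limit)
{
    const char *p = start;
    const char *end = limit;
    uintptr_t lo = (uintptr_t)toy_heap;
    uintptr_t hi = lo + sizeof toy_heap;

    assert(p <= end);
    for (; (size_t)(end - p) >= sizeof(uintptr_t); p += sizeof(uintptr_t)) {
        uintptr_t word;
        memcpy(&word, p, sizeof word);  /* safe even if p is unaligned */
        if (word >= lo && word < hi)
            page_marked[(word - lo) / PAGE_SIZE] = 1;
    }
}

/* Number of pages currently marked as reachable from scanned ranges. */
static int marked_pages(void)
{
    int n = 0;
    for (int i = 0; i < PAGE_COUNT; i++)
        n += page_marked[i];
    return n;
}

/* Reset all marks, as at the start of a collection. */
static void clear_marks(void)
{
    memset(page_marked, 0, sizeof page_marked);
}
```

In this model a reference that lives only in a saved register is missed unless the register block is scanned as its own range, and scanning the same block a second time changes nothing. Whether the real collector behaves the same way for a duplicated range is an assumption that would need checking against the runtime.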
On Jan 12, 2014, at 10:54 PM, Peter McKinna <peter.mckinna at gmail.com> wrote:
> Hey,
>
> I was trying to get a handle on that problem last year. The threadtest program is really a stress tester of the collector/allocator with pthreads. If you run it with just the read and alloc tests, you pretty much always get a crash. If you run them with paranoidgc, it will crash in the heap checker. Tony thought it was a clear-cut problem of the roots of some ref not being found on a thread stack. I mucked around with code to get some output and the stacks looked OK to me, but I could be wrong. All stacks are checked while the threads are blocked in a signal handler, the design of which looks fine as far as I can tell. This test is characterised by some slow threads (the read threads) and a bunch of fast threads (the alloc threads). Even if you modify the test to have only one read thread, the problem occurs. I have had misgivings about mixing signals and threads, having been bitten many years ago, but really this is the only way the collector can get its raw refs to check.
> My guess is it's a subtle timing or lock problem, or maybe a lurking bug in the collector itself. One thing I noticed is that in ThreadPThread__ProcessStopped the second call, p(context, ((char *)context) + sizeof(ucontext_t)), is there to process the registers, according to the comment. But the registers should already be on the stack, and anyway this call is a partial duplicate of the previous one.
> It would be good to raise the priority on this problem. Trust in the collector has always been at the heart of m3 programs.
>
> Regards Peter
>
>
>
> On Mon, Jan 13, 2014 at 11:25 AM, <mika at async.caltech.edu> wrote:
> Yes it works in PM3 (still, since I use PM3 on FreeBSD4, never saw a reason to switch to CM3).
>
> I figured there aren't actually implementations of Word.Xor, Word.Or,
> and Word.And as procedures but they get inlined somehow?
>
> BTW, any idea about what's wrong with pthreads? Do you think the issue
> is with the garbage collector or with the threads, off the top of your
> head?
>
> Tony Hosking writes:
> >Are you saying passing these used to work in PM3?
> >Sounds like a front-end bug. I'm curious what changed to break it.
> >
> >On Jan 12, 2014, at 12:58 PM, mika at async.caltech.edu wrote:
> >
> >>
> >> The code is:
> >>
> >> HIntExpr.Xor => RETURN NewConst(CBitwise(av, bv, Word.Xor), ab)
> >> |
> >> HIntExpr.Bor => RETURN NewConst(CBitwise(av, bv, Word.Or), ab)
> >> |
> >> HIntExpr.Band => RETURN NewConst(CBitwise(av, bv, Word.And), ab)
> >>
> >> I guess it doesn't like passing Word.Xor, Word.Or, and Word.And ...
> >
> >
> >
> >Antony Hosking | Associate Professor | Computer Science | Purdue University
> >305 N. University Street | West Lafayette | IN 47907 | USA
> >Mobile +1 765 427 5484
> >
> >
> >
> >
>