[M3devel] AutoFlushWr race.

Jay K jay.krell at cornell.edu
Mon Dec 7 13:36:02 CET 2009


Thank you for looking into stuff Peter. We really need more help.

Can you try again either with head or the release commit I just did? Thanks.

 

See:

http://modula3.elegosoft.com/cgi-bin/cvsweb.cgi/cm3/m3-libs/libm3/src/rw

http://modula3.elegosoft.com/cgi-bin/cvsweb.cgi/cm3/m3-libs/libm3/src/rw/AutoFlushWr.m3

etc.

 

Thanks,

 - Jay
 


Date: Mon, 7 Dec 2009 22:17:50 +1100
From: peter.mckinna at gmail.com
To: m3devel at elegosoft.com
Subject: [M3devel] AutoFlushWr race.

Hi,

 I thought I would have a try at fixing the problem in AutoFlushWr with the little program below.
Alas I couldnt get a grip on it so maybe you guys could shed some light.

This was with RC4 and on an amd64 linux box.

It invariably crashes in FileWr.Seek where some invariant seems to be
awry so obviously the worker thread created in AutoFlushWr is hitting or causing
a race condition. Probably easily fixed.

The thing that intrigued me was that most of the time running it did not cause
the normal crash sequence and I had to ctrl-c to get back to the bash prompt
which judging from the stack trace something in the threads area is amiss.


Regards Peter



MODULE Main;

IMPORT FileWr,Wr,Rd,AutoFlushWr,Stdio,IO,Fmt,Thread;

VAR
  myout := NEW(AutoFlushWr.T).init(Stdio.stdout);

BEGIN
  FOR i := 0 TO 10000 DO
    Wr.PutText(myout,Fmt.Int(i) & "\n");
  END;

(*
  Thread.Pause(1.0D0);
  IO.Put("hit enter to end\n");
*)

  EVAL Rd.GetChar(Stdio.stdin);

END Main.





... some iterations and
5155
5156
5157
5158
5159
5160
5161
5162
5163
5164
5165
5166
5167
5168
5169
5170
5171
5172
5173
5174


***
*** runtime error:
***    Exception "FileWr.Error" not in RAISES list
***    file "../src/rw/FileWr.m3", line 81
***


Program received signal SIGABRT, Aborted.
0x00007f1a8b6beed5 in raise () from /lib/libc.so.6

(m3gdb) thread apply all bt

Thread 3 (Thread 1106278736 (LWP 5197)):
#0  0x00007f1a8b9ead29 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x00007f1a8bef8d4f in ThreadPThread__pthread_cond_wait (cond=0x24d0170,
    mutex=0x24d0770) at ../src/thread/PTHREAD/ThreadPThreadC.c:326
#2  0x00007f1a8bef1ce0 in XWait (self=16_00000000024f2a48, m=16_00000000024ef030,
    c=16_00000000024ef0b8, alertable=FALSE)
    at ../src/thread/PTHREAD/ThreadPThread.m3:238
#3  0x00007f1a8bef214d in Wait (m=16_00000000024ef030, c=16_00000000024ef0b8)
    at ../src/thread/PTHREAD/ThreadPThread.m3:280
#4  0x00007f1a8c9e2c62 in Worker (cl=16_00000000024f2a38)
    at ../src/rw/AutoFlushWr.m3:211
#5  0x00007f1a8bef4447 in RunThread (me=16_00000000024d0a40)
    at ../src/thread/PTHREAD/ThreadPThread.m3:589
#6  0x00007f1a8bef4140 in ThreadBase (param=16_00000000024d0a40)
    at ../src/thread/PTHREAD/ThreadPThread.m3:565
#7  0x00007f1a8b9e6fc7 in start_thread () from /lib/libpthread.so.0
#8  0x00007f1a8b75c5ad in clone () from /lib/libc.so.6
#9  0x0000000000000000 in ?? ()

Thread 2 (Thread 139752010405600 (LWP 5194)):
#0  0x00007f1a8b6beed5 in raise () from /lib/libc.so.6
#1  0x00007f1a8b6c03f3 in abort () from /lib/libc.so.6
#2  0x00007f1a8beef0ff in Crash () at ../src/runtime/POSIX/RTOS.m3:20
#3  0x00007f1a8bee2b1d in Crash (msg=NIL) at ../src/runtime/common/RTProcess.m3:65
#4  0x00007f1a8bedffba in EndError (crash=TRUE) at ../src/runtime/common/RTError.m3:118
#5  0x00007f1a8bedfcb2 in MsgS (file=16_00007f1a8ccc84d8, line=81,
    msgA=16_00007f1a8c12dda8, msgB=16_000000000250e078, msgC=16_00007f1a8c12ddf0)
    at ../src/runtime/common/RTError.m3:40
#6  0x00007f1a8bee048d in Crash (a=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=TRUE, rte=16_00007f1a8c124ca0)
    at ../src/runtime/common/RTException.m3:79
#7  0x00007f1a8bee0186 in DefaultBackstop (a=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=FALSE) at ../src/runtime/common/RTException.m3:41
#8  0x00007f1a8bee0076 in InvokeBackstop (a=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=FALSE) at ../src/runtime/common/RTException.m3:25
#9  0x00007f1a8bef03db in Raise (act=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END) at ../src/runtime/ex_frame/RTExFrame.m3:29
---Type <return> to continue, or q <return> to quit---
#10 0x00007f1a8bee01f6 in DefaultBackstop (a=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=TRUE) at ../src/runtime/common/RTException.m3:47
#11 0x00007f1a8bee0076 in InvokeBackstop (a=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=TRUE) at ../src/runtime/common/RTException.m3:25
#12 0x00007f1a8bef04e8 in Raise (act=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END) at ../src/runtime/ex_frame/RTExFrame.m3:51
#13 0x00007f1a8bec791f in Raise (ex=16_00007f1a8ccc8380, arg=NIL,
    module=16_00007f1a8ccc8540, line=81) at ../src/runtime/common/RTHooks.m3:79
#14 0x00007f1a8c9dd8f4 in Seek (wr=16_00000000024e9448, n=24765)
    at ../src/rw/FileWr.m3:81
#15 0x00007f1a8c9e0b0e in Seek (wr=16_00000000024ef0d8, n=24765)
    at ../src/rw/AutoFlushWr.m3:106
#16 0x00007f1a8c9d684e in PutStringDefault (wr=16_00000000024ef0d8, a=
    {'5','1','7','5','\n'}) at ../src/rw/Wr.m3:193
#17 0x00007f1a8c9d5b4c in FastPutText (wr=16_00000000024ef0d8, t=16_000000000250e040)
    at ../src/rw/Wr.m3:105
#18 0x00007f1a8c9d598a in PutText (wr=16_00000000024ef0d8, t=16_000000000250e040)
    at ../src/rw/Wr.m3:93
#19 0x000000000040108c in Main (mode=1) at ../src/Main.m3:10
#20 0x00007f1a8bedf12b in RunMainBody (m=16_00000000006016c0)
    at ../src/runtime/common/RTLinker.m3:399
#21 0x00007f1a8bede312 in AddUnitI (m=16_00000000006016c0)
    at ../src/runtime/common/RTLinker.m3:113
#22 0x00007f1a8bede3b0 in AddUnit (b={"Main_M3", Declared at: ../src/Main.m3:8})
    at ../src/runtime/common/RTLinker.m3:122
#23 0x0000000000400f2c in main (argc=1, argv=0x7fff6d396f08, envp=0x7fff6d396f18)
    at _m3main.mc:4
#24 0x00007f1a8b6ab1a6 in __libc_start_main () from /lib/libc.so.6
#25 0x0000000000400e29 in _start ()

Thread 1 (LWP 5194):
#0  0x00007f1a8b6beed5 in raise () from /lib/libc.so.6
#1  0x00007f1a8b6c03f3 in abort () from /lib/libc.so.6
#2  0x00007f1a8beef0ff in Crash () at ../src/runtime/POSIX/RTOS.m3:20
#3  0x00007f1a8bee2b1d in Crash (msg=NIL) at ../src/runtime/common/RTProcess.m3:65
#4  0x00007f1a8bedffba in EndError (crash=TRUE) at ../src/runtime/common/RTError.m3:118
#5  0x00007f1a8bedfcb2 in MsgS (file=16_00007f1a8ccc84d8, line=81,
    msgA=16_00007f1a8c12dda8, msgB=16_000000000250e078, msgC=16_00007f1a8c12ddf0)
    at ../src/runtime/common/RTError.m3:40
#6  0x00007f1a8bee048d in Crash (a=
---Type <return> to continue, or q <return> to quit---
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=TRUE, rte=16_00007f1a8c124ca0)
    at ../src/runtime/common/RTException.m3:79
#7  0x00007f1a8bee0186 in DefaultBackstop (a=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=FALSE) at ../src/runtime/common/RTException.m3:41
#8  0x00007f1a8bee0076 in InvokeBackstop (a=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=FALSE) at ../src/runtime/common/RTException.m3:25
#9  0x00007f1a8bef03db in Raise (act=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END) at ../src/runtime/ex_frame/RTExFrame.m3:29
#10 0x00007f1a8bee01f6 in DefaultBackstop (a=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=TRUE) at ../src/runtime/common/RTException.m3:47
#11 0x00007f1a8bee0076 in InvokeBackstop (a=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END, raises=TRUE) at ../src/runtime/common/RTException.m3:25
#12 0x00007f1a8bef04e8 in Raise (act=
    RECORD exception = 16_00007f1a8c124ca0; arg = 16_000000000000000d; module = 16_00007f1a8ccc8540; line = 81; pc = NIL; info0 = NIL; info1 = NIL; un_except = 16_00007f1a8ccc8380; un_arg = NIL;  END) at ../src/runtime/ex_frame/RTExFrame.m3:51
#13 0x00007f1a8bec791f in Raise (ex=16_00007f1a8ccc8380, arg=NIL,
    module=16_00007f1a8ccc8540, line=81) at ../src/runtime/common/RTHooks.m3:79
#14 0x00007f1a8c9dd8f4 in Seek (wr=16_00000000024e9448, n=24765)
    at ../src/rw/FileWr.m3:81
#15 0x00007f1a8c9e0b0e in Seek (wr=16_00000000024ef0d8, n=24765)
    at ../src/rw/AutoFlushWr.m3:106
#16 0x00007f1a8c9d684e in PutStringDefault (wr=16_00000000024ef0d8, a=
    {'5','1','7','5','\n'}) at ../src/rw/Wr.m3:193
#17 0x00007f1a8c9d5b4c in FastPutText (wr=16_00000000024ef0d8, t=16_000000000250e040)
    at ../src/rw/Wr.m3:105
#18 0x00007f1a8c9d598a in PutText (wr=16_00000000024ef0d8, t=16_000000000250e040)
    at ../src/rw/Wr.m3:93
#19 0x000000000040108c in Main (mode=1) at ../src/Main.m3:10
#20 0x00007f1a8bedf12b in RunMainBody (m=16_00000000006016c0)
    at ../src/runtime/common/RTLinker.m3:399
#21 0x00007f1a8bede312 in AddUnitI (m=16_00000000006016c0)
    at ../src/runtime/common/RTLinker.m3:113
#22 0x00007f1a8bede3b0 in AddUnit (b={"Main_M3", Declared at: ../src/Main.m3:8})
---Type <return> to continue, or q <return> to quit---
    at ../src/runtime/common/RTLinker.m3:122
#23 0x0000000000400f2c in main (argc=1, argv=0x7fff6d396f08, envp=0x7fff6d396f18)
    at _m3main.mc:4
#24 0x00007f1a8b6ab1a6 in __libc_start_main () from /lib/libc.so.6
#25 0x0000000000400e29 in _start ()
#0  0x00007f1a8b6beed5 in raise () from /lib/libc.so.6
(m3gdb)     


Now why is the stack trace duplicated for Thread 1 and Thread 2 and also
why is there that last line #0  0x00007f1a8b6beed5 in raise () from /lib/libc.so.6 in the
last frame? The first and last frame are duplicated. Havent seen it in other stack traces.
Maybe something wrong with my setup?


running once more we get

....... some iterations
4311
4312
4313
4314
4315
4316
4317
4318


***
*** runtime error:
***    Exception "FileWr.Error" not in RAISES list
***    file "../src/rw/FileWr.m3", line 81
***

OK here I had to ctrl-c it since it is hung


^C
Program received signal SIGINT, Interrupt.
0x00007fd82e657384 in __lll_lock_wait () from /lib/libpthread.so.0
(m3gdb) thread apply all bt

Thread 3 (Thread 1113143632 (LWP 5200)):
#0  0x00007fd82e654d29 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/libpthread.so.0
#1  0x00007fd82eb62d4f in ThreadPThread__pthread_cond_wait (cond=0x19a9170,
    mutex=0x19a9770) at ../src/thread/PTHREAD/ThreadPThreadC.c:326
#2  0x00007fd82eb5bce0 in XWait (self=16_00000000019caa48, m=16_00000000019c7030,
    c=16_00000000019c70b8, alertable=FALSE)
    at ../src/thread/PTHREAD/ThreadPThread.m3:238
#3  0x00007fd82eb5c14d in Wait (m=16_00000000019c7030, c=16_00000000019c70b8)
    at ../src/thread/PTHREAD/ThreadPThread.m3:280
#4  0x00007fd82f64cc62 in Worker (cl=16_00000000019caa38)
    at ../src/rw/AutoFlushWr.m3:211
#5  0x00007fd82eb5e447 in RunThread (me=16_00000000019a9a40)
    at ../src/thread/PTHREAD/ThreadPThread.m3:589
#6  0x00007fd82eb5e140 in ThreadBase (param=16_00000000019a9a40)
    at ../src/thread/PTHREAD/ThreadPThread.m3:565
#7  0x00007fd82e650fc7 in start_thread () from /lib/libpthread.so.0
#8  0x00007fd82e3c65ad in clone () from /lib/libc.so.6
#9  0x0000000000000000 in ?? ()

Thread 2 (Thread 140566490150624 (LWP 5198)):
#0  0x00007fd82e657384 in __lll_lock_wait () from /lib/libpthread.so.0
#1  0x00007fd82e652bf0 in _L_lock_102 () from /lib/libpthread.so.0
../../gdb/gdb/dwarf2-frame.c:490: internal-error: Unknown CFI encountered.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)              


Is this a problem with m3gdb?  

Running this program without the debugger more likely than not will produce a hung program, so 
I suspect some threading problem. That lll_lock_wait is definitely suspicious.


 		 	   		  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://m3lists.elegosoft.com/pipermail/m3devel/attachments/20091207/a0174d82/attachment-0002.html>


More information about the M3devel mailing list