[M3devel] Avoding thread-stack overflow, proof of concept of "leapfrog stacks"

Mika Nystrom mika at async.caltech.edu
Wed Feb 23 21:39:50 CET 2011


Jay K writes:
>--_60ddb4ae-bef3-432c-a5fd-a46c878ce951_
>Content-Type: text/plain; charset="iso-8859-1"
>Content-Transfer-Encoding: quoted-printable
>
>
>Stack must be contiguous on NT.
>Some platforms -- NT -- actually know how big the current stack is -- and w=
>on't let you
>alloca past it. Best way to let the target handle making stacks is with Pos=
>ix makecontext=2C
>for the many-but-not-all platforms that have it=2C or CreateFiber on NT.
>(OpenBSD and I think Darwin/amd64 lack makecontext. It is in Posix=2C but d=
>eprecated=3B
>heck Darwin doesn't even have Posix semaphores seemingly..)

The problem with making stacks with makecontext, etc., to me seems
to be: what do you do when you want to return to the original stack.
Can it be made to "just work"?

Note that in my design, the stacks are all contiguous---but overlapping!
There are no "new" stacks and no switching of stacks.  You just pretend
you have a huge data structure sitting where the other guys' stacks
(and data, etc.) are.

Cleanup is also easy, it's handled by the ordinary GC.

>
>To avoid the assembly you could use a hack like:
>void* Unsafe__alloca(size_t n) { return alloca(n)=3B }

Won't work.  alloca releases memory at the }.

What would avoid it is generating the call to alloca in the front end.

>
>Beware.
>Dangerous platform/compiler-specific stuff here.

Not really that specific.  It relies on precisely one special property
of the underlying system: that alloca doesn't do any error checking.

The only changes that need to be made to CM3 are coming up with a
mechanism for detecting the need for expanding the stack (could be done by
bumping the redzone before winding the stack and handling this in a SEGV
handler) and modifying the garbage collector to understand the scheme.
I suggest adding a back pointer at the beginning of the newly allocated
area that would let the GC go back to the previous one while scanning
(or however it works).

Users/platforms that don't need or can't support "leapfrog stacks"
would not change except that their garbage collectors would have some
code that's never executed...

I've discussed the above with people who've been doing concurrent
programming since before the glaciers came thru (implementations of
Concurrent Pascal in the 1970s) and they said that they used linked
(noncontiguous) stacks in those days.  The state of the art of 2011
seems rather unimpressive...

My feeling is that the stack overflow issue is one of the things hindering
the acceptance of truly multi-threaded programming (thousands, tens of
thousands, millions of threads).

     Mika




More information about the M3devel mailing list