[M3devel] Avoding thread-stack overflow, proof of concept of "leapfrog stacks"

Wed Feb 23 21:51:35 CET 2011

 > that alloca doesn't do any error checking

No. There is also native exception handling to consider.
NT essentially does error checking in both.

> thousands, tens of thousands, millions of threads

 - 64bit address spaces
 - multiple machines ("cloud")

 > Won't work.  alloca releases memory at the }.

 Only sort of.
 You can still return that point and still use the memory.
 You do risk, e.g. if you make any function calls, then the
 alloca memory and the function parameters might overlap.
 You could..platform-dependent..probably skip the first
 "few" bytes of the alloca memory.

 > Note that in my design, the stacks are all contiguous---but overlapping!

 Are your stacks also traversing heap and what-not,
 or are you just chopping up the current stack into lots
 of smaller stacks?

 i.e. alloca stack size times n and then create n threads?
 That's less bad.

 I had the impression you were skipping over other memory.

 You know, like, malloc your stack, and then alloca to get
 the stack pointer to hit the malloced memory.

 > What would avoid it is generating the call to alloca in the front end.

We could w/o much difficulty provide alloca in the frontend.
Probably limit it to unsafe modules.
At least to support your experimenting and somewhat platform-dependent code.

 > in the 1970s..

I think the right place to experiment with this stuff is at a lower level.
Or with more complete control of the system.
ie. in the kernel, or on only one specific kernel, that you
build/modify/write-from-scratch yourself.

Gotta go,
 - Jay

> To: jay.krell at cornell.edu
> Date: Wed, 23 Feb 2011 12:39:50 -0800
> From: mika at async.caltech.edu
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] Avoding thread-stack overflow,	proof of concept of "leapfrog stacks"
> 
> Jay K writes:
> >--_60ddb4ae-bef3-432c-a5fd-a46c878ce951_
> >Content-Type: text/plain; charset="iso-8859-1"
> >Content-Transfer-Encoding: quoted-printable
> >
> >
> >Stack must be contiguous on NT.
> >Some platforms -- NT -- actually know how big the current stack is -- and w=
> >on't let you
> >alloca past it. Best way to let the target handle making stacks is with Pos=
> >ix makecontext=2C
> >for the many-but-not-all platforms that have it=2C or CreateFiber on NT.
> >(OpenBSD and I think Darwin/amd64 lack makecontext. It is in Posix=2C but d=
> >eprecated=3B
> >heck Darwin doesn't even have Posix semaphores seemingly..)
> 
> The problem with making stacks with makecontext, etc., to me seems
> to be: what do you do when you want to return to the original stack.
> Can it be made to "just work"?
> 
> Note that in my design, the stacks are all contiguous---but overlapping!
> There are no "new" stacks and no switching of stacks.  You just pretend
> you have a huge data structure sitting where the other guys' stacks
> (and data, etc.) are.
> 
> Cleanup is also easy, it's handled by the ordinary GC.
> 
> >
> >To avoid the assembly you could use a hack like:
> >void* Unsafe__alloca(size_t n) { return alloca(n)=3B }
> 
> Won't work.  alloca releases memory at the }.
> 
> What would avoid it is generating the call to alloca in the front end.
> 
> >
> >Beware.
> >Dangerous platform/compiler-specific stuff here.
> 
> Not really that specific.  It relies on precisely one special property
> of the underlying system: that alloca doesn't do any error checking.
> 
> The only changes that need to be made to CM3 are coming up with a
> mechanism for detecting the need for expanding the stack (could be done by
> bumping the redzone before winding the stack and handling this in a SEGV
> handler) and modifying the garbage collector to understand the scheme.
> I suggest adding a back pointer at the beginning of the newly allocated
> area that would let the GC go back to the previous one while scanning
> (or however it works).
> 
> Users/platforms that don't need or can't support "leapfrog stacks"
> would not change except that their garbage collectors would have some
> code that's never executed...
> 
> I've discussed the above with people who've been doing concurrent
> programming since before the glaciers came thru (implementations of
> Concurrent Pascal in the 1970s) and they said that they used linked
> (noncontiguous) stacks in those days.  The state of the art of 2011
> seems rather unimpressive...
> 
> My feeling is that the stack overflow issue is one of the things hindering
> the acceptance of truly multi-threaded programming (thousands, tens of
> thousands, millions of threads).
> 
>      Mika
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://m3lists.elegosoft.com/pipermail/m3devel/attachments/20110223/595e9a35/attachment-0002.html>