[M3devel] getting all registers in gc?
Jay K
jay.krell at cornell.edu
Mon Jun 27 22:32:29 CEST 2016
> I think getcontext will go far toward providing peace of mind on many platforms
Surprisingly, getcontext turns out to be a syscall in the places I've now looked.
That is surprising and slightly bad.
For the purpose of thread suspension, the cost might be affordable.
We still might copy the underlying code.
We need to solve this even with cooperative suspend, btw -- which I really want...
- Jay
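
For concreteness, here is a minimal sketch of the getcontext idea discussed in this thread. It is not the actual m3core code. The names process_live_with_getcontext, scan_range_fn and count_bytes, and the extra closure argument, are hypothetical. It assumes a downward-growing stack, and it makes no claim about which registers a given libc's getcontext really stores (floating point and vector registers in particular); that has to be checked per platform.

```c
#define _XOPEN_SOURCE 600 /* some libcs hide getcontext without this */
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

/* Hypothetical callback type: receives a range [start, limit) that the
   collector should scan conservatively for pointers. */
typedef void (*scan_range_fn)(void *start, void *limit, void *closure);

/* Capture the calling thread's registers with getcontext, then hand the
   saved context and the live part of the stack to `scan`.
   `stack_bottom` is the oldest (cold) end of the stack.
   Assumption: the stack grows toward lower addresses.
   Returns 0 on success, -1 if getcontext fails. */
int process_live_with_getcontext(char *stack_bottom, scan_range_fn scan,
                                 void *closure)
{
    ucontext_t ctx; /* lives in this frame, i.e. on the thread's stack */
    char *top = (char *)&ctx;

    if (getcontext(&ctx) != 0)
        return -1;

    /* Scan the saved register state explicitly, so the result does not
       depend on where the compiler placed ctx within the frame. */
    scan(&ctx, &ctx + 1, closure);

    /* Scan from (roughly) this frame up to the cold end of the stack. */
    if (top < stack_bottom)
        scan(top, stack_bottom, closure);

    return 0;
}

/* Example callback for illustration only: adds up the number of bytes
   it was asked to scan. `closure` must point to a size_t. */
void count_bytes(void *start, void *limit, void *closure)
{
    assert((char *)start <= (char *)limit);
    *(size_t *)closure += (size_t)((char *)limit - (char *)start);
}
```

A real collector would pass a callback that walks each range word by word and tests every word as a candidate pointer, where count_bytes here only totals the sizes. On NT the same shape would presumably use RtlCaptureContext in place of getcontext.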
----------------------------------------
> From: jay.krell at cornell.edu
> To: hosking at purdue.edu
> CC: m3devel at elegosoft.com
> Subject: RE: [M3devel] getting all registers in gc?
> Date: Mon, 27 Jun 2016 20:04:27 +0000
>
> How do we ensure that?
>
> What about other backends, e.g. C or LLVM?
>
> The Boehm collector code implies the situation is under control, if we are like it, at least.
>
> I think getcontext will go far toward providing peace of mind on many platforms (and RtlCaptureContext
> on NT).
>
> I might still be convinced that setjmp suffices,
> in that the compiler will spill volatile (caller-saved) registers ahead of the call anyway.
> I wonder whether that is enough, but I haven't convinced myself.
>
> - Jay
>
> ----------------------------------------
>> From: hosking at purdue.edu
>> To: jay.krell at cornell.edu
>> CC: m3devel at elegosoft.com
>> Subject: Re: [M3devel] getting all registers in gc?
>> Date: Mon, 27 Jun 2016 09:50:06 +0000
>>
>> A thought: can we simply ensure that the gcc-based backend never spills pointers to FP regs?
>>
>>> On 27 Jun 2016, at 7:14 PM, Jay K <jay.krell at cornell.edu> wrote:
>>>
>>> ok, the Boehm code:
>>>
>>> For the current live thread, merely:
>>>
>>> /* Push enough of the current stack eagerly to */
>>> /* ensure that callee-save registers saved in */
>>> /* GC frames are scanned. */
>>> /* In the non-threads case, schedule entire */
>>> /* stack for scanning. */
>>> /* The second argument is a pointer to the */
>>> /* (possibly null) thread context, for */
>>> /* (currently hypothetical) more precise */
>>> /* stack scanning. */
>>> /*
>>> * In the absence of threads, push the stack contents.
>>> * In the presence of threads, push enough of the current stack
>>> * to ensure that callee-save registers saved in collector frames have been
>>> * seen.
>>> * FIXME: Merge with per-thread stuff.
>>> */
>>> /*ARGSUSED*/
>>> STATIC void GC_push_current_stack(ptr_t cold_gc_frame, void * context)
>>> {
>>> # if defined(THREADS)
>>>     if (0 == cold_gc_frame) return;
>>> #   ifdef STACK_GROWS_DOWN
>>>       GC_push_all_eager(GC_approx_sp(), cold_gc_frame);
>>>       /* For IA64, the register stack backing store is handled */
>>>       /* in the thread-specific code. */
>>> #   else
>>>       GC_push_all_eager(cold_gc_frame, GC_approx_sp());
>>> #   endif
>>> # else
>>> ...
>>> # endif /* !THREADS */
>>>
>>>
>>> GC_INNER ptr_t GC_approx_sp(void)
>>> {
>>>     volatile word sp;
>>>     sp = (word)&sp;
>>>     /* Also force stack to grow if necessary. Otherwise the */
>>>     /* later accesses might cause the kernel to think we're */
>>>     /* doing something wrong. */
>>>     return((ptr_t)sp);
>>>     /* GNU C: alternatively, we may return the value of */
>>>     /*__builtin_frame_address(0). */
>>> }
>>>
>>>
>>> Notice that it doesn't even do what it says -- no attempt
>>> to save registers to stack.
>>>
>>>
>>> but for suspended threads it is more convincing:
>>>
>>>
>>> /* Ensure that either registers are pushed, or callee-save registers */
>>> /* are somewhere on the stack, and then call fn(arg, ctxt). */
>>> /* ctxt is either a pointer to a ucontext_t we generated, or NULL. */
>>> GC_INNER void GC_with_callee_saves_pushed(void (*fn)(ptr_t, void *),
>>> ptr_t arg)
>>> {
>>> volatile int dummy;
>>> void * context = 0;
>>>
>>>
>>> ..
>>> ....
>>>
>>>
>>> a mix of methods:
>>> - sometimes processor specific assembly
>>> - sometimes getcontext, and then workaround a bug on Linux/amd64
>>> - and then _setjmp on Unix
>>> - setjmp on Windows
>>>
>>>
>>> getcontext to me seems more promising than setjmp,
>>> and you can use both
>>>
>>> for Win32, I suggest RtlCaptureContext (for live thread too)
>>>
>>>
>>> We maybe should copy getcontext from various BSDs?
>>> i.e. Win32 RtlCaptureContext, else carry the assembly with us (no need
>>> to worry about the glibc getcontext bug).
>>>
>>> or maybe just getcontext. The gradually expanding register set on x86 makes me nervous
>>> that this is a maintenance problem, but I'm guessing you never get pointers
>>> spilled to the ymm/zmm registers.
>>>
>>> - Jay
>>>
>>>
>>>
>>> ----------------------------------------
>>>> From: jay.krell at cornell.edu
>>>> To: m3devel at elegosoft.com
>>>> Date: Mon, 27 Jun 2016 08:37:12 +0000
>>>> Subject: [M3devel] getting all registers in gc?
>>>>
>>>> I just noticed this in the Boehm GC documentation:
>>>>
>>>> - Changed the alpha port to use the generic register scanning code instead
>>>> of alpha_mach_dep.s. Alpha_mach_dep.s doesn't look for pointers in fp
>>>> registers, but gcc sometimes spills pointers there. (Thanks to Manuel
>>>> Serrano for helping me debug this by email.) Changed the IA64 code to
>>>> do something similar for similar reasons.
>>>>
>>>>
>>>> This would seem like a hazard for us too.
>>>>
>>>> And not convincingly Alpha/IA64-specific.
>>>>
>>>> We basically assume setjmp stores a context, or at least all live pointers.
>>>>
>>>>
>>>> In hindsight I see two problems:
>>>> - one alluded to -- jmpbuf might not have floating point registers,
>>>> and floating point registers might have pointers.
>>>>
>>>>
>>>> - Same thing but more general: jmpbuf might not even have all integer
>>>> registers?
>>>>
>>>>
>>>>
>>>> So that leaves the question "What is generic register scanning code"?
>>>>
>>>> I don't know yet but..thinking...
>>>>
>>>> Maybe we should instead use the POSIX-deprecated getcontext and Win32 RtlCaptureContext?
>>>>
>>>> I'm actually looking for how Boehm gc gets the "second half" of the IA64 stack,
>>>> as I think that is a lingering thing we need to handle to finish our portability.
>>>>
>>>> Ignoring IA64 for now, maybe here:
>>>>
>>>> void
>>>> __cdecl
>>>> ThreadPThread__sigsuspend(void)
>>>> {
>>>>   struct {
>>>>     sigjmp_buf jb;
>>>>   } s;
>>>>
>>>>   ZERO_MEMORY(s);
>>>>
>>>>   if (sigsetjmp(s.jb, 0) == 0) /* save registers to stack */
>>>> #ifdef M3_REGISTER_WINDOWS
>>>>     siglongjmp(s.jb, 1); /* flush register windows */
>>>>   else
>>>> #endif
>>>>     sigsuspend(&mask);
>>>> }
>>>>
>>>>
>>>> and here:
>>>>
>>>> void
>>>> __cdecl
>>>> ThreadPThread__ProcessLive(char *bottom, void (*p)(void *start, void *limit))
>>>> {
>>>>   struct {
>>>>     sigjmp_buf jb;
>>>>   } s;
>>>>
>>>>   ZERO_MEMORY(s);
>>>>
>>>>   if (sigsetjmp(s.jb, 0) == 0) /* save registers to stack */
>>>>
>>>>
>>>> Should we use getcontext/RtlCaptureContext/GetThreadContext instead?
>>>>
>>>>
>>>> I'll look more at the Boehm code.
>>>>
>>>>
>>>> - Jay
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> M3devel mailing list
>>>> M3devel at elegosoft.com
>>>> https://m3lists.elegosoft.com/mailman/listinfo/m3devel
>>>
>>
>