[M3devel] Question about TEXTs under CM3
Tony Hosking
hosking at cs.purdue.edu
Thu Jul 26 18:02:19 CEST 2007
I'm about to check in the fix. I have tested this with your example
program and things work fine.
On Jul 26, 2007, at 11:26 AM, Tony Hosking wrote:
> On closer inspection, it is even messier than that. In the old PM3
> the text that is constructed is allocated in UNTRACED storage
> (M3TextWithHeader is an UNTRACED REF RECORD...) so it can't be
> GC'd, and looks (to the GC) like an (old-style) text literal
> allocated outside the heap. It is harder for us to fake up a text
> literal like this with the new TEXT setup since we'd need to
> allocate, copy, and loophole something that looks like a text
> literal (an object) in the untraced heap. The best alternative is
> to fix the pickler with a special that handles Text8CStrings. I
> think this is the cleanest approach. Mika, please try the
> following program, which adds a special for Text8CString. I will
> put this code into the builtin specials for pickles.
>
> On Jul 26, 2007, at 9:47 AM, Tony Hosking wrote:
>
>> Looks like we need to fix M3toC.StoT so that it works the same as
>> old PM3. The old code works because it constructs a text *in* the
>> heap as an ARRAY OF CHAR, that just happens to have its payload
>> (the array contents) outside the heap. (Arrays in M3 contain a
>> reference to their data). We can play the same trick for
>> Text8CString and get things to work properly for you. I will make
>> this fix and check it in.
>>
>>
>> On Jul 26, 2007, at 8:40 AM, Mika Nystrom wrote:
>>
>>> Ok, I am about to answer my own email. Here's a little program I
>>> wrote:
>>>
>>> MODULE Main;
>>> IMPORT Pickle, IO, Params, TextWr, TextRd;
>>>
>>> VAR wr := NEW(TextWr.T).init();
>>>     toPickle := "pickle this";
>>> BEGIN
>>>   IF Params.Count > 1 THEN toPickle := Params.Get(1) END;
>>>
>>>   Pickle.Write(wr,toPickle);
>>>
>>>   IO.Put("pickled \""&toPickle&"\"\n");
>>>   IO.Put("read back \""&
>>>          Pickle.Read(NEW(TextRd.T).init(TextWr.ToText(wr)))
>>>          &"\"\n");
>>> END Main.
>>>
>>> === On FreeBSD4 with my ancient, creaky PM3:
>>>
>>> (64)trs80:~/ptest/src>../FreeBSD4/pickleit
>>> pickled "pickle this"
>>> read back "pickle this"
>>> (65)trs80:~/ptest/src>../FreeBSD4/pickleit pickle\ that
>>> pickled "pickle that"
>>> read back "pickle that"
>>>
>>> === On PPC_DARWIN with the latest CM3:
>>>
>>> [QT:~/ptest/src] mika% ../PPC_DARWIN/pickleit
>>> pickled "pickle this"
>>> read back "pickle this"
>>> [QT:~/ptest/src] mika% ../PPC_DARWIN/pickleit pickle\ that
>>> pickled "pickle that"
>>>
>>>
>>> ***
>>> *** runtime error:
>>> *** Segmentation violation - possible attempt to dereference NIL
>>> ***
>>>
>>> Abort
>>>
>>> === On FreeBSD4 with the latest CM3:
>>>
>>> (73)rover:~/ptest/src>../FreeBSD4/pickleit
>>> pickled "pickle this"
>>> read back "pickle this"
>>> (74)rover:~/ptest/src>../FreeBSD4/pickleit pickle\ that
>>> pickled "pickle that"
>>>
>>>
>>> ***
>>> *** runtime error:
>>> *** Segmentation violation - possible attempt to dereference NIL
>>> ***
>>>
>>> Abort
>>>
>>> ============
>>>
>>> Diagnosis:
>>>
>>> The code I mentioned in a previous email declares a Text8CString.T
>>> to be of type TEXT OBJECT str: Ctypes.char_star; END. Elsewhere
>>> in the system, Ctypes.char_star is specifically declared to be an
>>> "UNTRACED REF Ctypes.char". According to the specification of
>>> Pickle, an UNTRACED REF is pickled as NIL.
>>>
>>> Generally speaking, you don't see many Text8CString.Ts in the
>>> system. This one comes in via Params.Get, which in turn calls
>>> RTArgs.GetArg, which in turn calls M3toC.StoT.
>>>
>>> StoT is generally not something you want to call, but it's supposed
>>> to be OK here because you're just passing around argv, which won't
>>> change. Generally speaking, StoT ought to be OK for converting
>>> C strings whose addresses aren't going to change during the lifetime
>>> of the program.
>>>
>>> I think it is *totally unacceptable* that Params.Get returns a
>>> "poisoned" TEXT. There is absolutely nothing in any of the
>>> interfaces that warns of this fact, and it is surprising to say
>>> the least.
>>> There is also no simple way of "copying" TEXTs, as it shouldn't
>>> ever be required! Finally, there's no way of checking whether a
>>> TEXT is a Text8CString.T without importing Text8CString, which is
>>> an UNSAFE INTERFACE, which of course is illegal in a safe MODULE!!
>>>
>>> What does baffle me a bit is that the code works on the old PM3.
>>> It uses the old TextF, which declares
>>>
>>> TEXT = BRANDED Text.Brand REF ARRAY OF CHAR;
>>>
>>> The old (from PM3, I think it's 1.1.15) M3toC has:
>>>
>>> PROCEDURE StoT (s: Ctypes.char_star): TEXT =
>>>   VAR t := NEW (M3TextWithHeader);
>>>   BEGIN
>>>     t.header.typecode := RT0.TextTypecode;
>>>     t.body.chars := LOOPHOLE (s, ADDRESS);
>>>     t.body.length := 1 + Cstring.strlen (s);
>>>     RETURN LOOPHOLE (ADR (t.body), TEXT);
>>>   END StoT;
>>>
>>> I'm not entirely sure why the old code works---why isn't the
>>> M3TextWithHeader garbage-collected? The pickler doesn't seem to
>>> know that the result of LOOPHOLE (ADR (t.body), TEXT) is special.
>>> In fact the pickler doesn't seem to know anything about TEXTs
>>> at all.
>>>
>>> I see several possible ways of solving the problem. One is to
>>> permit M3toC.StoT to remain broken (since M3toC is an UNSAFE
>>> interface, there's no "legal" reason not to do that) and make sure
>>> Params.Get (and everything else that could remotely be called
>>> "standard") doesn't use M3toC.StoT---oh and to leave some very
>>> prominent warning signs, both in M3toC and Text8CString that "here
>>> be demons". Another is to revert to the SRC method of LOOPHOLEing
>>> the C strings. I never liked the CM3 TEXTs, and I like them even
>>> less now; I disliked them from the start because they broke pickle
>>> compatibility with SRC M3, and now I find that they aren't even
>>> compatible with themselves.
>>>
>>> Modula-3's strengths have always been its utter simplicity
>>> and bullet-proof reliability. This stuff, where some objects are
>>> more "serializable" than others, reminds me of Java!
>>>
>>> Does anyone have an opinion as to how this problem ought to be
>>> solved?
>>>
>>> Mika
>>>
>>> Mika Nystrom writes:
>>>> Hello everyone,
>>>>
>>>> Here's a random question that hopefully someone can answer...
>>>>
>>>> Is it true that certain TEXTs can't be Pickled under CM3?
>>>>
>>>> Unless there's some magic I don't understand going on, I am
>>>> thinking that would be the implication of:
>>>>
>>>> UNSAFE INTERFACE Text8CString;
>>>> ...
>>>> TYPE
>>>>   T <: Public;
>>>>   Public = TEXT OBJECT
>>>>     str: Ctypes.char_star;
>>>>   END;
>>>>   (* The array contains the characters of the text followed by a
>>>>      '\000'. *)
>>>> ...
>>>>
>>>> I hope someone can set me straight and tell me that no, the
>>>> situation isn't as dire as that---some clever thing somewhere
>>>> makes it possible to Pickle all objects of type TEXT...
>>>>
>>>> Mika