[M3devel] warning for uninitialized variables?
Jay K
jay.krell at cornell.edu
Wed Jun 2 16:52:55 CEST 2010
Running with a consistent initialized/uninitialized 0 is better than running with a "random" value that you get from no initialization.
They are both "bad".
The 0 is, agreed, misleading, because it might look initialized.
But it is still better than a value that can change from run to run.
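For what it's worth, here is a small C sketch of the "consistent 0" idea. The struct and function names are made up for illustration; only the "= { 0 }" initializer is the point. With it, every member starts at zero (or NULL) on every run, which may still be the wrong value but is at least the same wrong value each time.

```c
#include <stddef.h>

/* Hypothetical record, only here to have something to initialize. */
struct parse_state {
    int count;
    double scale;
    const char *name;
};

/* "= { 0 }" sets the first member to 0 and, per the C rules for
   partially initialized aggregates, zero-initializes the rest. */
struct parse_state make_state(void)
{
    struct parse_state s = { 0 };
    return s;
}
```

Without the initializer, the members of s would hold indeterminate values that can differ from run to run.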
C int (or ptrdiff_t) and Modula-3 INTEGER are equivalent. They both are "valid" for any bit pattern.
C programmers always consider it a bug to use uninitialized values, except in the case I pointed out: looking for entropy to generate random numbers.
>> The fact that a variable is uninitialized is a part of the design of the program.
Not if it can be used uninitialized.
Some of the warnings from gcc are definitive -- the values *are* used uninitialized.
Sometimes it says "is used uninitialized", sometimes it says "may be used uninitialized", and usually it is quiet, either because it knows it isn't used uninitialized or because it is too uncertain.
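A rough C sketch of the "may be used" form follows. The function name is made up, and whether gcc reports anything here depends on its version, the optimization level, and flags such as -Wall, so treat it as an illustration rather than a guaranteed reproduction.

```c
/* The definite form is the trivial one: "int i; return i;" reads i
   on every path before any write.

   The uncertain form: x is written only when cond1 holds and read
   only when cond2 holds. The compiler cannot know whether cond2
   implies cond1, so the most it can say is "may be used
   uninitialized". */
int maybe_uninitialized(int cond1, int cond2)
{
    int x;
    if (cond1)
        x = 42;
    if (cond2)
        return x;   /* defined only if cond1 was also true */
    return -1;
}
```

Calling maybe_uninitialized(0, 1) is exactly the path the warning is about: it reads x without any write having happened.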
Programs are written for humans to read, true, but programs that read programs are much more diligent, energetic, and often smarter (and slowly getting smarter with age, often the opposite of humans).
Compilers really do have a big "leg up" on humans, since they can harness fast digital processors with big storage.
Granted, neither is always better than the other. But compilers/computers have far more capacity for code reading than humans and I know from experience that most code is never read by a human, not even once, until I get to it :), because the bugs I readily see wouldn't survive even a glance...
(and the compilers often fail too, granted).
Again I'm quite surprised we want to leave uninitialized variables that get used uninitialized.
The one in cm3ide for example, if the input is malformed, I believe it uses uninitialized data.
I don't see how this can be deemed anything other than a bug to be fixed, and we should be glad the compiler (finally) found it, now that I'm working on removing the rampant volatile...
If we can invest in runtime checks to immediately catch reads of unwritten data, then I'd favor that instead.
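To make the runtime-check idea concrete, here is a minimal sketch in C. Everything in it (checked_int, CHECKED_INT_INIT, checked_write, checked_read) is a hypothetical name invented for this example; a real implementation would presumably live in the compiler and runtime rather than in user code.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* A value paired with a flag recording whether it was ever written. */
typedef struct {
    int value;
    bool written;
} checked_int;

/* The flag itself has to start out in a known state. */
#define CHECKED_INT_INIT { 0, false }

void checked_write(checked_int *v, int value)
{
    v->value = value;
    v->written = true;
}

/* Stops the program immediately on a read before any write. */
int checked_read(const checked_int *v)
{
    if (!v->written) {
        fprintf(stderr, "read of unwritten variable\n");
        abort();
    }
    return v->value;
}
```

The variable stays "uninitialized" as far as the program's meaning goes, and a read of unwritten data fails at once and in the same way on every run.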
- Jay
----------------------------------------
> From: hosking at cs.purdue.edu
> Date: Wed, 2 Jun 2010 10:41:47 -0400
> To: mika at async.async.caltech.edu
> CC: m3devel at elegosoft.com; jay.krell at cornell.edu
> Subject: Re: [M3devel] warning for uninitialized variables?
>
> Hear, hear!
>
> On 2 Jun 2010, at 08:44, Mika Nystrom wrote:
>
>> Jay K writes:
>>>
>>> Wow. You really surprise me Mika.
>>>
>>>
>>> To warn on this kind of thing, to me, there is no question.
>>>
>>>
>>> Limiting the effects by virtue of being safe isn't much consolation.
>>> The program can easily go down a "safe" but very incorrect path.
>>
>> It is, though! Do you always know what value to initialize the
>> variables to? If not, you're just picking some value to shut up
>> the compiler, and now your code no longer expresses---to a human
>> reader---that x is uninitialized.
>>
>> Remember, code is mainly written for humans! Computers don't care
>> about fancy structuring, declarations, etc. When you write a program,
>> you write it to be read by a human. The fact that a variable is
>> uninitialized is a part of the design of the program.
>>
>> It's an unfortunate fact that our languages don't have all the
>> mechanisms necessary to deal properly with uninitialized variables.
>> Retrofitting mechanisms to something that just isn't expressive
>> enough to solve the problem I really don't think is the way to go.
>>
>> EWD's book chapter... read it. "An essay on the scope of variables"
>> I think it is called.
>>
>>
>>>
>>>
>>> There are several forms.
>>> In fact, sometimes the compiler says something "is" used uninitialized.
>>> Sometimes it says "maybe".
>>>
>>>
>>> In your example, generally, once the address is taken of a variable,
>>> the compiler gives up. Esp. if that address is passed to a function it
>>> doesn't "see".
>>
>> As long as this doesn't raise warnings I can live with your warnings...
>>
>> ...
>>>
>>>
>>> There are many more warnings in the tree than I have yet looked at.
>>> I've seen a few forms.
>>> Some are because the compiler doesn't know that some functions don't return.
>>> I'll see about improving that. Though I think there's a limit to how good I
>>> can manage,
>>> unless we add a <* noreturn *> pragma.
>>
>> adding <*ASSERT FALSE*> in the right places might fix these warnings.
>>
>> ...
>>>
>>>
>>> I grant that by initializing to 0, I haven't necessarily made the code
>>> correct either.
>>> But at least it is now consistent.
>>>
>>>
>>> Even better would be runtime checks that stop if a variable is read before
>>> written.
>>
>> Which you have just defeated with your initialization!
>>
>> I know that my Modula code will outlive its current compiler. Probably
>> by a long time. Once you add the initialization you're talking about,
>> you can't go back. It no longer expresses the useful property you're
>> trying to check. When the new compiler comes along with its runtime
>> check, it will miss that my variables are actually uninitialized.
>>
>>>
>>>
>>> See also the rules in Java and C# that require the compiler to be able to
>>> prove a variable is written before read. Possibly even verifiable at load time.
>>
>> I *hate* this. See above.
>>
>> What's next? The compiler refuses to compile loops that it can't
>> prove will terminate??? Hogwash!
>>
>>>
>>>
>>> Other forms look related to going "through" a switch statement with no arms
>>> handling the value.
>>> I think I can address that by marking fault_proc as noreturn.
>>>
>>>
>>> One form apparently in cm3ide would use uninitialized data if presented
>>> with a malformed file.
>>> Good software doesn't depend on the well formedness of its input.
>>>
>>> There are many more warnings like this in the tree.
>>>
>>>
>>> My inclination is at least to temporarily hardcode the gcc backend to
>>> always optimize,
>>> and always produce these warnings. People can look over them maybe and
>>> decide.
>>> Or they can look over my "fixes".
>>>
>>>
>>> Or I guess if people really really really really prefer, we can always
>>> turn off the warnings
>>> and let the code lie. I really think that is the wrong answer.
>>>
>>>
>>> I have been burned too much by uninitialized locals in C.
>>> I put " = { 0 }" after everything.
>>> Again, that isn't necessarily "correct", but at least the incorrect
>>> paths are consistent.
>>> If they don't work and I happen down them, they will guaranteeably not
>>> work.
>>
>> No they will just consistently not work. Not "guaranteeably". You have
>> to initialize them to some specific value for that to be the case.
>> Using "0" often will just give you the wrong answer!
>>
>>>
>>>
>>> Uninitialized values can be different run to run.
>>> Repeatability and consistency are important.
>>
>> Not forcing the programmer to obscure the meaning of the code I think
>> is more important.
>>
>>>
>>>
>>> I'm also nervous about us not taking a hard line on integer overflow.
>>> Overflown math doesn't necessarily lead to array out of bounds soon or ever.
>>> It too can lead to incorrect but safe behavior.
>>
>> Similar situation is it not? You don't generate a compiler warning
>> for math that "might overflow".
>>
>> I think we can all agree that error on integer overflow, error on
>> use of uninitialized variable, at runtime, would both be good things?
>>
>> Mika
>>
>>>
>>>
>>>  - Jay
>>>
>>>
>>> ----------------------------------------
>>>> To: jay.krell at cornell.edu
>>>> Date: Tue, 1 Jun 2010 20:52:26 -0700
>>>> From: mika at async.async.caltech.edu
>>>> CC: m3devel at elegosoft.com
>>>> Subject: Re: [M3devel] [M3commit] CVS Update: cm3
>>>>
>>>>
>>>> Safety is important in that it allows you to limit the possible effects
>>>> of bugs. Getting programs bug-free is of course a very difficult problem
>>>> no matter what you do.
>>>>
>>>> But no I don't think warnings for uninitialized use of variables is
>>>> something one should do in general in Modula-3. The safety guarantees
>>>> lead to different idioms from in C---not so many compiler warnings
>>>> are required to get your code right. And if you do screw up
>>>> an uninitialized variable, the effect is going to be limited.
>>>> Unlike what happens in C, where if you screw up the initialization
>>>> of a pointer, for instance, all hell breaks loose.
>>>>
>>>> Your example is contrived. Usually the code looks like this
>>>>
>>>> VAR
>>>> x : T;
>>>> BEGIN
>>>> IF cond1 THEN x := ... END;
>>>> ...
>>>> IF cond2 THEN (* use x *) END
>>>> END
>>>>
>>>> where the programmer knows that cond2 logically implies cond1.
>>>>
>>>> I think the presence of VAR parameter passing makes these sorts of
>>>> warnings also less useful in Modula-3. Is the following OK?
>>>>
>>>> PROCEDURE InitializeIt(VAR a : T);
>>>> PROCEDURE UseIt(VALUE a : T);
>>>>
>>>> VAR x : T;
>>>> BEGIN
>>>> InitializeIt(x);
>>>> UseIt(x)
>>>> END
>>>>
>>>> I would think your compiler can't prove that x is initialized. Warning
>>>> or not? I say no: this is actually very reasonable Modula-3 code.
>>>> But then do you want a warning for the IF? It's logically the same.
>>>>
>>>> There's a chapter in
>>>> EWD's Discipline of Programming that deals with the problem in
>>>> detail. I think he winds up with six different "modes" for variables.
>>>>
>>>> Mika
>>>>
>>>> Jay K writes:
>>>>>
>>>>> ok, so in C:
>>>>>
>>>>> int F()
>>>>> {
>>>>> int i;
>>>>> return i;
>>>>> }
>>>>>
>>>>> should warn or not?
>>>>> Prevailing wisdom is definitely.
>>>>> Main known exception is code to generate random numbers.
>>>>> I believe this is how Debian broke key generation.
>>>>>
>>>>>
>>>>> Modula-3:
>>>>>
>>>>>
>>>>> PROCEDURE F(): INTEGER =
>>>>> VAR i: INTEGER;
>>>>> BEGIN
>>>>> RETURN i;
>>>>> END F;
>>>>>
>>>>>
>>>>> Should warn or not?
>>>>> Since this is identical to the C, prevailing wisdom is definitely.
>>>>>
>>>>>
>>>>> They are, I guess, "safe", but most likely, incorrect.
>>>>>
>>>>>
>>>>> The compiler may have made "safety" guarantees, and they are significant,
>>>>> but safe is far from correct, and however smart the compiler can be to
>>>>> look for correctness issues, is also nice.
>>>>>
>>>>>
>>>>>
>>>>> (Friend of mine conjectured something like: Safety guarantees have people
>>>>> deluded. Software will still have plenty of bugs and be plenty difficult to
>>>>> get correct and require plenty of testing. Safety guarantees help, but
>>>>> they are only a small step toward actual correctness.)
>>>>>
>>>>>
>>>>>
>>>>> - Jay
>>>>>
>>>>>
>>>>> ----------------------------------------
>>>>>> Subject: Re: [M3commit] CVS Update: cm3
>>>>>> From: hosking at cs.purdue.edu
>>>>>> Date: Tue, 1 Jun 2010 20:04:00 -0400
>>>>>> CC: jkrell at elego.de; m3commit at elegosoft.com
>>>>>> To: jay.krell at cornell.edu
>>>>>>
>>>>>> Sure, an INTEGER is a valid value whatever the bits.
>>>>>>
>>>>>>
>>>>>> On 1 Jun 2010, at 17:44, Jay K wrote:
>>>>>>
>>>>>>>
>>>>>>> Start removing the rampant use of volatile in the backend and these
>>>>>>> warnings show up.
>>>>>>>
>>>>>>> Volatile quashes the uninitialized checks in the backend.
>>>>>>>
>>>>>>> Is it really ok for an INTEGER to be uninitialized? I realize it
>>>>>>> contains an "integer" value, as all bit patterns are.
>>>>>>>
>>>>>>> Some of these really do seem like bugs. Some do not.
>>>>>>> I'll try making fault_proc noreturn, which should remove some of them.
>>>>>>>
>>>>>>>
>>>>>>> - Jay
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ----------------------------------------
>>>>>>>> From: hosking at cs.purdue.edu
>>>>>>>> To: jkrell at elego.de
>>>>>>>> Date: Tue, 1 Jun 2010 16:29:20 -0500
>>>>>>>> CC: m3commit at elegosoft.com
>>>>>>>> Subject: Re: [M3commit] CVS Update: cm3
>>>>>>>>
>>>>>>>> This is bogus. The M3 compiler guarantees all variables are
>>>>>>>> initialized.
>>>>>>>>
>>>>>>>> Sent from my iPhone
>>>>>>>>
>>>>>>>> On Jun 1, 2010, at 2:42 PM, jkrell at elego.de (Jay Krell) wrote:
>>>>>>>>
>>>>>>>>> CVSROOT: /usr/cvs
>>>>>>>>> Changes by: jkrell at birch. 10/06/01 14:42:00
>>>>>>>>>
>>>>>>>>> Modified files:
>>>>>>>>> cm3/m3-libs/m3core/src/convert/: Convert.m3
>>>>>>>>>
>>>>>>>>> Log message:
>>>>>>>>> initialize locals; I get warnings that some not quite all, are
>>>>>>>>> used uninitialized if I remove the volatile/sideeffects on every
>>>>>>>>> load/store in parse.c