[M3devel] warning for uninitialized variables?

Mika Nystrom mika at async.async.caltech.edu
Wed Jun 2 14:44:22 CEST 2010


Jay K writes:
>
>Wow. You really surprise me Mika.
>
>
>To warn on this kind of thing, to me, there is no question.
>
>
>Limiting the effects by virtue of being safe isn't much consolation.
>The program can easily go down a "safe" but very incorrect path.

It is, though!  Do you always know what value to initialize the
variables to?  If not, you're just picking some value to shut up
the compiler, and now your code no longer expresses---to a human
reader---that x is uninitialized.

Remember, code is mainly written for humans!  Computers don't care
about fancy structuring, declarations, etc.  When you write a program,
you write it to be read by a human.  The fact that a variable is
uninitialized is a part of the design of the program.

It's an unfortunate fact that our languages don't have all the 
mechanisms necessary to deal properly with uninitialized variables.
Retrofitting mechanisms to something that just isn't expressive
enough to solve the problem I really don't think is the way to go.

EWD's book chapter... read it.  "An essay on the scope of variables"
I think it is called.


>
>
>There are several forms.
>In fact, sometimes the compiler says something "is" used uninitialized.
>Sometimes it says "maybe".
>
>
>In your example, generally, once the address is taken of a variable,
>the compiler gives up. Esp. if that address is passed to a function it
>doesn't "see".

As long as this doesn't raise warnings I can live with your warnings...

...
>
>
>There are many more warnings in the tree than I have yet looked at.
>I've seen a few forms.
>Some are because the compiler doesn't know that some functions don't return.
>I'll see about improving that. Though I think there's a limit to how good I
>can manage,
>unless we add a <* noreturn *> pragma.

adding <*ASSERT FALSE*> in the right places might fix these warnings.
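
To put the noreturn point in C terms (the warnings in question come from the gcc backend), here is a minimal sketch assuming a C11 compiler. The names fault and scale are hypothetical and not taken from the cm3 tree. The idea is only that once the compiler knows a fault routine never returns, it can see that x is assigned on every path that reaches the return, so no dummy initializer is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fault routine; _Noreturn (C11) tells the
   compiler that control never comes back from this call. */
static _Noreturn void fault(const char *msg)
{
    assert(msg != NULL);
    fprintf(stderr, "fault: %s\n", msg);
    abort();
}

/* Hypothetical example: x has no dummy initializer, yet every path that
   reaches the return has assigned it, because the default arm cannot
   fall out of the switch. */
static int scale(int kind)
{
    int x;
    switch (kind) {
    case 0:  x = 1;  break;
    case 1:  x = 10; break;
    default: fault("unhandled kind");
    }
    return x;
}
```

Without the _Noreturn marking, a compiler that cannot see into fault has to assume the default arm may fall through to the return with x unset.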

...
>
>
>I grant that by initializing to 0, I haven't necessarily made the code
>correct either.
>But at least it is now consistent.
>
>
>Even better would be runtime checks that stop if a variable is read before
>written.

Which you have just defeated with your initialization!

I know that my Modula code will outlive its current compiler.  Probably
by a long time.  Once you add the initialization you're talking about,
you can't go back.  It no longer expresses the useful property you're
trying to check.  When the new compiler comes along with its runtime
check, it will miss that my variables are actually uninitialized.
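
A rough C sketch of that point, assuming a hypothetical runtime check that tags each variable with a "written" flag. tracked_int and its helpers are made up for illustration and are not any compiler's actual mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical checked variable: the flag records whether any write happened. */
typedef struct {
    bool written;
    int  value;
} tracked_int;

/* Declare a variable in the "never written" state. */
static tracked_int tracked_new(void)
{
    tracked_int t = { false, 0 };
    return t;
}

static void tracked_write(tracked_int *t, int v)
{
    assert(t != NULL);
    t->value = v;
    t->written = true;
}

/* Returns true and stores the value in *out only if t was written first;
   a real runtime check would stop the program instead of returning false. */
static bool tracked_read(const tracked_int *t, int *out)
{
    assert(t != NULL && out != NULL);
    if (!t->written)
        return false;
    *out = t->value;
    return true;
}
```

In this sketch, a read of a fresh tracked_new() value is rejected. After tracked_write(&x, 0), the "shut up the compiler" initialization, the same read succeeds, and the check can no longer tell that no meaningful value was ever assigned.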

>
>
>See also the rules in Java and C# that require the compiler to be able to
>prove
>a variable is written before read. Possibly even verifiable at load time.

I *hate* this.  See above.  

What's next?  The compiler refuses to compile loops that it can't
prove will terminate???  Hogwash!

>
>
>Other forms look related to going "through" a switch statement with no arms
>handling the value.
>I think I can address that by marking fault_proc as noreturn.
>
>
>One form apparently in cm3ide would use uninitialized data if presented with
>a malformed file.
>Good software doesn't depend on the well formedness of its input.
>
>There are many more warnings like this in the tree.
>
>
>My inclination is at least to temporarily hardcode the gcc backend to always
>optimize,
>and always produce these warnings. People can look over them maybe and
>decide.
>Or they can look over my "fixes".
>
>
>Or I guess if people really really really really prefer, we can always turn
>off the warnings
>and let the code lie. I really think that is the wrong answer.
>
>
>I have been burned too much by uninitialized locals in C.
>I put " = { 0 }" after everything.
>Again, that isn't necessarily "correct", but at least the incorrect paths
>are consistent.
>If they don't work and I happen down them, they will guaranteeably not work.

No, they will just consistently not work.  Not "guaranteeably".  You have
to initialize them to some specific value for that to be the case.
Using "0" often will just give you the wrong answer!  

>
>
>Uninitialized values can be different run to run.
>Repeatability and consistency are important.

Not forcing the programmer to obscure the meaning of the code I think
is more important.

>
>
>I'm also nervous about us not taking a hard line on integer overflow.
>Overflown math doesn't necessarily lead to array out of bounds soon or ever.
>It too can lead to incorrect but safe behavior.

Similar situation, is it not?  You don't generate a compiler warning
for math that "might overflow".  

I think we can all agree that error on integer overflow, error on
use of uninitialized variable, at runtime, would both be good things?
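
As a minimal sketch of what a runtime error on integer overflow could check, here is a C helper. checked_add is a hypothetical name, not part of any existing runtime. It tests the operands against the int limits before adding, so the overflow is reported rather than silently producing a wrong but "safe" value.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: adds a and b, reporting overflow instead of wrapping.
   Returns true and stores the sum in *out when it fits in an int;
   returns false and leaves *out untouched otherwise. */
static bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

A runtime that takes the hard line would stop the program where this sketch returns false.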

     Mika

>
>
> - Jay
>
>
>----------------------------------------
>> To: jay.krell at cornell.edu
>> Date: Tue, 1 Jun 2010 20:52:26 -0700
>> From: mika at async.async.caltech.edu
>> CC: m3devel at elegosoft.com
>> Subject: Re: [M3devel] [M3commit] CVS Update: cm3
>>
>>
>> Safety is important in that it allows you to limit the possible effects
>> of bugs. Getting programs bug-free is of course a very difficult problem
>> no matter what you do.
>>
>> But no I don't think warnings for uninitialized use of variables is
>> something one should do in general in Modula-3. The safety guarantees
>> lead to different idioms from in C---not so many compiler warnings
>> are required to get your code right. And if you do screw up
>> an uninitialized variable, the effect is going to be limited.
>> Unlike what happens in C, where if you screw up the initialization
>> of a pointer, for instance, all hell breaks loose.
>>
>> Your example is contrived. Usually the code looks like this
>>
>> VAR
>>   x : T;
>> BEGIN
>>   IF cond1 THEN x := ... END;
>>   ...
>>   IF cond2 THEN (* use x *) END
>> END
>>
>> where the programmer knows that cond2 logically implies cond1.
>>
>> I think the presence of VAR parameter passing makes these sorts of
>> warnings also less useful in Modula-3. Is the following OK?
>>
>> PROCEDURE InitializeIt(VAR a : T);
>> PROCEDURE UseIt(VALUE a : T);
>>
>> VAR x : T;
>> BEGIN
>>   InitializeIt(x);
>>   UseIt(x)
>> END
>>
>> I would think your compiler can't prove that x is initialized. Warning
>> or not? I say no: this is actually very reasonable Modula-3 code.
>> But then do you want a warning for the IF? It's logically the same.
>>
>> There's a chapter in
>> EWD's Discipline of Programming that deals with the problem in
>> detail. I think he winds up with six different "modes" for variables.
>>
>> Mika
>>
>> Jay K writes:
>>>
>>>ok, so in C:
>>>
>>>int F()
>>>{
>>> int i;
>>> return i;
>>>}
>>>
>>>should warn or not?
>>>Prevailing wisdom is definitely.
>>>Main known exception is code to generate random numbers.
>>> I believe this is how Debian broke key generation.
>>>
>>>
>>>Modula-3:
>>>
>>>
>>>PROCEDURE F(): INTEGER =
>>> VAR i: INTEGER;
>>>BEGIN
>>> RETURN i;
>>>END F;
>>>
>>>
>>>Should warn or not?
>>>Since this is identical to the C, prevailing wisdom is definitely.
>>>
>>>
>>>They are, I guess, "safe", but most likely, incorrect.
>>>
>>>
>>>The compiler may have made "safety" guarantees, and they are significant,
>>>but safe is far from correct, and however smart the compiler can be to
>>>look for correctness issues, is also nice.
>>>
>>>
>>>
>>>(Friend of mine conjectured something like: Safety guarantees have people
>>>deluded. Software will still have plenty of bugs and be plenty difficult
>>>to get correct and require plenty of testing. Safety guarantees help, but
>>>they are only a small step toward actual correctness.)
>>>
>>>
>>>
>>> - Jay
>>>
>>>
>>>----------------------------------------
>>>> Subject: Re: [M3commit] CVS Update: cm3
>>>> From: hosking at cs.purdue.edu
>>>> Date: Tue, 1 Jun 2010 20:04:00 -0400
>>>> CC: jkrell at elego.de; m3commit at elegosoft.com
>>>> To: jay.krell at cornell.edu
>>>>
>>>> Sure, an INTEGER is a valid value whatever the bits.
>>>>
>>>>
>>>> On 1 Jun 2010, at 17:44, Jay K wrote:
>>>>
>>>>>
>>>>> Start removing the rampant use of volatile in the backend and these
>>>>> warnings show up.
>>>>>
>>>>> Volatile quashes the uninitialized checks in the backend.
>>>>>
>>>>> Is it really ok for an INTEGER to be uninitialized? I realize it
>>>>> contains an "integer" value, as all bit patterns are.
>>>>>
>>>>> Some of these really do seem like bugs. Some do not.
>>>>> I'll try making fault_proc noreturn, which should remove some of them.
>>>>>
>>>>>
>>>>> - Jay
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> ----------------------------------------
>>>>>> From: hosking at cs.purdue.edu
>>>>>> To: jkrell at elego.de
>>>>>> Date: Tue, 1 Jun 2010 16:29:20 -0500
>>>>>> CC: m3commit at elegosoft.com
>>>>>> Subject: Re: [M3commit] CVS Update: cm3
>>>>>>
>>>>>> This is bogus. The M3 compiler guarantees all variables are
>>>>>> initialized.
>>>>>>
>>>>>> Sent from my iPhone
>>>>>>
>>>>>> On Jun 1, 2010, at 2:42 PM, jkrell at elego.de (Jay Krell) wrote:
>>>>>>
>>>>>>> CVSROOT: /usr/cvs
>>>>>>> Changes by: jkrell at birch. 10/06/01 14:42:00
>>>>>>>
>>>>>>> Modified files:
>>>>>>> cm3/m3-libs/m3core/src/convert/: Convert.m3
>>>>>>>
>>>>>>> Log message:
>>>>>>> initialize locals; I get warnings that some not quite all, are
>>>>>>> used uninitialized if I remove the volatile/sideeffects on every
>>>>>>> load/store in parse.c