[M3devel] warning for uninitialized variables?

Tony Hosking hosking at cs.purdue.edu
Wed Jun 2 16:41:47 CEST 2010


Hear, hear!

On 2 Jun 2010, at 08:44, Mika Nystrom wrote:

> Jay K writes:
>> 
>> Wow. You really surprise me Mika.
>> 
>> 
>> To warn on this kind of thing, to me, there is no question.
>> 
>> 
>> Limiting the effects by virtue of being safe isn't much consolation.
>> The program can easily go down a "safe" but very incorrect path.
> 
> It is, though!  Do you always know what value to initialize the
> variables to?  If not, you're just picking some value to shut up
> the compiler, and now your code no longer expresses---to a human
> reader---that x is uninitialized.
> 
> Remember, code is mainly written for humans!  Computers don't care
> about fancy structuring, declarations, etc.  When you write a program,
> you write it to be read by a human.  The fact that a variable is
> uninitialized is a part of the design of the program.
> 
> It's an unfortunate fact that our languages don't have all the 
> mechanisms necessary to deal properly with uninitialized variables.
> Retrofitting mechanisms to something that just isn't expressive
> enough to solve the problem I really don't think is the way to go.
> 
> EWD's book chapter... read it.  "An essay on the scope of variables"
> I think it is called.
> 
> 
>> 
>> 
>> There are several forms.
>> In fact, sometimes the compiler says something "is" used uninitialized.
>> Sometimes it says "maybe".
>> 
>> 
>> In your example, generally, once the address is taken of a variable,
>> the compiler gives up. Esp. if that address is passed to a function it
>> doesn't "see".
> 
> As long as this doesn't raise warnings I can live with your warnings...
> 
> ...
>> 
>> 
>> There are many more warnings in the tree than I have yet looked at.
>> I've seen a few forms.
>> Some are because the compiler doesn't know that some functions don't return.
>> I'll see about improving that. Though I think there's a limit to how good I
>> can manage, unless we add a <* noreturn *> pragma.
> 
> adding <*ASSERT FALSE*> in the right places might fix these warnings.
> 
> ...
>> 
>> 
>> I grant that by initializing to 0, I haven't necessarily made the code
>> correct either.
>> But at least it is now consistent.
>> 
>> 
>> Even better would be runtime checks that stop if a variable is read before
>> written.
> 
> Which you have just defeated with your initialization!
> 
> I know that my Modula code will outlive its current compiler.  Probably
> by a long time.  Once you add the initialization you're talking about,
> you can't go back.  It no longer expresses the useful property you're
> trying to check.  When the new compiler comes along with its runtime
> check, it will miss that my variables are actually uninitialized.
> 
>> 
>> 
>> See also the rules in Java and C# that require the compiler to be able to
>> prove a variable is written before read. Possibly even verifiable at load time.
> 
> I *hate* this.  See above.  
> 
> What's next?  The compiler refuses to compile loops that it can't
> prove will terminate???  Hogwash!
> 
>> 
>> 
>> Other forms look related to going "through" a switch statement with no arms
>> handling the value.
>> I think I can address that by marking fault_proc as noreturn.
>> 
>> 
>> One form apparently in cm3ide would use uninitialized data if presented with
>> a malformed file.
>> Good software doesn't depend on the well formedness of its input.
>> 
>> There are many more warnings like this in the tree.
>> 
>> 
>> My inclination is at least to temporarily hardcode the gcc backend to always
>> optimize, and always produce these warnings. People can look over them maybe
>> and decide.
>> Or they can look over my "fixes".
>> 
>> 
>> Or I guess if people really really really really prefer, we can always turn
>> off the warnings
>> and let the code lie. I really think that is the wrong answer.
>> 
>> 
>> I have been burned too much by uninitialized locals in C.
>> I put " = { 0 }" after everything.
>> Again, that isn't necessarily "correct", but at least the incorrect paths
>> are consistent.
>> If they don't work and I happen down them, they will guaranteeably not work.
> 
> No they will just consistently not work.  Not "guaranteeably".  You have
> to initialize them to some specific value for that to be the case.
> Using "0" often will just give you the wrong answer!  
> 
>> 
>> 
>> Uninitialized values can be different run to run.
>> Repeatability and consistency are important.
> 
> Not forcing the programmer to obscure the meaning of the code I think
> is more important.
> 
>> 
>> 
>> I'm also nervous about us not taking a hard line on integer overflow.
>> Overflown math doesn't necessarily lead to array out of bounds soon or ever.
>> It too can lead to incorrect but safe behavior.
> 
> Similar situation is it not?  You don't generate a compiler warning
> for math that "might overflow".  
> 
> I think we can all agree that error on integer overflow, error on
> use of uninitialized variable, at runtime, would both be good things?
> 
>     Mika
> 
>> 
>> 
>> - Jay
>> 
>> 
>> ----------------------------------------
>>> To: jay.krell at cornell.edu
>>> Date: Tue, 1 Jun 2010 20:52:26 -0700
>>> From: mika at async.async.caltech.edu
>>> CC: m3devel at elegosoft.com
>>> Subject: Re: [M3devel] [M3commit] CVS Update: cm3
>>> 
>>> 
>>> Safety is important in that it allows you to limit the possible effects
>>> of bugs. Getting programs bug-free is of course a very difficult problem
>>> no matter what you do.
>>> 
>>> But no I don't think warnings for uninitialized use of variables is
>>> something one should do in general in Modula-3. The safety guarantees
>>> lead to different idioms from in C---not so many compiler warnings
>>> are required to get your code right. And if you do screw up
>>> an uninitialized variable, the effect is going to be limited.
>>> Unlike what happens in C, where if you screw up the initialization
>>> of a pointer, for instance, all hell breaks loose.
>>> 
>>> Your example is contrived. Usually the code looks like this
>>> 
>>> VAR
>>> x : T;
>>> BEGIN
>>> IF cond1 THEN x := ... END;
>>> ...
>>> IF cond2 THEN (* use x *) END
>>> END
>>> 
>>> where the programmer knows that cond2 logically implies cond1.
>>> 
>>> I think the presence of VAR parameter passing makes these sorts of
>>> warnings also less useful in Modula-3. Is the following OK?
>>> 
>>> PROCEDURE InitializeIt(VAR a : T);
>>> PROCEDURE UseIt(VALUE a : T);
>>> 
>>> VAR x : T;
>>> BEGIN
>>> InitializeIt(x);
>>> UseIt(x)
>>> END
>>> 
>>> I would think your compiler can't prove that x is initialized. Warning
>>> or not? I say no: this is actually very reasonable Modula-3 code.
>>> But then do you want a warning for the IF? It's logically the same.
>>> 
>>> There's a chapter in
>>> EWD's Discipline of Programming that deals with the problem in
>>> detail. I think he winds up with six different "modes" for variables.
>>> 
>>> Mika
>>> 
>>> Jay K writes:
>>>> 
>>>> ok, so in C:
>>>>
>>>> int F()
>>>> {
>>>> int i;
>>>> return i;
>>>> }
>>>>
>>>> should warn or not?
>>>> Prevailing wisdom is definitely.
>>>> Main known exception is code to generate random numbers.
>>>> I believe this is how Debian broke key generation.
>>>>
>>>> Modula-3:
>>>>
>>>> PROCEDURE F(): INTEGER =
>>>> VAR i: INTEGER;
>>>> BEGIN
>>>> RETURN i;
>>>> END F;
>>>>
>>>> Should warn or not?
>>>> Since this is identical to the C, prevailing wisdom is definitely.
>>>>
>>>> They are, I guess, "safe", but most likely, incorrect.
>>>>
>>>> The compiler may have made "safety" guarantees, and they are significant,
>>>> but safe is far from correct, and however smart the compiler can be to
>>>> look for correctness issues, is also nice.
>>>>
>>>> (Friend of mine conjectured something like: Safety guarantees have people
>>>> deluded. Software will still have plenty of bugs and be plenty difficult
>>>> to get correct and require plenty of testing. Safety guarantees help, but
>>>> they are only a small step toward actual correctness.)
>>>>
>>>> - Jay
>>>> 
>>>> 
>>>> ----------------------------------------
>>>>> Subject: Re: [M3commit] CVS Update: cm3
>>>>> From: hosking at cs.purdue.edu
>>>>> Date: Tue, 1 Jun 2010 20:04:00 -0400
>>>>> CC: jkrell at elego.de; m3commit at elegosoft.com
>>>>> To: jay.krell at cornell.edu
>>>>> 
>>>>> Sure, an INTEGER is a valid value whatever the bits.
>>>>> 
>>>>> 
>>>>> On 1 Jun 2010, at 17:44, Jay K wrote:
>>>>> 
>>>>>> 
>>>>>> Start removing the rampant use of volatile in the backend and these
>>>>>> warnings show up.
>>>>>> 
>>>>>> Volatile quashes the uninitialized checks in the backend.
>>>>>> 
>>>>>> Is it really ok for an INTEGER to be uninitialized? I realize it
>>>>>> contains an "integer" value, as all bit patterns are.
>>>>>> 
>>>>>> Some of these really do seem like bugs. Some do not.
>>>>>> I'll try making fault_proc noreturn, which should remove some of them.
>>>>>> 
>>>>>> 
>>>>>> - Jay
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> ----------------------------------------
>>>>>>> From: hosking at cs.purdue.edu
>>>>>>> To: jkrell at elego.de
>>>>>>> Date: Tue, 1 Jun 2010 16:29:20 -0500
>>>>>>> CC: m3commit at elegosoft.com
>>>>>>> Subject: Re: [M3commit] CVS Update: cm3
>>>>>>> 
>>>>>>> This is bogus. The M3 compiler guarantees all variables are initialized.
>>>>>>> 
>>>>>>> Sent from my iPhone
>>>>>>> 
>>>>>>> On Jun 1, 2010, at 2:42 PM, jkrell at elego.de (Jay Krell) wrote:
>>>>>>> 
>>>>>>>> CVSROOT: /usr/cvs
>>>>>>>> Changes by: jkrell at birch. 10/06/01 14:42:00
>>>>>>>> 
>>>>>>>> Modified files:
>>>>>>>> cm3/m3-libs/m3core/src/convert/: Convert.m3
>>>>>>>> 
>>>>>>>> Log message:
>>>>>>>> initialize locals; I get warnings that some not quite all, are
>>>>>>>> used uninitialized if I remove the volatile/sideeffects on every
>>>>>>>> load/store in parse.c
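
The one point both sides of the thread agree on is that a runtime error on
reading a variable before it is written would be a good thing. What follows
is only a minimal sketch of that idea in C, under stated assumptions: the
tracked_int type and the tracked_unset, tracked_write, tracked_read and use_x
names are hypothetical, made up for this illustration, and are not part of
cm3, the gcc backend, or any library. It keeps the "x is uninitialized" fact
visible in the code instead of hiding it behind "= 0", and reports a
read-before-write through the return value rather than aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper: an int plus a flag recording whether it was written. */
typedef struct {
    int  value;
    bool is_set;
} tracked_int;

/* States, in code, that the variable is deliberately left unset. */
static tracked_int tracked_unset(void) {
    tracked_int v = { 0, false };
    return v;
}

/* Every write goes through here so the flag stays accurate. */
static void tracked_write(tracked_int *v, int value) {
    assert(v != NULL);
    v->value = value;
    v->is_set = true;
}

/* Returns false, leaving *out untouched, if v was never written. */
static bool tracked_read(const tracked_int *v, int *out) {
    assert(v != NULL && out != NULL);
    if (!v->is_set) {
        return false;
    }
    *out = v->value;
    return true;
}

/* The shape from the thread: x is written under cond1, read under cond2.
 * Returns false only when x would have been read before being written. */
static bool use_x(bool cond1, bool cond2, int *result) {
    tracked_int x = tracked_unset();
    *result = -1;                      /* sentinel meaning "x was not used" */
    if (cond1) {
        tracked_write(&x, 42);
    }
    if (cond2) {
        return tracked_read(&x, result);
    }
    return true;
}
```

When cond2 really does imply cond1, use_x always returns true and the check
costs one flag test. If that assumption is ever broken, the call reports it
on every run, which is the repeatability Jay asks for, without picking an
arbitrary initial value.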



