[M3devel] warning for uninitialized variables?

Jay K jay.krell at cornell.edu
Wed Jun 2 06:50:28 CEST 2010


Wow. You really surprise me Mika.


To me, there is no question that we should warn on this kind of thing.


Limiting the effects by virtue of being safe isn't much consolation.
The program can easily go down a "safe" but very incorrect path.


There are several forms.
In fact, sometimes the compiler says something "is" used uninitialized.
Sometimes it says "maybe".


In your example, generally, once the address of a variable is taken,
the compiler gives up, especially if that address is passed to a function it doesn't "see".


However compilers are getting better.
Windows headers are now annotated with "in" and "out", etc. with
a calculus behind them, not just informal human definitions.


So if you pass something uninitialized by pointer to an "in" parameter, you should get a warning/error.
If it is "out", ok.


The annotations can even be checked, if you write to something that is "in"
you should get a warning/error.
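
A rough sketch of the "in"/"out" idea in C. The _In_ and _Out_ names are the ones the Windows headers get from <sal.h>; the fallback defines, and the functions twice, make_answer and demo, are made up for illustration. Without a checker that understands the annotations they are only documentation.

```c
/* On Windows, <sal.h> provides _In_ and _Out_. Elsewhere, define them
   away so the sketch still compiles (assumption: no checker is present,
   so the annotations then act as documentation only). */
#ifndef _In_
#define _In_
#endif
#ifndef _Out_
#define _Out_
#endif

/* Hypothetical helper: reads *value, never writes it. */
static int twice(_In_ const int *value)
{
    return *value * 2;
}

/* Hypothetical helper: writes *result, never reads it. */
static void make_answer(_Out_ int *result)
{
    *result = 21;
}

/* x is uninitialized when its address goes to the "out" parameter, which
   is fine. By the time it reaches the "in" parameter it has been written.
   Swapping the two calls is the case an annotation-aware checker should
   flag. */
static int demo(void)
{
    int x;
    make_answer(&x);
    return twice(&x);
}
```

Calling demo() returns 42 here.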


There are many more warnings in the tree than I have yet looked at.
I've seen a few forms.
Some are because the compiler doesn't know that some functions don't return.
I'll see about improving that, though I think there's a limit to how well I can manage
unless we add a <* noreturn *> pragma.


One form is:
VAR a: INTEGER;
BEGIN
  IF cond THEN a := 1 END;
  ...
  IF cond THEN (* use a *) END
END
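
The same shape as a C sketch, next to the variant that initializes to 0. The function names are made up. Whether gcc actually reports "may be used uninitialized" for the first one depends on the gcc version and optimization level (the analysis behind -Wmaybe-uninitialized needs the optimizer), so treat this as an illustration of the form, not a guaranteed reproduction.

```c
/* Variant 1: the write and the read are guarded by the same condition,
   so the read is only reached after the write. */
static int guarded(int cond)
{
    int a;                  /* deliberately not initialized */
    if (cond)
        a = 1;
    /* ... unrelated work ... */
    if (cond)
        return a;
    return -1;
}

/* Variant 2: same logic, but a starts at 0. If the two conditions ever
   drift apart, the read sees 0 on every run instead of stack garbage. */
static int guarded_init(int cond)
{
    int a = 0;
    if (cond)
        a = 1;
    /* ... unrelated work ... */
    if (cond)
        return a;
    return -1;
}
```

Both return 1 when cond is nonzero and -1 otherwise; they only differ in what a wrong path would read.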
 

When the compiler is really unsure, it doesn't bother saying anything.


If code is lucky, it gets one smart human to study it carefully when it is first written.
However all code is lucky to have a smart and generally getting smarter compiler
look at it repeatedly.
We shouldn't be so quick to throw away this tool: the gcc optimizer and its data flow analysis.


I grant that by initializing to 0, I haven't necessarily made the code correct either.
But at least it is now consistent.


Even better would be runtime checks that stop if a variable is read before written.
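
One way such a runtime check could look, sketched by hand in C. Nothing here is an existing compiler feature; checked_int, CHECKED_INT_UNSET, checked_write and checked_read are made-up names, and a real implementation would presumably be generated by the compiler rather than written out like this.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: an int plus a flag recording whether it was written. */
typedef struct {
    int  value;
    bool written;
} checked_int;

/* Initializer for a variable that has not been written yet. */
#define CHECKED_INT_UNSET { 0, false }

/* Store a value and remember that the variable now holds one. */
static void checked_write(checked_int *v, int value)
{
    v->value = value;
    v->written = true;
}

/* Return the value, or stop the program if it was never written. */
static int checked_read(const checked_int *v)
{
    if (!v->written) {
        fprintf(stderr, "read of unwritten variable\n");
        abort();
    }
    return v->value;
}
```

A variable declared as `checked_int v = CHECKED_INT_UNSET;` aborts on checked_read(&v) until checked_write(&v, ...) has run.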


See also the rules in Java and C# that require the compiler to be able to prove
a variable is written before read. Possibly even verifiable at load time.


Other forms look related to going "through" a switch statement with no arms handling the value.
I think I can address that by marking fault_proc as noreturn.
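
A C sketch of the switch form and what a noreturn marking buys. The fault function below is a made-up stand-in, not the backend's fault_proc, and width_of is hypothetical. The attribute syntax is the gcc/clang one.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a fault routine. The attribute tells the
   compiler that control never comes back from this call. */
__attribute__((noreturn))
static void fault(const char *why)
{
    fprintf(stderr, "fault: %s\n", why);
    abort();
}

/* Every arm either writes width or ends in fault(). With the noreturn
   attribute the compiler can see that the return is only reached with
   width written. Without it, the default arm looks like a path that
   falls through to the return with width unset. */
static int width_of(int kind)
{
    int width;
    switch (kind) {
    case 0:
        width = 1;
        break;
    case 1:
        width = 4;
        break;
    default:
        fault("unhandled kind");
    }
    return width;
}
```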


One form apparently in cm3ide would use uninitialized data if presented with a malformed file.
Good software doesn't depend on the well formedness of its input.

There are many more warnings like this in the tree.


My inclination is at least to temporarily hardcode the gcc backend to always optimize,
and always produce these warnings. People can look over them maybe and decide.
Or they can look over my "fixes".


Or I guess if people really really really really prefer, we can always turn off the warnings
and let the code lie. I really think that is the wrong answer.


I have been burned too much by uninitialized locals in C.
I put " = { 0 }" after everything.
Again, that isn't necessarily "correct", but at least the incorrect paths are consistent.
If they don't work and I happen down them, they will reliably not work.
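
For reference, what " = { 0 }" does on an aggregate in C: the first member is set to 0 and the remaining members are initialized as if they had static storage, so numbers become zero and pointers become null. The stats type and make_stats are made up for illustration.

```c
#include <stddef.h>

/* Hypothetical aggregate with members of different kinds. */
typedef struct {
    int         count;
    double      total;
    const char *label;
} stats;

/* Every member starts out zero or null, on every run. */
static stats make_stats(void)
{
    stats s = { 0 };
    return s;
}
```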


Uninitialized values can be different run to run.
Repeatability and consistency are important.


I'm also nervous about us not taking a hard line on integer overflow.
Overflowed arithmetic doesn't necessarily lead to array out of bounds soon or ever.
It too can lead to incorrect but safe behavior.
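
As an illustration of what a hard line could look like at the C level, here is a sketch using the gcc/clang builtin __builtin_add_overflow. The wrapper name checked_add is made up, and this says nothing about how the Modula-3 compiler or runtime would do it.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores a + b in *sum when the result fits in an int.
   Returns false when the addition would overflow, so the caller can
   stop instead of continuing with a wrapped value. */
static bool checked_add(int a, int b, int *sum)
{
    return !__builtin_add_overflow(a, b, sum);
}
```

checked_add(2, 3, &s) succeeds with s == 5, while checked_add(INT_MAX, 1, &s) reports failure.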


 - Jay


----------------------------------------
> To: jay.krell at cornell.edu
> Date: Tue, 1 Jun 2010 20:52:26 -0700
> From: mika at async.async.caltech.edu
> CC: m3devel at elegosoft.com
> Subject: Re: [M3devel] [M3commit] CVS Update: cm3
>
>
> Safety is important in that it allows you to limit the possible effects
> of bugs. Getting programs bug-free is of course a very difficult problem
> no matter what you do.
>
> But no I don't think warnings for uninitialized use of variables is
> something one should do in general in Modula-3. The safety guarantees
> lead to different idioms from in C---not so many compiler warnings
> are required to get your code right. And if you do screw up
> an uninitialized variable, the effect is going to be limited.
> Unlike what happens in C, where if you screw up the initialization
> of a pointer, for instance, all hell breaks loose.
>
> Your example is contrived. Usually the code looks like this
>
> VAR
> x : T;
> BEGIN
> IF cond1 THEN x := ... END;
> ...
> IF cond2 THEN (* use x *) END
> END
>
> where the programmer knows that cond2 logically implies cond1.
>
> I think the presence of VAR parameter passing makes these sorts of
> warnings also less useful in Modula-3. Is the following OK?
>
> PROCEDURE InitializeIt(VAR a : T);
> PROCEDURE UseIt(VALUE a : T);
>
> VAR x : T;
> BEGIN
> InitializeIt(x);
> UseIt(x)
> END
>
> I would think your compiler can't prove that x is initialized. Warning
> or not? I say no: this is actually very reasonable Modula-3 code.
> But then do you want a warning for the IF? It's logically the same.
>
> There's a chapter in
> EWD's Discipline of Programming that deals with the problem in
> detail. I think he winds up with six different "modes" for variables.
>
> Mika
>
> Jay K writes:
>>
>>ok, so in C:
>>
>>int F()
>>{
>> int i;
>> return i;
>>}
>>
>>should warn or not?
>>Prevailing wisdom is definitely.
>>Main known exception is code to generate random numbers.
>> I believe this is how Debian broke key generation.
>>
>>
>>Modula-3:
>>
>>
>>PROCEDURE F(): INTEGER =
>> VAR i: INTEGER;
>>BEGIN
>> RETURN i;
>>END F;
>>
>>
>>Should warn or not?
>>Since this is identical to the C, prevailing wisdom is definitely.
>>
>>
>>They are, I guess, "safe", but most likely, incorrect.
>>
>>
>>The compiler may have made "safety" guarantees, and they are significant,
>>but safe is far from correct, and however smart the compiler can be to
>>look for correctness issues, is also nice.
>>
>>
>>
>>(Friend of mine conjectured something like: Safety guarantees have people
>>deluded. Software will still have plenty of bugs and be plenty difficult to
>>get correct and require plenty of testing. Safety guarantees help, but
>>they are only a small step toward actual correctness.)
>>
>>
>>
>> - Jay
>>
>>
>>----------------------------------------
>>> Subject: Re: [M3commit] CVS Update: cm3
>>> From: hosking at cs.purdue.edu
>>> Date: Tue, 1 Jun 2010 20:04:00 -0400
>>> CC: jkrell at elego.de; m3commit at elegosoft.com
>>> To: jay.krell at cornell.edu
>>>
>>> Sure, an INTEGER is a valid value whatever the bits.
>>>
>>>
>>> On 1 Jun 2010, at 17:44, Jay K wrote:
>>>
>>>>
>>>> Start removing the rampant use of volatile in the backend and these warnings show up.
>>>>
>>>> Volatile quashes the uninitialized checks in the backend.
>>>>
>>>> Is it really ok for an INTEGER to be uninitialized? I realize it contains an "integer" value, as all bit patterns are.
>>>>
>>>> Some of these really do seem like bugs. Some do not.
>>>> I'll try making fault_proc noreturn, which should remove some of them.
>>>>
>>>>
>>>> - Jay
>>>>
>>>>
>>>>
>>>>
>>>> ----------------------------------------
>>>>> From: hosking at cs.purdue.edu
>>>>> To: jkrell at elego.de
>>>>> Date: Tue, 1 Jun 2010 16:29:20 -0500
>>>>> CC: m3commit at elegosoft.com
>>>>> Subject: Re: [M3commit] CVS Update: cm3
>>>>>
>>>>> This is bogus. The M3 compiler guarantees all variables are initialized.
>>>>>
>>>>> Sent from my iPhone
>>>>>
>>>>> On Jun 1, 2010, at 2:42 PM, jkrell at elego.de (Jay Krell) wrote:
>>>>>
>>>>>> CVSROOT: /usr/cvs
>>>>>> Changes by: jkrell at birch. 10/06/01 14:42:00
>>>>>>
>>>>>> Modified files:
>>>>>> cm3/m3-libs/m3core/src/convert/: Convert.m3
>>>>>>
>>>>>> Log message:
>>>>>> initialize locals; I get warnings that some not quite all, are
>>>>>> used uninitialized if I remove the volatile/sideeffects on every
>>>>>> load/store in parse.c

